[BUG] AccessControlException exception in rolling upgrade with recent 3.0.0 snapshot #4788

Closed
@martin-gaievski

Description

Describe the bug
With a recent 3.0.0 snapshot, we are seeing a java.security.AccessControlException during a rolling upgrade of the k-NN plugin from 2.3.0 to 3.0.0-snapshot.

Exception that we're seeing in the log:

Caused by: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "accessClassInPackage.jdk.internal.org.objectweb.asm")
  [2022-10-13T22:20:29,694][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
  [2022-10-13T22:20:29,695][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/java.security.AccessController.checkPermission(AccessController.java:897)
  [2022-10-13T22:20:29,695][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:322)
  [2022-10-13T22:20:29,695][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1238)
  [2022-10-13T22:20:29,696][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:174)
  [2022-10-13T22:20:29,696][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
  [2022-10-13T22:20:29,696][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at Log4jHotPatch.asmVersion(Log4jHotPatch.java:71)
  [2022-10-13T22:20:29,696][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at Log4jHotPatch.agentmain(Log4jHotPatch.java:93)
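
For readers unfamiliar with the permission named in the exception, here is a minimal sketch of how a `RuntimePermission` with that name is matched against a set of granted permissions. It only illustrates the matching rule behind the `checkPermission` frames in the trace; it is not how OpenSearch or the k-NN plugin configures its security policy, and it is not a proposed fix. The class name `PackageAccessCheck` and the helper `isGranted` are hypothetical names chosen for this example.

```java
import java.security.Permissions;

public class PackageAccessCheck {
    // Permission name copied from the exception message in the log above.
    static final String ASM_PACKAGE_ACCESS =
        "accessClassInPackage.jdk.internal.org.objectweb.asm";

    // Hypothetical helper: true when the granted set implies the named RuntimePermission.
    static boolean isGranted(Permissions granted, String permissionName) {
        return granted.implies(new RuntimePermission(permissionName));
    }

    public static void main(String[] args) {
        // An empty permission set does not imply the permission,
        // which corresponds to the "access denied" case.
        Permissions empty = new Permissions();
        System.out.println("without grant: " + isGranted(empty, ASM_PACKAGE_ACCESS));

        // A set that contains the exact permission implies it.
        Permissions withGrant = new Permissions();
        withGrant.add(new RuntimePermission(ASM_PACKAGE_ACCESS));
        System.out.println("with grant: " + isGranted(withGrant, ASM_PACKAGE_ACCESS));
    }
}
```

Running `main` prints `without grant: false` followed by `with grant: true`. In the trace, the check is made on behalf of `Log4jHotPatch.asmVersion`, so the denial concerns the code that loads the ASM class, not the k-NN plugin code itself.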

To Reproduce
Steps to reproduce the behavior:
Local environment approach:

  1. Get the latest code from https://github.com/opensearch-project/k-NN and check that you are on the main branch.
  2. Run the rolling upgrade (the cluster moves to the next version by upgrading its nodes one by one) with ./gradlew :qa:rolling-upgrade:testAgainstOneThirdUpgradedCluster -Dbwc.version=2.3.0.
  3. Observe the output:
> Task :qa:rolling-upgrade:testAgainstOneThirdUpgradedCluster FAILED

FAILURE: Build failed with an exception.

* Where:
Build file '/local/home/gaievski/dev/opensearch/k-NN/qa/rolling-upgrade/build.gradle' line: 65

* What went wrong:
Execution failed for task ':qa:rolling-upgrade:testAgainstOneThirdUpgradedCluster'.
> `cluster{:qa:rolling-upgrade:knnBwcCluster-rolling}` failed to wait for cluster health yellow after 40 SECONDS
    IO error while waiting cluster
    503 Service Unavailable

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.

Interesting fact: the exception does not appear if we do a restart upgrade (start the cluster on the old version, stop it, upgrade all nodes at once, and start the cluster again) or if we simply start a multi-node cluster on the 3.0.0 snapshot.

Alternative way: trigger the GitHub workflow for the k-NN plugin by making any change and creating a new PR, for example from https://github.com/martin-gaievski/k-NN/tree/fix-build-replace-basenoderequest-by-transportrequest. An example of such a PR: https://github.com/opensearch-project/k-NN/pull/577/checks

Additional context
The full exception trace is below.

[d8kM05naRWaAzESaz4IqqA, yoY-Jd9TSk6L4oihZ2lR5w, Za30ZkG4Q6qdpvincalAaQ], have discovered [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] which is not a quorum; discovery will continue using [[::1]:33097, 127.0.0.1:33747, [::1]:34467, 127.0.0.1:36139] from hosts providers and [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 1, last-accepted version 4 in term 1
  [2022-10-13T22:20:29,692][WARN ][stderr                   ] [knnBwcCluster-rolling-0] java.lang.reflect.InvocationTargetException
  [2022-10-13T22:20:29,692][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  [2022-10-13T22:20:29,693][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  [2022-10-13T22:20:29,693][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  [2022-10-13T22:20:29,693][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
  [2022-10-13T22:20:29,693][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:513)
  [2022-10-13T22:20:29,694][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndCallAgentmain(InstrumentationImpl.java:535)
  [2022-10-13T22:20:29,694][WARN ][stderr                   ] [knnBwcCluster-rolling-0] Caused by: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "accessClassInPackage.jdk.internal.org.objectweb.asm")
  [2022-10-13T22:20:29,694][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
  [2022-10-13T22:20:29,695][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/java.security.AccessController.checkPermission(AccessController.java:897)
  [2022-10-13T22:20:29,695][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:322)
  [2022-10-13T22:20:29,695][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1238)
  [2022-10-13T22:20:29,696][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:174)
  [2022-10-13T22:20:29,696][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
  [2022-10-13T22:20:29,696][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at Log4jHotPatch.asmVersion(Log4jHotPatch.java:71)
  [2022-10-13T22:20:29,696][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   at Log4jHotPatch.agentmain(Log4jHotPatch.java:93)
  [2022-10-13T22:20:29,697][WARN ][stderr                   ] [knnBwcCluster-rolling-0]   ... 6 more
  [2022-10-13T22:20:35,487][WARN ][o.o.c.c.ClusterFormationFailureHelper] [knnBwcCluster-rolling-0] cluster-manager not discovered or elected yet, an election requires at least 2 nodes with ids from [d8kM05naRWaAzESaz4IqqA, yoY-Jd9TSk6L4oihZ2lR5w, Za30ZkG4Q6qdpvincalAaQ], have discovered [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] which is not a quorum; discovery will continue using [[::1]:33097, 127.0.0.1:33747, [::1]:34467, 127.0.0.1:36139] from hosts providers and [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 1, last-accepted version 4 in term 1
  [2022-10-13T22:20:45,489][WARN ][o.o.c.c.ClusterFormationFailureHelper] [knnBwcCluster-rolling-0] cluster-manager not discovered or elected yet, an election requires at least 2 nodes with ids from [d8kM05naRWaAzESaz4IqqA, yoY-Jd9TSk6L4oihZ2lR5w, Za30ZkG4Q6qdpvincalAaQ], have discovered [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] which is not a quorum; discovery will continue using [[::1]:33097, 127.0.0.1:33747, [::1]:34467, 127.0.0.1:36139] from hosts providers and [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 1, last-accepted version 4 in term 1
  [2022-10-13T22:20:45,910][WARN ][r.suppressed             ] [knnBwcCluster-rolling-0] path: /_cluster/health, params: {wait_for_status=yellow, wait_for_nodes=>=3}
  org.opensearch.discovery.ClusterManagerNotDiscoveredException: null
          at org.opensearch.action.support.clustermanager.TransportClusterManagerNodeAction$AsyncSingleAction$2.onTimeout(TransportClusterManagerNodeAction.java:305) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
          at org.opensearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:394) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
          at org.opensearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:294) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
          at org.opensearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:707) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
          at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:747) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
          at java.lang.Thread.run(Thread.java:829) [?:?]
