Description
Describe the bug
With the recent 3.0.0 snapshot, we're facing a java.security.AccessControlException when doing a rolling upgrade of the k-NN plugin from 2.3.0 to 3.0.0-snapshot.
Exception that we're seeing in the log:
Caused by: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "accessClassInPackage.jdk.internal.org.objectweb.asm")
[2022-10-13T22:20:29,694][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
[2022-10-13T22:20:29,695][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/java.security.AccessController.checkPermission(AccessController.java:897)
[2022-10-13T22:20:29,695][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:322)
[2022-10-13T22:20:29,695][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1238)
[2022-10-13T22:20:29,696][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:174)
[2022-10-13T22:20:29,696][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
[2022-10-13T22:20:29,696][WARN ][stderr ] [knnBwcCluster-rolling-0] at Log4jHotPatch.asmVersion(Log4jHotPatch.java:71)
[2022-10-13T22:20:29,696][WARN ][stderr ] [knnBwcCluster-rolling-0] at Log4jHotPatch.agentmain(Log4jHotPatch.java:93)
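To make the denied check easier to read: the trace shows `SecurityManager.checkPackageAccess` asking for a `RuntimePermission` named `accessClassInPackage.jdk.internal.org.objectweb.asm`. The sketch below only illustrates how such a permission name is matched against a set of granted permissions. It does not install a security manager, and the two permission sets are hypothetical, not the policy that OpenSearch or the test cluster actually applies. It is not a proposed fix.

```java
import java.security.PermissionCollection;
import java.security.Permissions;

public class PackageAccessCheck {
    // Builds the permission name seen in the exception: "accessClassInPackage." + package
    public static RuntimePermission packageAccess(String pkg) {
        return new RuntimePermission("accessClassInPackage." + pkg);
    }

    // True if the given permission set covers access to classes in the package
    public static boolean isGranted(PermissionCollection granted, String pkg) {
        return granted.implies(packageAccess(pkg));
    }

    public static void main(String[] args) {
        String pkg = "jdk.internal.org.objectweb.asm";

        // Hypothetical permission sets, for illustration only
        Permissions empty = new Permissions();
        Permissions withGrant = new Permissions();
        withGrant.add(packageAccess(pkg));

        System.out.println("empty set grants access: " + isGranted(empty, pkg));
        System.out.println("explicit grant grants access: " + isGranted(withGrant, pkg));
    }
}
```

In the trace the class load is requested from `Log4jHotPatch.asmVersion`, so the relevant permission set would be whatever applies to that agent code, which this report does not show.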
To Reproduce
Steps to reproduce the behavior:
Local environment approach:
- Get the latest code from https://github.com/opensearch-project/k-NN and check that you are on the main branch.
- Run a rolling upgrade (upgrading to the next version by upgrading cluster nodes one by one) with
./gradlew :qa:rolling-upgrade:testAgainstOneThirdUpgradedCluster -Dbwc.version=2.3.0
- Output:
> Task :qa:rolling-upgrade:testAgainstOneThirdUpgradedCluster FAILED
FAILURE: Build failed with an exception.
* Where:
Build file '/local/home/gaievski/dev/opensearch/k-NN/qa/rolling-upgrade/build.gradle' line: 65
* What went wrong:
Execution failed for task ':qa:rolling-upgrade:testAgainstOneThirdUpgradedCluster'.
> `cluster{:qa:rolling-upgrade:knnBwcCluster-rolling}` failed to wait for cluster health yellow after 40 SECONDS
IO error while waiting cluster
503 Service Unavailable
* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
Interesting fact: the exception doesn't appear if we do a restart upgrade (start the cluster on the old version, stop and upgrade all nodes at once, then start the cluster again) or if we just start a multi-node cluster on the 3.0.0 snapshot.
Alternative way: trigger the GitHub workflow for the k-NN plugin by making any change and creating a new PR, for example from https://github.com/martin-gaievski/k-NN/tree/fix-build-replace-basenoderequest-by-transportrequest. An example of such a PR: https://github.com/opensearch-project/k-NN/pull/577/checks
Additional context
The full exception trace is below:
[d8kM05naRWaAzESaz4IqqA, yoY-Jd9TSk6L4oihZ2lR5w, Za30ZkG4Q6qdpvincalAaQ], have discovered [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] which is not a quorum; discovery will continue using [[::1]:33097, 127.0.0.1:33747, [::1]:34467, 127.0.0.1:36139] from hosts providers and [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 1, last-accepted version 4 in term 1
[2022-10-13T22:20:29,692][WARN ][stderr ] [knnBwcCluster-rolling-0] java.lang.reflect.InvocationTargetException
[2022-10-13T22:20:29,692][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2022-10-13T22:20:29,693][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[2022-10-13T22:20:29,693][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2022-10-13T22:20:29,693][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2022-10-13T22:20:29,693][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:513)
[2022-10-13T22:20:29,694][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndCallAgentmain(InstrumentationImpl.java:535)
[2022-10-13T22:20:29,694][WARN ][stderr ] [knnBwcCluster-rolling-0] Caused by: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "accessClassInPackage.jdk.internal.org.objectweb.asm")
[2022-10-13T22:20:29,694][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
[2022-10-13T22:20:29,695][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/java.security.AccessController.checkPermission(AccessController.java:897)
[2022-10-13T22:20:29,695][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:322)
[2022-10-13T22:20:29,695][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1238)
[2022-10-13T22:20:29,696][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:174)
[2022-10-13T22:20:29,696][WARN ][stderr ] [knnBwcCluster-rolling-0] at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
[2022-10-13T22:20:29,696][WARN ][stderr ] [knnBwcCluster-rolling-0] at Log4jHotPatch.asmVersion(Log4jHotPatch.java:71)
[2022-10-13T22:20:29,696][WARN ][stderr ] [knnBwcCluster-rolling-0] at Log4jHotPatch.agentmain(Log4jHotPatch.java:93)
[2022-10-13T22:20:29,697][WARN ][stderr ] [knnBwcCluster-rolling-0] ... 6 more
[2022-10-13T22:20:35,487][WARN ][o.o.c.c.ClusterFormationFailureHelper] [knnBwcCluster-rolling-0] cluster-manager not discovered or elected yet, an election requires at least 2 nodes with ids from [d8kM05naRWaAzESaz4IqqA, yoY-Jd9TSk6L4oihZ2lR5w, Za30ZkG4Q6qdpvincalAaQ], have discovered [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] which is not a quorum; discovery will continue using [[::1]:33097, 127.0.0.1:33747, [::1]:34467, 127.0.0.1:36139] from hosts providers and [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 1, last-accepted version 4 in term 1
[2022-10-13T22:20:45,489][WARN ][o.o.c.c.ClusterFormationFailureHelper] [knnBwcCluster-rolling-0] cluster-manager not discovered or elected yet, an election requires at least 2 nodes with ids from [d8kM05naRWaAzESaz4IqqA, yoY-Jd9TSk6L4oihZ2lR5w, Za30ZkG4Q6qdpvincalAaQ], have discovered [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] which is not a quorum; discovery will continue using [[::1]:33097, 127.0.0.1:33747, [::1]:34467, 127.0.0.1:36139] from hosts providers and [{knnBwcCluster-rolling-0}{Za30ZkG4Q6qdpvincalAaQ}{_q7BHUm7QDeapjSOITBO_w}{127.0.0.1}{127.0.0.1:34285}{dimr}{upgraded=true, testattr=test, shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 1, last-accepted version 4 in term 1
[2022-10-13T22:20:45,910][WARN ][r.suppressed ] [knnBwcCluster-rolling-0] path: /_cluster/health, params: {wait_for_status=yellow, wait_for_nodes=>=3}
org.opensearch.discovery.ClusterManagerNotDiscoveredException: null
at org.opensearch.action.support.clustermanager.TransportClusterManagerNodeAction$AsyncSingleAction$2.onTimeout(TransportClusterManagerNodeAction.java:305) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.opensearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:394) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.opensearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:294) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.opensearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:707) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:747) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
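For anyone who wants to repeat the health check by hand while the half-upgraded cluster is still running, the request in the trace above is `/_cluster/health` with `wait_for_status=yellow` and `wait_for_nodes=>=3`. A minimal sketch that builds this URI with the query values percent-encoded is below. The base URL is an assumption: the node's HTTP port does not appear in this report, so `http://localhost:9200` is only a placeholder.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ClusterHealthUri {
    // Builds the health URI; values such as ">=3" contain characters that must be percent-encoded
    public static URI healthUri(String baseUrl, String status, String nodes) {
        String query = "wait_for_status=" + URLEncoder.encode(status, StandardCharsets.UTF_8)
                + "&wait_for_nodes=" + URLEncoder.encode(nodes, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/_cluster/health?" + query);
    }

    public static void main(String[] args) {
        // Placeholder address (assumption): pass the real node address as the first argument
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:9200";
        System.out.println(healthUri(baseUrl, "yellow", ">=3"));
    }
}
```

The printed URI can be passed to any HTTP client. While the upgraded node has not discovered a cluster manager, a 503 like the one in the Gradle output would be the expected response.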