### Is your feature request related to a problem? Please describe
- If a recovery has a large number of uncommitted operations, the translog download times out and the translog files have to be downloaded afresh on the next attempt. Since there are no incremental downloads on retries, this leads to a loop of failed recoveries, each timing out in the same way.
- Remote translog recovery acquires the shard mutex while downloading translog files. If the recovery fails, the shard cannot be closed until the translog download completes, which blocks the cluster applier thread; the node then lags behind the cluster state and ultimately drops out of the cluster. The thread dump below shows this blocking chain (a simplified sketch follows the dump).
"opensearch[691aeed35826ecc93653e3011d18c9b1][clusterApplierService#updateTask][T#1]" #268 daemon prio=5 os_prio=0 cpu=69394.87ms elapsed=10325.08s tid=0x0000ffdde862cd40 nid=0x487c waiting for monitor entry [0x0000ffdc2f4fd000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.opensearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:265)
- waiting to lock <0x0000ffe0683bbfa8> (a org.opensearch.indices.cluster.IndicesClusterStateService)
at org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:608)
at org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:595)
at org.opensearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:563)
at org.opensearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:486)
at org.opensearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:188)
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:863)
at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:283)
at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:246)
at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
at java.lang.Thread.run([email protected]/Thread.java:840)
Locked ownable synchronizers:
- <0x0000ffe06a6cba78> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"opensearch[691aeed35826ecc93653e3011d18c9b1][generic][T#26]" #290 daemon prio=5 os_prio=0 cpu=464.40ms elapsed=10325.07s tid=0x0000ffdc90029390 nid=0x4892 waiting for monitor entry [0x0000ffdc2defd000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.opensearch.index.shard.IndexShard.close(IndexShard.java:2110)
- waiting to lock <0x0000ffe1449c1d98> (a java.lang.Object)
at org.opensearch.index.IndexService.closeShard(IndexService.java:644)
at org.opensearch.index.IndexService.removeShard(IndexService.java:620)
- locked <0x0000ffe0931fbd80> (a org.opensearch.index.IndexService)
at org.opensearch.indices.cluster.IndicesClusterStateService.failAndRemoveShard(IndicesClusterStateService.java:817)
at org.opensearch.indices.cluster.IndicesClusterStateService.handleRecoveryFailure(IndicesClusterStateService.java:797)
- locked <0x0000ffe0683bbfa8> (a org.opensearch.indices.cluster.IndicesClusterStateService)
at org.opensearch.indices.recovery.RecoveryListener.onFailure(RecoveryListener.java:55)
at org.opensearch.indices.recovery.RecoveryTarget.notifyListener(RecoveryTarget.java:136)
at org.opensearch.indices.replication.common.ReplicationTarget.fail(ReplicationTarget.java:180)
at org.opensearch.indices.replication.common.ReplicationCollection.fail(ReplicationCollection.java:212)
at org.opensearch.indices.recovery.PeerRecoveryTargetService$RecoveryResponseHandler.onException(PeerRecoveryTargetService.java:756)
at org.opensearch.indices.recovery.PeerRecoveryTargetService$RecoveryResponseHandler.handleException(PeerRecoveryTargetService.java:682)
at org.opensearch.security.transport.SecurityInterceptor$RestoringTransportResponseHandler.handleException(SecurityInterceptor.java:430)
at org.opensearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1515)
at org.opensearch.transport.InboundHandler.lambda$handleException$5(InboundHandler.java:447)
at org.opensearch.transport.InboundHandler$$Lambda$8371/0x000000a00227e220.run(Unknown Source)
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:863)
at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
at java.lang.Thread.run([email protected]/Thread.java:840)
Locked ownable synchronizers:
- <0x0000ffe06a605900> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"opensearch[691aeed35826ecc93653e3011d18c9b1][generic][T#19]" #283 daemon prio=5 os_prio=0 cpu=187528.43ms elapsed=10325.07s tid=0x0000ffdc90021ff0 nid=0x488b waiting on condition [0x0000ffdc2e5fd000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
- parking to wait for <0x0000fffa222465d0> (a java.util.concurrent.FutureTask)
at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:211)
at java.util.concurrent.FutureTask.awaitDone([email protected]/FutureTask.java:447)
at java.util.concurrent.FutureTask.get([email protected]/FutureTask.java:190)
at org.opensearch.encryption.frame.CryptoInputStream.read(CryptoInputStream.java:193)
at java.io.InputStream.transferTo([email protected]/InputStream.java:782)
at java.nio.file.Files.copy([email protected]/Files.java:3171)
at org.opensearch.index.translog.transfer.TranslogTransferManager.downloadToFS(TranslogTransferManager.java:312)
at org.opensearch.index.translog.transfer.TranslogTransferManager.downloadTranslog(TranslogTransferManager.java:258)
at org.opensearch.index.translog.RemoteFsTranslog.downloadOnce(RemoteFsTranslog.java:246)
at org.opensearch.index.translog.RemoteFsTranslog.download(RemoteFsTranslog.java:213)
at org.opensearch.index.translog.RemoteFsTranslog.download(RemoteFsTranslog.java:196)
at org.opensearch.index.shard.IndexShard.syncTranslogFilesFromRemoteTranslog(IndexShard.java:5000)
at org.opensearch.index.shard.IndexShard.syncRemoteTranslogAndUpdateGlobalCheckpoint(IndexShard.java:4978)
at org.opensearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:2584)
- locked <0x0000ffe1449c1d98> (a java.lang.Object)
at org.opensearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:2554)
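For illustration only, the blocking chain above can be reduced to the following Java sketch. The lock and thread names are simplified stand-ins, not OpenSearch APIs: the translog download holds the shard mutex for the entire transfer, the recovery-failure handler holds the IndicesClusterStateService lock while waiting on the shard mutex to close the shard, and the cluster applier in turn waits on the IndicesClusterStateService lock.

```java
import java.util.concurrent.TimeUnit;

/**
 * Simplified illustration of the blocking chain in the thread dump above.
 * The names are made up for the sketch and are not OpenSearch classes.
 */
public class RecoveryBlockingSketch {

    static final Object clusterStateServiceLock = new Object(); // ~ IndicesClusterStateService
    static final Object shardMutex = new Object();              // ~ IndexShard engine mutex

    public static void main(String[] args) throws InterruptedException {
        // Thread 1: the translog download holds the shard mutex for the whole transfer.
        Thread download = new Thread(() -> {
            synchronized (shardMutex) {
                sleepQuietly(10); // stands in for a long remote translog download
            }
        }, "translog-download");

        // Thread 2: recovery failure handling holds the cluster state service lock
        // and then needs the shard mutex to close the shard.
        Thread failureHandler = new Thread(() -> {
            synchronized (clusterStateServiceLock) {
                synchronized (shardMutex) { // blocked until the download finishes
                    // close the shard
                }
            }
        }, "recovery-failure-handler");

        // Thread 3: the cluster applier needs the cluster state service lock,
        // so it lags behind for as long as the download is still running.
        Thread applier = new Thread(() -> {
            synchronized (clusterStateServiceLock) {
                // apply the cluster state
            }
        }, "cluster-applier");

        download.start();
        sleepQuietly(1); // let the download grab the shard mutex first
        failureHandler.start();
        sleepQuietly(1);
        applier.start();

        download.join();
        failureHandler.join();
        applier.join();
    }

    static void sleepQuietly(long seconds) {
        try {
            TimeUnit.SECONDS.sleep(seconds);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```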
### Describe the solution you'd like
- Make translog downloads during recovery incremental (see the first sketch below)
- Make translog downloads during recovery cancellable (see the second sketch below)
- Parallelise translog downloads and translog replay
- Attempt to trigger a flush on recovery failures
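A minimal sketch of what incremental downloads could look like, under the assumption that the remote translog metadata exposes per-file sizes; the helper and type names below are hypothetical and do not reflect the actual `TranslogTransferManager` API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

/**
 * Illustrative sketch only -- not the actual TranslogTransferManager API.
 * Skips translog files that already exist locally with the expected size,
 * so a retried recovery does not download the same generations again.
 */
final class IncrementalTranslogDownloadSketch {

    /** Hypothetical downloader callback, e.g. backed by the remote repository. */
    interface RemoteFileDownloader {
        void download(String fileName, Path target) throws IOException;
    }

    static void downloadMissingFiles(
            Map<String, Long> remoteFileSizes,   // file name -> expected size from remote metadata
            Path localTranslogDir,
            RemoteFileDownloader downloader) throws IOException {
        for (Map.Entry<String, Long> entry : remoteFileSizes.entrySet()) {
            Path local = localTranslogDir.resolve(entry.getKey());
            // Skip files already present from a previous (timed-out) attempt.
            if (Files.exists(local) && Files.size(local) == entry.getValue()) {
                continue;
            }
            downloader.download(entry.getKey(), local);
        }
    }
}
```

Similarly, a hedged sketch of cancellable downloads: checking a cancellation flag between per-file transfers would let a failed or closed recovery release the shard promptly instead of waiting for the whole transfer. Again, the types and method names are illustrative only, not existing OpenSearch APIs:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Illustrative sketch only. Checks a cancellation flag between per-file
 * downloads so a failed/cancelled recovery can abort the translog transfer
 * instead of holding the shard until every file has been fetched.
 */
final class CancellableTranslogDownloadSketch {

    interface FileDownloader {
        void download(String fileName, Path target) throws IOException;
    }

    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    void cancel() {
        cancelled.set(true); // e.g. invoked when the recovery is failed or the shard is closed
    }

    void downloadAll(List<String> fileNames, Path dir, FileDownloader downloader) throws IOException {
        for (String fileName : fileNames) {
            if (cancelled.get()) {
                throw new IOException("translog download cancelled"); // abort between files
            }
            downloader.download(fileName, dir.resolve(fileName));
        }
    }
}
```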
### Related component
Storage:Remote
### Describe alternatives you've considered
No response
### Additional context
No response