Akka.Cluster.Sharding: Shard can fail to HandOff indefinitely #7500
Version Information
Version of Akka.NET? v1.5.37
Which Akka.NET Modules? Akka.Cluster.Sharding
Describe the bug
This is a pretty rare bug as far as I can tell - today was the first time in 12 years of working with Akka.NET that I've seen this log message get logged:
akka.net/src/contrib/cluster/Akka.Cluster.Sharding/ShardCoordinator.cs
Lines 1850 to 1853 in 1f7ffa7
Looking more closely at the issue, we see A LOT of unhandled HandOff messages over the course of 10-30 minutes. This continues indefinitely.
To Reproduce
Not sure how to reproduce it yet.
Expected behavior
Shards should terminate their entities during a handoff and deallocate all entity actors.
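The expected sequence can be modelled with a toy sketch. This is plain Python, not Akka.NET code, and `ToyShard`, `hand_off`, and `entity_terminated` are hypothetical names chosen for illustration. It only shows the intended ordering: the shard asks its entities to stop, then counts as stopped once the last one has terminated.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Toy stand-in for a shard; NOT the Akka.NET Shard actor."""
    entities: set = field(default_factory=set)
    handing_off: bool = False
    stopped: bool = False

    def hand_off(self):
        # Begin the handoff and return the entities that must be stopped.
        # With no live entities the shard can finish immediately.
        self.handing_off = True
        pending = set(self.entities)
        self._maybe_finish()
        return pending

    def entity_terminated(self, entity_id):
        # Called once per entity after it has actually stopped.
        self.entities.discard(entity_id)
        self._maybe_finish()

    def _maybe_finish(self):
        # The shard is done only if a handoff is in progress
        # and every entity has terminated.
        if self.handing_off and not self.entities:
            self.stopped = True


shard = ToyShard(entities={"a", "b"})
to_stop = shard.hand_off()
for entity_id in sorted(to_stop):
    shard.entity_terminated(entity_id)
```

In this model a shard that never processes the handoff request never sets `handing_off`, so it never reaches the stopped state, no matter how many entities terminate on their own.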
Actual behavior
Not only did the shard not deallocate, but it looks like it didn't attempt to kill off any of its entity actors - otherwise the failsafe from the HandoffStopper should have kicked in:
akka.net/src/contrib/cluster/Akka.Cluster.Sharding/ShardRegion.cs
Lines 315 to 324 in 6ffd304
This didn't happen, so it makes me think that the Shard got behavior-switched to a state where it couldn't receive HandOff messages long before actually attempting to hand off.
Screenshots
If applicable, add screenshots to help explain your problem.
Environment
Are you running on Linux? Windows? Docker? Which version of .NET?
Additional context