You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[FIXED] Don't InstallSnapshot during shutdown, would race with monitorStream/monitorConsumer (#6153)
When stopping a stream or consumer, we would attempt to install a
snapshot. However, this would race with what's happening in
`monitorStream`/`monitorConsumer` at that time.
For example:
1. In `applyStreamEntries` we call into `mset.processJetStreamMsg` to
persist one or multiple messages.
2. We call `mset.stop(..)` either before or during the above.
3. In `mset.stop(..)` we'd wait for `mset.processJetStreamMsg` to
release the lock so we can enter `mset.stateSnapshotLocked()`. **We
create a snapshot with new state here!**
4. Now we call into `InstallSnapshot` to persist above snapshot, but
`n.applied` does not contain the right value, the value will be lower.
5. Then `applyStreamEntries` finishes and we end with calling
`n.Applied(..)`.
This would be a race condition depending on if 4 happened before or
after 5.
It's essential that the snapshot we make is aligned with the `n.applied`
value. If we don't that means we'll replay and need to increase
`mset.clfs` which will snowball into stream desync due to this shift.
The only place where we can guarantee that the snapshot and applied are
aligned is in `doSnapshot` of `monitorStream` and `monitorConsumer` (and
`monitorCluster`), so we must not attempt installing snapshots outside
of those.
Signed-off-by: Maurice van Veen <[email protected]>
0 commit comments