-
Notifications
You must be signed in to change notification settings - Fork 1k
Enhancement: persist commit index in LogStore to accelerate recovery #613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 5 commits
2e5a8a0
ffc6b3b
7383d96
f6295e0
f2ae7a9
ce1895c
ab50a58
4e7e04b
41df55e
400a27d
e2617e8
6daca47
cc09317
fe57b32
20e8701
6f146e1
a8438b0
5e6d8a4
e248f00
2a913ab
7cd6732
92c04a0
8e8ba07
2020cab
2a7d584
bdac45b
ed47a25
ad87d86
30fc43e
e797962
500567f
cfffcb5
560c0b9
8c722fa
300a6e7
8d11a28
1bdf161
3a5d299
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||
---|---|---|---|---|---|---|---|---|
|
@@ -217,6 +217,10 @@ type Raft struct { | |||||||
// preVoteDisabled control if the pre-vote feature is activated, | ||||||||
// prevote feature is disabled if set to true. | ||||||||
preVoteDisabled bool | ||||||||
|
||||||||
// fastRecovery is used to enable fast recovery mode | ||||||||
// fast recovery mode is disabled if set to false. | ||||||||
fastRecovery bool | ||||||||
} | ||||||||
|
||||||||
// BootstrapCluster initializes a server's storage with the given cluster | ||||||||
|
@@ -566,6 +570,7 @@ func NewRaft(conf *Config, fsm FSM, logs LogStore, stable StableStore, snaps Sna | |||||||
followerNotifyCh: make(chan struct{}, 1), | ||||||||
mainThreadSaturation: newSaturationMetric([]string{"raft", "thread", "main", "saturation"}, 1*time.Second), | ||||||||
preVoteDisabled: conf.PreVoteDisabled || !transportSupportPreVote, | ||||||||
fastRecovery: conf.FastRecovery, | ||||||||
} | ||||||||
if !transportSupportPreVote && !conf.PreVoteDisabled { | ||||||||
r.logger.Warn("pre-vote is disabled because it is not supported by the Transport") | ||||||||
|
@@ -606,6 +611,8 @@ func NewRaft(conf *Config, fsm FSM, logs LogStore, stable StableStore, snaps Sna | |||||||
// to be called concurrently with a blocking RPC. | ||||||||
trans.SetHeartbeatHandler(r.processHeartbeat) | ||||||||
|
||||||||
r.recoverFromCommitedLogs() | ||||||||
|
||||||||
if conf.skipStartup { | ||||||||
return r, nil | ||||||||
} | ||||||||
|
@@ -697,6 +704,29 @@ func (r *Raft) tryRestoreSingleSnapshot(snapshot *SnapshotMeta) bool { | |||||||
return true | ||||||||
} | ||||||||
|
||||||||
// recoverFromCommitedLogs recovers the Raft node from committed logs. | ||||||||
func (r *Raft) recoverFromCommitedLogs() error { | ||||||||
if !r.fastRecovery { | ||||||||
return nil | ||||||||
} | ||||||||
// If the store implements CommitTrackingLogStore, we can read the commit index from the store. | ||||||||
// This is useful when the store is able to track the commit index and we can avoid replaying logs. | ||||||||
store, ok := r.logs.(CommitTrackingLogStore) | ||||||||
if !ok { | ||||||||
return nil | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
I'm worried about the case where someone opts into fast recovery but never actually gets to use it because their log store doesn't support it. Even for uses like mine (Nomad) where the logstore is hardcoded, reducing this to a programming error not a runtime misconfiguration, I don't see how I could observe whether our implementation was using the fast recovery path or not. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍 I missed that this was There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. thanks! |
||||||||
} | ||||||||
commitIndex, err := store.GetCommitIndex() | ||||||||
if err != nil { | ||||||||
return fmt.Errorf("failed to read commit index from store: %w", err) | ||||||||
} | ||||||||
if commitIndex == 0 { | ||||||||
return nil | ||||||||
} | ||||||||
|
||||||||
r.processLogs(commitIndex, nil) | ||||||||
lalalalatt marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||
return nil | ||||||||
} | ||||||||
|
||||||||
func (r *Raft) config() Config { | ||||||||
return r.conf.Load().(Config) | ||||||||
} | ||||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -235,6 +235,11 @@ type Config struct { | |
// PreVoteDisabled deactivate the pre-vote feature when set to true | ||
PreVoteDisabled bool | ||
|
||
// FastRecovery controls if the Raft server should use the fast recovery | ||
// mechanism. This mechanism allows a server to apply logs to the FSM till | ||
// the last committed log | ||
lalalalatt marked this conversation as resolved.
Show resolved
Hide resolved
|
||
FastRecovery bool | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not sure if this is a naming nitpick or just a question: From the perspective of a caller to NewRaft, IIUC the "fast" aspect is due to more logs being replayed locally instead of streamed from a peer. Logs committed while this member was down will need to be streamed, but presumably that's often a fraction of the total log size. If my understanding is accurate, an alternative name might be There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is great point actually. I'm not sure "Fast" does capture the semantics in any case really: mostly startup will take marginally to a lot longer, but on the plus side the FSM will actually startup in the same state it was before the node restarted which is probably what most users of the library assumed was the case already!
Yeah I think you perfectly described the tradeoff. I think There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @schmichael @banks How about There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Naming is the worst. 😅 Looking around a bit we're not totally consistent, but I think generally:
So I think switching After that I think I prefer |
||
|
||
// skipStartup allows NewRaft() to bypass all background work goroutines | ||
skipStartup bool | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -190,3 +190,8 @@ func emitLogStoreMetrics(s LogStore, prefix []string, interval time.Duration, st | |
} | ||
} | ||
} | ||
|
||
type CommitTrackingLogStore interface { | ||
SetCommitIndex(idx uint64) error | ||
GetCommitIndex() (uint64, error) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How would a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For BoltDB, I imagine commit index would be a single KV in a separate bucket from logs so it would just read that and return it. For WAL I anticipated extending the format slightly so that each commit entry in the log stores the most recently staged commit index and then re-populated that into memory when we open the log an scan it like we do with indexes. If there is no commit index stored, we should just return There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I agree! I think that should be documented though. Because the API allow erroring on |
||
} |
Uh oh!
There was an error while loading. Please reload this page.