Skip to content

Add WaitCommitted #640

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kolesnikovae
Copy link

Hello team,

Just wanted to say thanks for the awesome work on this project – it's been a pleasure to work with. We're using raft in a new experimental component we're building over at Grafana Pyroscope, and it's been super helpful.

As part of our integration, we encountered a use case where quorum replication is sufficient for correctness, and blocking until FSM application introduces unnecessary latency. This PR proposes a small addition that enables that workflow in a clean and backward-compatible way: the new WaitCommitted() method in the ApplyFuture interface. Unlike ApplyFuture.Error(), which blocks until the command has been committed and applied to the local FSM, WaitCommitted() allows users to wait only until the log entry has been replicated to a quorum and marked committed – without blocking on FSM application.

I'm happy to implement the feature in a different way if needed – the code in the PR is just one option I considered. Testing for the new API is also not particularly extensive at the moment, but I'd be more than willing to fill in the gaps if the general idea is accepted.

@kolesnikovae kolesnikovae requested review from a team as code owners April 4, 2025 09:30
@kolesnikovae kolesnikovae requested a review from rboyer April 4, 2025 09:30
Copy link

CLA assistant check

Thank you for your submission! We require that all contributors sign our Contributor License Agreement ("CLA") before we can accept the contribution. Read and sign the agreement

Learn more about why HashiCorp requires a CLA and what the CLA includes

Have you signed the CLA already but the status is still pending? Recheck it.

1 similar comment
Copy link

CLA assistant check

Thank you for your submission! We require that all contributors sign our Contributor License Agreement ("CLA") before we can accept the contribution. Read and sign the agreement

Learn more about why HashiCorp requires a CLA and what the CLA includes

Have you signed the CLA already but the status is still pending? Recheck it.

@@ -569,6 +569,7 @@ func (r *Raft) runLeader() {
// maintain that there exists at most one uncommitted configuration entry in
// any log, so we have to do proper no-ops here.
noop := &logFuture{log: Log{Type: LogNoop}}
noop.init()
Copy link
Author

@kolesnikovae kolesnikovae Apr 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the no-op command issued by the leader at the beginning of a term is not properly initialized (which is probably fine), but it is necessary for the change introduced in this PR.

Comment on lines 821 to +822
lastIdxInGroup = idx
close(commitLog.committed)
Copy link
Author

@kolesnikovae kolesnikovae Apr 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth between a "safer" option with a nil check (similar to the future error channel) and a more direct approach with an unconditional close. I opted for the latter because I couldn't identify a case where the same logFuture would be committed twice, which, in that case, would likely lead to a bigger issue.

Comment on lines +39 to +43
// WaitCommitted blocks until the log entry has been committed to quorum.
// It does not wait for FSM application.
// The error returned follows the same semantics as the Error method,
// except for errors that occur after the log entry has been committed.
WaitCommitted() error
Copy link
Author

@kolesnikovae kolesnikovae Apr 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, there's an implicit dependency on the applyCh buffer size, which is controlled via the BatchApplyCh and MaxAppendEntries configuration parameters (defaults to 64, max 1024, if BatchApplyCh is set to true). Once the buffer is full, Apply blocks.

	// MaxAppendEntries controls the maximum number of append entries
	// to send at once. We want to strike a balance between efficiency
	// and avoiding waste if the follower is going to reject because of
	// an inconsistent log.

I'd like to control the applyCh buffer size explicitly (as 1024 is fairly conservative) via a separate option, e.g., ApplyBufferSize, which would default to MaxAppendEntries and must not be less than that. The option could supersede the BatchApplyCh option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant