How etcd guarantee request executed exactly once? #7062

dearboll · 2016-12-26T07:44:47Z

Hi all, I read the Raft paper and am confused by some differences between Raft and etcd's implementation. Suppose a client sends a command to the server and the leader handles it, but the response is lost. When the client retries the request, how does etcd guarantee that each command is executed exactly once?

mitake · 2016-12-26T13:20:56Z

The Txn() RPC can choose its effect (getting and putting keys) based on a conditional branch, and the condition can compare the revision or value of keys. If two conflicting transactions that modify a single key are issued, the later one (perhaps issued as a retry) can be made to fail: each transaction first reads the revision of the key and then compares it with the latest revision in the If() part of the txn. Because the first transaction updates the key's revision when it commits, the comparison in the later one no longer holds. This mechanism can provide exactly-once semantics in etcd v3.
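The revision check described above can be illustrated with a small in-memory model. This is only a sketch, not etcd code: `Store`, `Get`, and `PutIfUnchanged` are hypothetical names invented for the example. With the real Go client, the same idea would roughly take the form `Txn(ctx).If(clientv3.Compare(clientv3.ModRevision(key), "=", rev)).Then(clientv3.OpPut(key, val)).Commit()`.

```go
package main

import (
	"fmt"
	"sync"
)

// entry is a toy stand-in for an etcd key: a value plus the store
// revision at which the key was last modified.
type entry struct {
	value       string
	modRevision int64
}

// Store is a hypothetical in-memory model of a revisioned key-value
// store. It is not the etcd API.
type Store struct {
	mu       sync.Mutex
	revision int64
	data     map[string]entry
}

func NewStore() *Store {
	return &Store{data: make(map[string]entry)}
}

// Get returns the value and mod revision of key. A key that was never
// written has mod revision 0.
func (s *Store) Get(key string) (string, int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e := s.data[key]
	return e.value, e.modRevision
}

// PutIfUnchanged writes value only if the key's mod revision still
// equals expectedRev, mimicking a txn whose If() compares the revision.
// It reports whether the write was applied.
func (s *Store) PutIfUnchanged(key, value string, expectedRev int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.data[key].modRevision != expectedRev {
		return false // someone (maybe our own first attempt) already wrote
	}
	s.revision++
	s.data[key] = entry{value: value, modRevision: s.revision}
	return true
}

func main() {
	s := NewStore()
	_, rev := s.Get("counter") // read the revision as preparation

	first := s.PutIfUnchanged("counter", "1", rev) // applied, response "lost"
	retry := s.PutIfUnchanged("counter", "1", rev) // retry with the same revision

	v, r := s.Get("counter")
	fmt.Println(first, retry, v, r) // prints: true false 1 1
}
```

Note that a failed comparison on retry only says the key changed since it was read. To tell whether the change came from its own first attempt or from another writer, the client would still need to read the key back.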

Note that Raft itself doesn't care about exactly-once semantics; it applies a log entry to its state machine at most once. Providing the semantics is the responsibility of the state machine replicated by Raft. Chapter 6 of Diego's dissertation (https://github.com/ongardie/dissertation) describes this topic. clientv3/concurrency and this blog post https://coreos.com/blog/transactional-memory-with-etcd3.html would also be helpful for you.
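For comparison, the state-machine-side approach from the Raft literature (each request carries a client ID and a sequence number, and the state machine remembers the response it gave) could be sketched as follows. This is a minimal illustration under assumed names (`Counter`, `Apply`, `requestID` are all hypothetical); it is not how etcd itself handles retries.

```go
package main

import "fmt"

// requestID identifies one logical client request. Both fields are
// assumptions made for this sketch.
type requestID struct {
	clientID string
	seq      uint64
}

// Counter is a toy replicated state machine that deduplicates requests
// by remembering the response it produced for each request ID.
type Counter struct {
	value int
	seen  map[requestID]int // cached response per request
}

func NewCounter() *Counter {
	return &Counter{seen: make(map[requestID]int)}
}

// Apply adds delta to the counter unless this (clientID, seq) pair was
// already applied, in which case it returns the cached response
// without changing the state again.
func (c *Counter) Apply(clientID string, seq uint64, delta int) int {
	id := requestID{clientID: clientID, seq: seq}
	if resp, ok := c.seen[id]; ok {
		return resp // duplicate: replay the earlier response
	}
	c.value += delta
	c.seen[id] = c.value
	return c.value
}

func main() {
	c := NewCounter()
	a := c.Apply("client-1", 1, 5) // original request
	b := c.Apply("client-1", 1, 5) // retry after a lost response
	d := c.Apply("client-1", 2, 5) // a genuinely new request
	fmt.Println(a, b, d)           // prints: 5 5 10
}
```

A real implementation would also have to bound the `seen` table, for example by expiring entries for clients that are no longer active; the sketch omits that.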

gyuho · 2016-12-27T20:50:12Z

Related answer

https://groups.google.com/d/msg/etcd-dev/jWv7Kja1tMQ/G9DLHlomCgAJ

What's more, how does etcd deal with clients' duplicate requests? For example, a client sends a request to the etcd servers and the leader handles it, but the response is lost. The client may retry the request. Would etcd redo the request or abort it? How would it abort a request that has already been handled? By recording a unique ID and the response for each client request, as the Raft paper does?

The etcd Go client uses at-most-once semantics for writes and at-least-once semantics for reads. A lost write response (e.g., from a network disconnect) will return an error, and lost reads will be retried. A request cannot be rolled back without rebuilding the cluster from a backup; if a write response is lost, the application code is responsible for gracefully recovering.

xiang90 · 2016-12-29T17:35:35Z

@dearboll The answers make a lot of sense, so I am closing this issue.

@heyitsanthony should we add this one to our dev guide somewhere?

xiang90 closed this as completed Dec 29, 2016
