Description
This was discussed during the spec meeting and is weakly linked to #932.
Let's imagine we have two nodes Alice and Bob.
Alice takes her node offline, does weird stuff with her DB (e.g. a migration from sqlite to postgres) then comes back online.
She sends a channel_reestablish to Bob with outdated values (because she messed up her migration).
Expected behavior
Bob detects that Alice is late, so Bob will likely need to publish his latest commitment to help Alice get her funds back.
Bob waits for Alice to send an error before publishing his commitment.
When Alice receives Bob's channel_reestablish, she realizes she's late.
She stops her node (without sending an error), figures out where she messed up in her migration, fixes her DB, restarts.
Now she sends a channel_reestablish with the up-to-date values, so the channel can resume operating.
Non-optimal behavior
Bob detects that Alice is late, so Bob publishes his latest commitment to help Alice.
But now Alice lost her chance to fix her DB.
If Alice is a big node with a ton of channels, she just lost a ton of money on on-chain fees...
Conclusion
Implementations should really wait to receive an error before publishing their commitment!
@Roasbeef is that clearer than during the spec meeting?
EDIT: the spec was clarified in #942 to highlight that this is the desired behavior.