fix(buffers): deadlock when seeking after entire write fails to be flushed (vectordotdev#17657)
## Context
In vectordotdev#17644, a user reported disk buffers getting stuck in an infinite
loop, and thus deadlocking, when restarting after a crash. They provided
some very useful debug information, going so far as to evaluate the
code, add some logging, and capture values of the reader's internal state.
When a disk buffer is initialized -- either for the first time or after
Vector is restarted and the buffer must resume where it left off -- both
the reader and writer perform a catch-up phase. For the writer, it
checks the current data file and tries to figure out if the last record
written matches where it believes it left off. For the reader, it
actually has to dynamically seek to where it left off within the
given data file, since we can't just open the file and start from the
beginning: data files are append-only.
As part of the seek logic, there's a loop where we just call
`Reader::next` until we read the record we supposedly left off on, and
then we know we're caught up. This loop only exits under two conditions:
- `self.last_reader_record_id < ledger_last` stops holding, which means
we've read the last record we left off on (`self.last_reader_record_id`
has caught up to `ledger_last`)
- `maybe_record.is_none() && self.last_reader_record_id == 0`, which
tells us that we reached EOF on the data file (no more records) and
nothing was ever in the file (`last_reader_record_id` still being 0)
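In sketch form, the loop described above looks something like this. This is a minimal, self-contained model with illustrative names (`seek`, `cursor`, a slice standing in for the data file), not the actual Vector source:

```rust
// Sketch of the seek loop's shape as described above. `records` plays the
// role of the data file's contents; `ledger_last` is the last record ID
// the ledger says the reader had read.
fn seek(records: &[u64], ledger_last: u64) -> u64 {
    let mut cursor = 0;
    let mut last_reader_record_id = 0u64;
    // First exit condition: we've caught up to the record we left off on.
    while last_reader_record_id < ledger_last {
        let maybe_record = records.get(cursor).copied();
        if let Some(id) = maybe_record {
            cursor += 1;
            last_reader_record_id = id;
        }
        // Second exit condition (the buggy one): EOF on a data file that
        // never yielded a single record.
        if maybe_record.is_none() && last_reader_record_id == 0 {
            break;
        }
    }
    last_reader_record_id
}

fn main() {
    // Happy path: the data file contains everything the ledger claims was
    // read, so the loop terminates with the reader fully caught up.
    let caught_up_at = seek(&[1, 2, 3, 4, 5], 5);
    println!("caught up at record {caught_up_at}");
}
```

On the happy path both the while-condition and the EOF check behave; the trouble described below only appears when the ledger is ahead of the data file.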
While the first condition is correct, the second one is not. The user
that originally reported the issue [said as
much](vectordotdev#17644 (comment)):
dropping the `&& self.last_reader_record_id == 0` check fixes the issue.
Specifically, there exists a scenario where Vector crashes and writes
that the reader had already read and acknowledged never actually make it
to disk. Both the reader and the writer are able to outpace the data on
disk, because the reader can read yet-to-be-flushed records while they
still exist as dirty pages in the page cache.
When this happens, the reader may have indicated to the ledger that it
read up to, for example, record ID 10, while the last record _on disk_
when Vector starts up is record ID 5. When the seek logic runs, it knows
the last read record ID was 10. It will do some number of reads while
seeking, eventually reading record ID 5, and updating
`self.last_reader_record_id` accordingly. On the next iteration of the
loop, it tries to read but hits EOF: the data file indeed has nothing
left. However, `self.last_reader_record_id < ledger_last` is still true
while `maybe_record.is_none() && self.last_reader_record_id == 0` is
not, as `self.last_reader_record_id` is set to `5`.
Alas, deadlock.
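The failure can be reproduced in a toy model of the loop (hypothetical names throughout, not the actual Vector code; an iteration cap stands in for what is, in the real code, an infinite spin):

```rust
// Toy reproduction of the deadlock: the ledger believes record ID 10 was
// read, but the crash left only records 1..=5 on disk. The iteration cap
// substitutes for what would otherwise loop forever.
fn seek_buggy(records: &[u64], ledger_last: u64, max_iters: u32) -> Result<u64, &'static str> {
    let mut cursor = 0;
    let mut last_reader_record_id = 0u64;
    let mut iters = 0u32;
    while last_reader_record_id < ledger_last {
        iters += 1;
        if iters > max_iters {
            return Err("deadlock: loop never terminates");
        }
        let maybe_record = records.get(cursor).copied();
        if let Some(id) = maybe_record {
            cursor += 1;
            last_reader_record_id = id; // after reading record 5, this is 5
        }
        // Buggy check: at EOF, last_reader_record_id is 5, not 0, so this
        // never fires -- and the while-condition (5 < 10) never fails.
        if maybe_record.is_none() && last_reader_record_id == 0 {
            break;
        }
    }
    Ok(last_reader_record_id)
}

fn main() {
    let on_disk = [1, 2, 3, 4, 5]; // records that actually made it to disk
    let ledger_last = 10;          // what the ledger claims was read
    println!("{:?}", seek_buggy(&on_disk, ledger_last, 1_000));
}
```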
## Solution
The solution is painfully simple, and the user that originally reported
the issue [said as
much](vectordotdev#17644 (comment)):
drop the `&& self.last_reader_record_id == 0` check.
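As a minimal sketch (again with hypothetical names, not the actual Vector source), the corrected break condition terminates cleanly even when the ledger is ahead of the data on disk:

```rust
// The same toy seek loop with the fix applied: EOF alone ends the seek.
fn seek_fixed(records: &[u64], ledger_last: u64) -> u64 {
    let mut cursor = 0;
    let mut last_reader_record_id = 0u64;
    while last_reader_record_id < ledger_last {
        let maybe_record = records.get(cursor).copied();
        if let Some(id) = maybe_record {
            cursor += 1;
            last_reader_record_id = id;
        }
        // Fixed check: hitting EOF is enough to stop seeking, even if the
        // ledger claims more records than the data file actually holds.
        if maybe_record.is_none() {
            break;
        }
    }
    last_reader_record_id
}

fn main() {
    // Crash scenario: the ledger says record ID 10, but only records
    // 1..=5 made it to disk. The loop now exits at EOF instead of spinning.
    let caught_up_at = seek_fixed(&[1, 2, 3, 4, 5], 10);
    println!("caught up at record {caught_up_at}");
}
```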
Given the loop's own condition, the inner check for
`self.last_reader_record_id == 0` was redundant... and, in the case of
missing writes, logically incorrect as well. I'm still not entirely sure
how the existing tests didn't already catch this, but the error was easy
enough to spot once I knew where to look. The unit test I added
convincingly demonstrated the broken behavior, and confirmed that the
change indeed fixes it.
## Reviewer Note(s)
I added two unit tests: one for the fix as shown and one for what I
thought was another bug. Turns out that the "other bug" wasn't a bug,
and this unit test isn't _explicitly_ required, but it's a simple
variation of other tests with a more straightforward invariant that it
tries to demonstrate, so I just left it in.
Fixes vectordotdev#17644.