Skip to content

Make db-operations for all disk operations async and spawn-blocking #2402

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Tracked by #2259
Arkenan opened this issue Apr 4, 2025 · 0 comments · Fixed by #2336
Closed
Tracked by #2259

Make db-operations for all disk operations async and spawn-blocking #2402

Arkenan opened this issue Apr 4, 2025 · 0 comments · Fixed by #2336
Assignees

Comments

@Arkenan
Copy link
Collaborator

Arkenan commented Apr 4, 2025

No description provided.

@Arkenan Arkenan mentioned this issue Apr 4, 2025
11 tasks
@Arkenan Arkenan assigned Arkenan and Oppen and unassigned Arkenan Apr 4, 2025
@Arkenan Arkenan moved this to In Progress in ethrex_l1 Apr 4, 2025
@Arkenan Arkenan moved this from In Progress to In Review in ethrex_l1 Apr 4, 2025
github-merge-queue bot pushed a commit that referenced this issue Apr 4, 2025
**Motivation**

Some of our sync APIs can produce starving when running on Tokio due to
taking a long time to reach the next `await`-point.
Specifically, writing to the DB tends to take a long time, which blocks
other tasks, sometimes the whole runtime due to how the scheduler in
Tokio works.
Thus, we need a way to inform the runtime we're going to be working for
a while, and give it control while we wait for stuff.

**Description**

Take the mutable APIs for the DB and mark them `async`. Then bubble that
up to their users. Then make the functions non-blocking by using
`spawn_blocking` to run on the blocking thread, releasing the runtime to
handle more work.
The DB writing APIs had to change to pass-by-value to satisfy the
borrow-checker in the blocking task context. I think I can use proper
lifetime bounds with a helper crate, if that's preferred. The values
were already being discarded after passing to the DB, so passing by
value should not be a problem either way.

Special considerations:
- For some work performed before benchmarks and EF tests which are
inherently synchronous I opted for calling with an ad-hoc runtime
instance and `block_on`, as that might reduce the changes needed by
localizing the async work. If desired, that can be changed up to making
a `tokio::main`. The same is true for some setup functions for tests.
- For the DBs I had to separate the Tokio import. This is because they
need to compile with L2, which means provers' custom compilers, and
those don't support the networking functions in the stdlib, which Tokio
with full features (as the workspace dep declares) brings them in.
- The InMemoryDB was left untouched other than updating the interfaces,
given hashmap access should be quick enough.
- I need to comment on [this
hack](https://github.com/lambdaclass/ethrex/pull/2336/files#diff-264636d3ee6ee67bd6e136b8c98f74152de6a8e2a07f597cfb5f622d4e0d815aR143-R146):
`and_then` can't be used on futures and everything became a mess without
that little helper.
- I'm unsure about whether or not we also want to cover the read APIs,
at least for consistency I would think so, but for now I left them out.

closes #2402
@github-project-automation github-project-automation bot moved this from In Review to Done in ethrex_l1 Apr 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants