Improve gradual compaction #401

Merged

@uprendis uprendis commented Dec 9, 2022

  • Compaction now tries to recursively detect not only upper-level DB tables but also subtables, based on the number of non-empty prefixes in the current domain: if the number of prefixes N satisfies 1 < N < 50, we are probably looking at N tables. As in the previous version (Gradual compaction #384), each detected table is split into S steps (S depends on the range between the first and last keys of the prefix) and compacted step by step.
  • The number of compaction steps is scaled based on the total DB size, which is estimated via Stat. Steps that are too short make compaction ineffective for small DBs.
  • Compaction steps are 4x larger for the db compact command and the debug_chaindbCompact API call.
  • "Nicer" logging during compaction.
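
To illustrate the range-splitting idea from the first bullet, here is a minimal sketch, not the code from this PR. It assumes keys can be compared by their first 8 bytes read as a big-endian number; the names keyRange, splitRange, keyToUint64 and uint64ToKey are hypothetical. Each resulting chunk would then be handed to the store's range-compaction call, one chunk per step.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// keyRange is one compaction step: the keys between Start and Limit.
type keyRange struct{ Start, Limit []byte }

// keyToUint64 reads up to the first 8 bytes of key as a big-endian number,
// right-padding shorter keys with zero bytes (an assumption of this sketch).
func keyToUint64(key []byte) uint64 {
	var buf [8]byte
	copy(buf[:], key)
	return binary.BigEndian.Uint64(buf[:])
}

// uint64ToKey encodes v as an 8-byte big-endian key.
func uint64ToKey(v uint64) []byte {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, v)
	return buf
}

// splitRange divides the span between first and last into at most
// `steps` contiguous chunks of roughly equal numeric width.
func splitRange(first, last []byte, steps int) []keyRange {
	lo, hi := keyToUint64(first), keyToUint64(last)
	if steps < 1 || hi <= lo {
		return []keyRange{{Start: first, Limit: last}}
	}
	width := (hi - lo) / uint64(steps)
	if width == 0 {
		width = 1
	}
	var out []keyRange
	for start := lo; start < hi; {
		end := start + width
		// Clamp to hi, guard against overflow, and let the
		// final chunk absorb the division remainder.
		if end > hi || end < start || len(out) == steps-1 {
			end = hi
		}
		out = append(out, keyRange{uint64ToKey(start), uint64ToKey(end)})
		start = end
	}
	return out
}

func main() {
	chunks := splitRange([]byte{0, 0, 0, 1}, []byte{0, 0, 0x0F, 0xFF}, 4)
	for i, c := range chunks {
		fmt.Printf("step %d: %x .. %x\n", i, c.Start, c.Limit)
	}
}
```

Splitting by numeric distance rather than by key count avoids iterating over the table just to find chunk boundaries, at the cost of uneven chunks when keys are not uniformly distributed.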

@uprendis uprendis requested review from hadv and rus-alex December 9, 2022 03:29
@uprendis uprendis requested a review from andrecronje as a code owner December 9, 2022 03:29
@uprendis uprendis mentioned this pull request Dec 9, 2022
if len(nonEmptyPrefixes) == 0 {
	return nil
}
if len(nonEmptyPrefixes) != 1 && len(nonEmptyPrefixes) < 50 {

@hadv hadv Dec 9, 2022

If I understand correctly, when len(nonEmptyPrefixes) >= 50 we compact the DB the same way as when len(nonEmptyPrefixes) == 1, i.e. at the step marked // once a table is located, split the range into *iters* chunks for compaction. Could you explain this a little more, please?

@uprendis uprendis Dec 9, 2022

The check distinguishes 3 cases, based on the number of non-empty prefixes:
== 1:
000001
000002
...
000FFF

It's probably a single table. If keys are numbers, then most likely they will all be behind the 00 prefix. An example is the epochs storage, which would be behind the 00 prefix because the first byte is 0, unless there are more than 2^24 epochs.

> 1, < 50:
It's probably a collection of tables. An example is the gossip DB, which has 34 tables.

>= 50:
It's probably a single table: that's more prefixes than there are tables in Opera. An example is a collection of transactions or MPTs; both would have 256 non-empty prefixes.
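
The three cases above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the PR's implementation: countNonEmptyPrefixes, classify and epochKey are hypothetical helpers, prefixes are taken to be the first key byte, and epoch keys are assumed to be 4-byte big-endian numbers. Only the condition in classify is taken from the reviewed line.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// maxTables mirrors the "< 50" bound in the reviewed condition.
const maxTables = 50

// countNonEmptyPrefixes returns how many distinct first bytes occur among keys.
func countNonEmptyPrefixes(keys [][]byte) int {
	var seen [256]bool
	n := 0
	for _, k := range keys {
		if len(k) == 0 {
			continue
		}
		if !seen[k[0]] {
			seen[k[0]] = true
			n++
		}
	}
	return n
}

// classify applies the three-way split described in the reply.
func classify(nonEmpty int) string {
	switch {
	case nonEmpty == 0:
		return "empty"
	case nonEmpty != 1 && nonEmpty < maxTables:
		return "collection of tables"
	default: // exactly 1 prefix, or 50 and more
		return "single table"
	}
}

// epochKey encodes an epoch as a 4-byte big-endian key (assumed layout).
func epochKey(epoch uint32) []byte {
	key := make([]byte, 4)
	binary.BigEndian.PutUint32(key, epoch)
	return key
}

func main() {
	// Epochs 1..0xFFF all start with byte 0x00, so only one prefix is non-empty.
	var epochs [][]byte
	for e := uint32(1); e <= 0xFFF; e++ {
		epochs = append(epochs, epochKey(e))
	}
	n := countNonEmptyPrefixes(epochs)
	fmt.Println(n, classify(n)) // 1 single table
	fmt.Println(classify(34))   // collection of tables
	fmt.Println(classify(256))  // single table
}
```

With this key layout a second non-empty prefix only appears once an epoch number reaches 2^24, which is why a numerically keyed table normally shows up as a single prefix.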

@uprendis uprendis merged commit 55ee9e8 into Fantom-foundation:develop Dec 12, 2022