
feat: 8841: Need a way to change the number of object key buckets in MerkleDb #18910


Open

wants to merge 10 commits into main

Conversation

@artemananiev (Contributor) commented Apr 18, 2025

Fix summary:

  • A new method is introduced to HalfDiskHashMap: resizeIfNeeded()
  • This method is called after every flush to the MerkleDb data source. It checks whether the average bucket entry count in the HDHM exceeds a threshold and doubles the number of buckets if needed
  • A new method is introduced to Bucket: sanitize() (see the sketch after this list)
  • This method checks the bucket's index and removes all stale entries, i.e. entries that were valid for this bucket before the HDHM resize but now belong to a different bucket
  • Changed how MerkleDb path and bucket index sizes are calculated when a new MerkleDb instance is created
  • Many VirtualMap size hints in Hedera services are updated to better fit this new calculation
  • New unit tests are introduced to cover HDHM.resizeIfNeeded()
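
As a rough illustration of the stale-entry check, here is a minimal sketch, assuming power-of-two bucket counts where a bucket's index is the low bits of the key hash; the list-based layout and names are illustrative, not the actual Bucket API:

import java.util.List;

// Illustrative only: the real Bucket stores serialized entries, not a List.
final class BucketSanitizeSketch {
    /**
     * Keeps only the entries whose key hash still maps to this bucket's index.
     * After the bucket count doubles from N to 2N, an entry that used to map
     * to bucket b may now map to bucket b + N and becomes stale in bucket b.
     */
    static void sanitize(final List<Integer> entryHashes, final int bucketIndex, final int numBuckets) {
        // numBuckets is a power of two, so (hash & (numBuckets - 1)) == Math.floorMod(hash, numBuckets)
        entryHashes.removeIf(hash -> (hash & (numBuckets - 1)) != bucketIndex);
    }
}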

Fixes: #8841
Signed-off-by: Artem Ananev [email protected]

@artemananiev added the labels Platform Virtual Map, Platform Data Structures, Platform (Tickets pertaining to the platform), and Scalability (Issues related to the scalability problems) on Apr 18, 2025
@artemananiev self-assigned this on Apr 18, 2025
@jasperpotts (Contributor) commented:

I think I get the idea. I was puzzled how you double the size of the index with concurrent access, but I think I understand: index expansion only happens on startup? And because it is fast, it should not have a big impact on start-up time. I wonder if there is a way to expand while reading from disk without being much slower, rather than needing enough RAM for both the new and old indexes.

It would be really nice to key this off heuristics, so that we record the number of buckets that have overflowed, or the number of buckets that have reached >= 90% full. Then, if more than 20% of buckets have passed that threshold, we double the map size on the next restart. That way, no operations monitoring or config changes are needed to maintain size. This would be great for non-mainnet networks, as they could be configured with minimal sizes and expand as needed, as long as they are restarted. We could add an atomic reference to the index and do compare-and-swap to allow for live index doubling, but I'm not sure if that is too expensive performance-wise.
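
A rough sketch of what such a heuristic could look like; the class, method names, and thresholds below are hypothetical illustrations, not existing HDHM code:

import java.util.concurrent.atomic.AtomicLong;

// Hypothetical fullness tracker. Simplified: a bucket flushed twice past the
// threshold is counted twice, which a real implementation would deduplicate.
final class BucketFullnessStats {
    private final AtomicLong nearlyFullBuckets = new AtomicLong();

    /** Called whenever a bucket is flushed to disk. */
    void onBucketFlushed(final int entryCount, final int capacity) {
        if (entryCount * 10L >= capacity * 9L) { // bucket is >= 90% full
            nearlyFullBuckets.incrementAndGet();
        }
    }

    /** True once more than 20% of all buckets have crossed the 90% line. */
    boolean shouldDoubleOnRestart(final long totalBuckets) {
        return nearlyFullBuckets.get() * 5 > totalBuckets;
    }
}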

@jasperpotts (Contributor) commented:

I think we should strive for the goal of automatic management, with minimal operations effort needed. Or what Oracle would call "Autonomous" 😀

@artemananiev (Author) commented:

> It would be really nice to key this off heuristics, so that we record the number of buckets that have overflowed, or the number of buckets that have reached >= 90% full. Then, if more than 20% of buckets have passed that threshold, we double the map size on the next restart. That way, no operations monitoring or config changes are needed to maintain size. This would be great for non-mainnet networks, as they could be configured with minimal sizes and expand as needed, as long as they are restarted. We could add an atomic reference to the index and do compare-and-swap to allow for live index doubling, but I'm not sure if that is too expensive performance-wise.

Automatic index resizing on startup looks doable, but I'd like to track it in a separate ticket. We currently don't collect stats on individual buckets, for example the number of buckets >= 90% full. All we have is the total number of entries and the current number of buckets. With a really bad hash function, it's possible that the map is only 25% full, yet more than 20% of buckets have too many entries.

@artemananiev (Author) commented:

> I think I get the idea. I was puzzled how you double the size of the index with concurrent access, but I think I understand: index expansion only happens on startup? And because it is fast, it should not have a big impact on start-up time. I wonder if there is a way to expand while reading from disk without being much slower, rather than needing enough RAM for both the new and old indexes.

Yes, the idea is to resize at startup only. There will be no point in time when both the old and new indexes are in memory. HDHM should detect that a resize is needed before its index is loaded from a snapshot.

Resizing on the fly is more complicated, but doesn't provide any benefits. When a bucket is loaded from disk (e.g. for an update during flush) and it isn't found, is that because it's an empty bucket, or because its entries are still in the N-1-bit "old" bucket? The same checks need to be done on reads, too. When a bucket is finally updated, its data location in the index is set, and according to our current implementation the whole index up to that point is allocated in memory anyway.
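
To make that ambiguity concrete, a small worked example (plain bit arithmetic, not MerkleDb code): doubling a power-of-two bucket count adds one index bit, so every bucket that exists only after the resize has exactly one "old" counterpart.

final class ResizeAmbiguityExample {
    public static void main(final String[] args) {
        final int oldNumBuckets = 32_768;  // bucket count before doubling
        final int newBucket = 40_000;      // exists only after doubling to 65_536
        // Mask off the new top bit to find where the entries still live:
        final int oldBucket = newBucket & (oldNumBuckets - 1);
        // 40_000 & 32_767 == 7_232, so a read that finds bucket 40_000 missing
        // on disk cannot tell "empty" from "not migrated from bucket 7_232 yet".
        System.out.println("old bucket: " + oldBucket); // prints 7232
    }
}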

codacy-production bot commented May 15, 2025

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation: +0.03% (target: -1.00%)
Diff coverage: 84.02%

Coverage variation details:

                                    Coverable lines    Covered lines    Coverage
Common ancestor commit (d41653d)    100615             73744            73.29%
Head commit (4844b43)               100742 (+127)      73865 (+121)     73.32% (+0.03%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details:

                         Coverable lines    Covered lines    Diff coverage
Pull request (#18910)    169                142              84.02%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified> / <coverable lines added or modified> * 100%

codecov bot commented May 15, 2025

Codecov Report

Attention: Patch coverage is 75.14793% with 42 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...wirlds/merkledb/files/hashmap/HalfDiskHashMap.java 81.08% 11 Missing and 3 partials ⚠️
...ava/com/swirlds/merkledb/files/hashmap/Bucket.java 70.45% 9 Missing and 4 partials ⚠️
...java/com/swirlds/merkledb/MerkleDbTableConfig.java 71.42% 2 Missing and 4 partials ⚠️
...wirlds/merkledb/MerkleDbCompactionCoordinator.java 40.00% 1 Missing and 2 partials ⚠️
.../java/com/swirlds/merkledb/MerkleDbDataSource.java 75.00% 1 Missing and 1 partial ⚠️
...ava/com/swirlds/merkledb/files/DataFileCommon.java 0.00% 2 Missing ⚠️
...rlds/demo/migration/MigrationTestingToolState.java 0.00% 1 Missing ⚠️
...m/swirlds/merkledb/files/hashmap/ParsedBucket.java 90.00% 0 Missing and 1 partial ⚠️


@@             Coverage Diff              @@
##               main   #18910      +/-   ##
============================================
+ Coverage     69.34%   69.38%   +0.03%     
- Complexity    23260    23289      +29     
============================================
  Files          2645     2644       -1     
  Lines        100665   100792     +127     
  Branches      10399    10425      +26     
============================================
+ Hits          69811    69932     +121     
- Misses        26891    26897       +6     
  Partials       3963     3963              
Files with missing lines Coverage Δ Complexity Δ
...hedera/node/app/hints/schemas/V059HintsSchema.java 100.00% <ø> (ø) 4.00 <0.00> (ø)
...hedera/node/app/hints/schemas/V060HintsSchema.java 84.21% <ø> (ø) 8.00 <0.00> (ø)
...e/consensus/impl/schemas/V0490ConsensusSchema.java 100.00% <ø> (ø) 4.00 <0.00> (ø)
...app/service/file/impl/schemas/V0490FileSchema.java 86.54% <ø> (+1.46%) 56.00 <0.00> (+2.00)
...ice/schedule/impl/schemas/V0490ScheduleSchema.java 100.00% <ø> (ø) 7.00 <0.00> (ø)
...ice/schedule/impl/schemas/V0570ScheduleSchema.java 100.00% <ø> (ø) 8.00 <0.00> (ø)
...ice/contract/impl/schemas/V0490ContractSchema.java 100.00% <ø> (ø) 6.00 <0.00> (ø)
...p/service/token/impl/schemas/V0490TokenSchema.java 95.45% <ø> (ø) 4.00 <0.00> (ø)
...p/service/token/impl/schemas/V0530TokenSchema.java 100.00% <ø> (ø) 3.00 <0.00> (ø)
...va/com/swirlds/merkledb/config/MerkleDbConfig.java 92.30% <ø> (ø) 5.00 <0.00> (ø)
... and 11 more

... and 48 files with indirect coverage changes



@artemananiev requested review from akugal, thenswan and anthony-swirldslabs and removed the request for jasperpotts on May 17, 2025
@artemananiev added this to the v0.63 milestone on May 17, 2025
@artemananiev changed the title from "draft: 8841: Need a way to change the number of object key buckets in MerkleDb" to "feat: 8841: Need a way to change the number of object key buckets in MerkleDb" on May 17, 2025
@artemananiev marked this pull request as ready for review on May 17, 2025
@artemananiev requested reviews from a team and tinker-michaelj as code owners on May 17, 2025
 */
public void resizeIfNeeded(final long firstLeafPath, final long lastLeafPath) {
    final long currentSize = lastLeafPath - firstLeafPath + 1;
    if (currentSize / numOfBuckets.get() <= goodAverageBucketEntryCount * 80 / 100) {

@artemananiev (Author) commented:

This formula should be discussed; I'm open to any feedback. Here is how it's currently coded. Assume the VM size hint was originally 1M. With default configs, that results in 32K buckets created. Once the VM actually grows to 80% of the original size hint, that is, to 800K, HDHM will double the number of buckets to 64K. When the VM grows to 1600K, the number of buckets will grow to 128K. And so on.
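
The same numbers as a worked example, assuming goodAverageBucketEntryCount is 32 (the 1M hint divided by the initial 32K buckets); the names mirror the snippet above but the values are illustrative:

final class ResizeThresholdExample {
    public static void main(final String[] args) {
        final long goodAverageBucketEntryCount = 32; // assumed: 1M hint / 32K buckets
        final long numOfBuckets = 32_768;
        // Resize is skipped while the average stays at or below 80% of "good":
        final long threshold = goodAverageBucketEntryCount * 80 / 100; // = 25
        // 25 entries/bucket * 32_768 buckets = 819_200 entries, i.e. roughly
        // the 800K mentioned above; the first flush past that doubles to 64K buckets.
        System.out.println("double buckets once size exceeds " + threshold * numOfBuckets);
    }
}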

@@ -106,7 +106,7 @@ private static final class ClassVersion {
     /**
      * Max number of keys that can be stored in a table.
      */
-    private long maxNumberOfKeys = 0;
+    private long initialCapacity = 0;

@artemananiev (Author) commented:

A long time ago, this was just a hint used to calculate the (unchangeable) number of HDHM buckets. Then the hint was changed to a hard limit. Now it's a hint again, and I'm renaming it to "initialCapacity" (similar to HashMap) to reflect that.

private static final long MAX_TOKENS = 1_000_000L;
private static final long MAX_ACCOUNTS = 1_000_000L;
private static final long MAX_TOKEN_RELS = 1_000_000L;
private static final long MAX_MINTABLE_NFTS = 1_000_000L;

@artemananiev (Author) commented:

When a node starts from an existing state snapshot, these hints are ignored. All mainnet (and testnet, too) nodes have snapshots with 32M key buckets in each of these four large tables, which corresponds to a 1B hint. This will not change.

For new runs (unit tests, benchmarks, HashSphere, etc.), nodes will start with 32K key buckets and grow that number as needed as entries are added to virtual maps.
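
For reference, the 32M-buckets-per-1B-hint correspondence follows from the same sizing rule, assuming ~32 entries per bucket and rounding up to a power of two: 1_000_000_000 / 32 = 31_250_000, and the next power of two is 2^25 = 33_554_432, i.e. 32M buckets. Likewise, a 1M hint yields 1_000_000 / 32 = 31_250, rounded up to 2^15 = 32_768, i.e. the 32K buckets mentioned above.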

@@ -70,8 +70,7 @@
  */
 @ConfigData("merkleDb")
 public record MerkleDbConfig(
-        @Positive @ConfigProperty(defaultValue = "500000000") long maxNumOfKeys,
-        @Positive @ConfigProperty(defaultValue = "" + 4_000_000_000L) long size,
+        @Positive @ConfigProperty(defaultValue = "4000000000") long maxNumOfKeys,

@artemananiev (Author) commented:

maxNumOfKeys previously was not a "max"; it was just a default value used as the virtual map size hint in tests and benchmarks. I have no idea where the old value, 500_000_000, came from.

Now this config is used as a real hard limit on the number of entries that can be stored in a single MerkleDb table.
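
As a sketch of what enforcing such a hard limit could look like (a hypothetical guard, not the actual MerkleDb code):

// Hypothetical capacity guard for a MerkleDb table.
final class CapacityGuardSketch {
    /** Throws if the table has already reached the configured hard limit. */
    static void checkCapacity(final long currentEntries, final long maxNumOfKeys) {
        if (currentEntries >= maxNumOfKeys) {
            throw new IllegalStateException(
                    "MerkleDb table is full: " + currentEntries + " >= max " + maxNumOfKeys);
        }
    }
}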

Labels
Platform Data Structures, Platform Virtual Map, Platform (Tickets pertaining to the platform), Scalability (Issues related to the scalability problems)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Need a way to change the number of object key buckets in MerkleDb
2 participants