-
Notifications
You must be signed in to change notification settings - Fork 0
Parallelize Soroban #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: parallel_txset_nomination
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't looked much into the transaction/operation changes yet, but I've already got some comments re ledger manager/application logic.
src/ledger/LedgerManagerImpl.cpp
Outdated
Hash subSeed = sorobanBasePrngSeed; | ||
// If tx can use the seed, we need to compute a sub-seed for it. | ||
if (tx->isSoroban()) | ||
#ifdef ENABLE_NEXT_PROTOCOL_VERSION_UNSAFE_FOR_PRODUCTION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need any ifdef guards for this change:
- I don't think it interacts with any XDR changes at all
- It's also highly unlikely to get into an intermediate release, so there is no need to do that even if we wanted to be extra cautious
- The compile-time checks are actually arguably less safe as they defer the discovery of potential breakages in the current protocol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Removed.
src/util/numeric.h
Outdated
@@ -55,4 +55,6 @@ bool bigDivideUnsigned(uint64_t& result, uint64_t A, uint64_t B, uint64_t C, | |||
|
|||
// This only implements ROUND_DOWN | |||
uint64_t bigSquareRoot(uint64_t a, uint64_t b); | |||
|
|||
template <typename T> T add_sat(T a, T b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: inconsistent naming style (addSaturating
or saturatingAdd
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to saturatingAdd
, but after the ttl logic change, it's actually no longer used, so I might just remove it.
src/ledger/LedgerManagerImpl.cpp
Outdated
|
||
auto const& thread = stage.at(i); | ||
|
||
roTtlDeltas.emplace_back(std::async( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default policy for std::async
is std::launch::async | std::launch::deferred
. Since you're calling the get
on the futures one-by-one, it seems to me that you'll end up spawning threads sequentially (according to the deferred
spec, the thread is only launched when you first try to get the future value). Probably we should specify the policy to be just async
, but I might be misunderstanding the spec as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I verified that execution was interleaved, but it sounds like the behavior here is implementation specific. Anyways, I remove the async
usage for thread
.
src/ledger/LedgerManagerImpl.cpp
Outdated
auto currLiveUntil = | ||
ltxe.current().data.ttl().liveUntilLedgerSeq; | ||
|
||
releaseAssertOrThrow( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assertion is not correct, is it? One can delete an entry in tx 1, and then re-create it with smaller TTL in tx 2. We only enforce non-decreasing TTL per transaction, but not across transactions. It's probably worth adding a respective test as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Fixed and added a test.
src/ledger/LedgerManagerImpl.cpp
Outdated
} | ||
} | ||
|
||
for (auto const& kvp : cumulativeRoTtlDeltas) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I haven't realized with the new TTL algorithm is tx meta. We might need to rethink how we emit TTL meta now. In the worst case we'd need to switch back from deltas to the max value, but maybe we can come up with something better...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the ttl reconciliation logic.
src/ledger/LedgerManagerImpl.cpp
Outdated
|
||
if (opType != RESTORE_FOOTPRINT && lk.type() == TTL && | ||
it->second.mLedgerEntry && updatedLe && | ||
isReadOnlyTTLIt != isReadOnlyTTLMap.end() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entry can't be missing from the map, can it? Maybe an assertion would be more appropriate to ensure we didn't mess up anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an assert. I still need to go over all of the asserts to make sure they're safe and the ones that throw aren't caught when they shouldn't be.
src/ledger/LedgerManagerImpl.cpp
Outdated
|
||
auto isReadOnlyTTLIt = isReadOnlyTTLMap.find(lk); | ||
|
||
if (opType != RESTORE_FOOTPRINT && lk.type() == TTL && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's possible for restore op to modify TTL of RO entries, so not sure if the first condition is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this an outdated change that was no longer relevant. Removed.
src/ledger/LedgerManagerImpl.cpp
Outdated
break; | ||
} | ||
|
||
auto ltxe = ltx.loadWithoutRecord(lk); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we also populate the ledger load metrics here? (currently these are a part of the soroban ops implementation).
Also, I see the code for these metrics remains in the operations, but in case if the limit is exceeded, we will never reach the operation code at all (so at least the error handling code in the operations seems redundant).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/ledger/LedgerManagerImpl.cpp
Outdated
switch (opType) | ||
{ | ||
case INVOKE_HOST_FUNCTION: | ||
sorobanOpResult.tr().invokeHostFunctionResult().code( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to populate the respective events here, don't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
src/ledger/LedgerManagerImpl.cpp
Outdated
LedgerManagerImpl::collectEntries(AbstractLedgerTxn& ltx, Thread const& txs) | ||
{ | ||
ThreadEntryMap entryMap; | ||
auto getEntries = [&](TransactionFrameBasePtr tx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left some comments regarding this implementation below, but I think this whole method might benefit from factoring out the operation-specific logic into the operations, i.e. add a function like 'preloadEntries' or something like that, and add the respective metrics/error handling there. If there is too much duplication, maybe consider factoring out the common logic into a separate function as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored a bit
src/ledger/LedgerManagerImpl.cpp
Outdated
auto it = entryMap.find(lk); | ||
releaseAssertOrThrow(it != entryMap.end()); | ||
|
||
auto opType = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused
ba6137b
to
8afa754
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrt the state archival bits, I think we can remove meta generation and most archival meta info from the ltx. This is the end state though. If we need to keep some legacy ltx stuff around during development, that's fine, but I'd like to move away from generating restore meta via the ltx regardless, since this probably simplifies auto restore changes coming up.
// right before we cal into the invariants. | ||
} | ||
|
||
if (op->getOperation().body.type() == RESTORE_FOOTPRINT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be easier just to build both the restore and op meta above and consolidate the logic. The only reason we had the post process function before was due to the limitations of ltx. Now that we have to build all meta ourselves anyway without the ltx, this post processing step is unnecessary. I think this may help simplify auto restore too.
src/ledger/LedgerManagerImpl.cpp
Outdated
{ | ||
if (key.type() != TTL) | ||
{ | ||
ltxInner.addRestoredFromLiveBucketListKey(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to get rid of addRestoredFromLiveBucketListKey
and LedgerTxn::mRestoredKeys.liveBucketList
entirely. The only reason we need to store liveBucketList restoration events in the ltx is for the meta post processing step. However, if we do that without the ltx, this is no longer necessary.
We still need to keep track of hotArchiveBucketList restores at the ltx level though. Maybe we can rename this to something like "removeKeyFromHotArchive." If we no longer generate restore meta via ltx, we just need to know what keys to delete from the HotArchive on ledger close.
d73d552
to
c59330c
Compare
auto callback = [&metrics, &sorobanData, &resources, &res, | ||
&config](LedgerKey const& lk, uint32_t entrySize) -> bool { | ||
metrics.mLedgerReadByte += entrySize; | ||
if (resources.readBytes < metrics.mLedgerReadByte) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This flow needs to be updated for extend and restore. We shouldn't load entries in certain cases (like if the entry has been archived here).
a893b04
to
b38da27
Compare
This is a relatively simple algorithm, but it should be serviceable for traffic with relatively low amount of transitive IO conflicts.
008b38c
to
e98138f
Compare
Description
Checklist
clang-format
v8.0.0 (viamake format
or the Visual Studio extension)