Remove RewindBlockIndex logic #21009


Merged: 1 commit, Apr 27, 2021
28 changes: 8 additions & 20 deletions src/init.cpp
@@ -1698,29 +1698,17 @@ bool AppInitMain(const util::Ref& context, NodeContext& node, interfaces::BlockA
break;
}

bool failed_rewind{false};
// Can't hold cs_main while calling RewindBlockIndex, so retrieve the relevant
// chainstates beforehand.
for (CChainState* chainstate : WITH_LOCK(::cs_main, return chainman.GetAll())) {
if (!fReset) {
// Note that RewindBlockIndex MUST run even if we're about to -reindex-chainstate.
// It both disconnects blocks based on the chainstate, and drops block data in
// BlockIndex() based on lack of available witness data.
uiInterface.InitMessage(_("Rewinding blocks...").translated);
if (!chainstate->RewindBlockIndex(chainparams)) {
strLoadError = _(
"Unable to rewind the database to a pre-fork state. "
"You will need to redownload the blockchain");
failed_rewind = true;
break; // out of the per-chainstate loop
}
if (!fReset) {
LOCK(cs_main);
auto chainstates{chainman.GetAll()};
if (std::any_of(chainstates.begin(), chainstates.end(),
[&chainparams](const CChainState* cs) EXCLUSIVE_LOCKS_REQUIRED(cs_main) { return cs->NeedsRedownload(chainparams); })) {
strLoadError = strprintf(_("Witness data for blocks after height %d requires validation. Please restart with -reindex."),
chainparams.GetConsensus().SegwitHeight);
break;
}
}

if (failed_rewind) {
break; // out of the chainstate activation do-while
}

bool failed_verification = false;

try {
140 changes: 10 additions & 130 deletions src/validation.cpp
@@ -4431,143 +4431,23 @@ bool CChainState::ReplayBlocks(const CChainParams& params)
return true;
}

//! Helper for CChainState::RewindBlockIndex
void CChainState::EraseBlockData(CBlockIndex* index)
bool CChainState::NeedsRedownload(const CChainParams& params) const
{
AssertLockHeld(cs_main);
assert(!m_chain.Contains(index)); // Make sure this block isn't active

// Reduce validity
index->nStatus = std::min<unsigned int>(index->nStatus & BLOCK_VALID_MASK, BLOCK_VALID_TREE) | (index->nStatus & ~BLOCK_VALID_MASK);
// Remove have-data flags.
index->nStatus &= ~(BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO);
// Remove storage location.
index->nFile = 0;
index->nDataPos = 0;
index->nUndoPos = 0;
// Remove various other things
index->nTx = 0;
index->nChainTx = 0;
index->nSequenceId = 0;
// Make sure it gets written.
setDirtyBlockIndex.insert(index);
// Update indexes
setBlockIndexCandidates.erase(index);
auto ret = m_blockman.m_blocks_unlinked.equal_range(index->pprev);
while (ret.first != ret.second) {
if (ret.first->second == index) {
m_blockman.m_blocks_unlinked.erase(ret.first++);
} else {
++ret.first;
}
}
// Mark parent as eligible for main chain again
if (index->pprev && index->pprev->IsValid(BLOCK_VALID_TRANSACTIONS) && index->pprev->HaveTxsDownloaded()) {
setBlockIndexCandidates.insert(index->pprev);
}
}

bool CChainState::RewindBlockIndex(const CChainParams& params)
{
// Note that during -reindex-chainstate we are called with an empty m_chain!

// First erase all post-segwit blocks without witness not in the main chain,
// as this can we done without costly DisconnectTip calls. Active
// blocks will be dealt with below (releasing cs_main in between).
{
LOCK(cs_main);
for (const auto& entry : m_blockman.m_block_index) {
if (IsWitnessEnabled(entry.second->pprev, params.GetConsensus()) && !(entry.second->nStatus & BLOCK_OPT_WITNESS) && !m_chain.Contains(entry.second)) {
EraseBlockData(entry.second);
}
}
}
// At and above params.SegwitHeight, segwit consensus rules must be validated
CBlockIndex* block{m_chain.Tip()};
const int segwit_height{params.GetConsensus().SegwitHeight};

// Find what height we need to reorganize to.
CBlockIndex *tip;
int nHeight = 1;
{
LOCK(cs_main);
while (nHeight <= m_chain.Height()) {
// Although SCRIPT_VERIFY_WITNESS is now generally enforced on all
// blocks in ConnectBlock, we don't need to go back and
// re-download/re-verify blocks from before segwit actually activated.
if (IsWitnessEnabled(m_chain[nHeight - 1], params.GetConsensus()) && !(m_chain[nHeight]->nStatus & BLOCK_OPT_WITNESS)) {
break;
}
nHeight++;
}

tip = m_chain.Tip();
}
// nHeight is now the height of the first insufficiently-validated block, or tipheight + 1

BlockValidationState state;
// Loop until the tip is below nHeight, or we reach a pruned block.
while (!ShutdownRequested()) {
{
LOCK(cs_main);
LOCK(m_mempool.cs);
// Make sure nothing changed from under us (this won't happen because RewindBlockIndex runs before importing/network are active)
assert(tip == m_chain.Tip());
if (tip == nullptr || tip->nHeight < nHeight) break;
if (fPruneMode && !(tip->nStatus & BLOCK_HAVE_DATA)) {
// If pruning, don't try rewinding past the HAVE_DATA point;
// since older blocks can't be served anyway, there's
// no need to walk further, and trying to DisconnectTip()
// will fail (and require a needless reindex/redownload
// of the blockchain).
break;
}

// Disconnect block
if (!DisconnectTip(state, params, nullptr)) {
return error("RewindBlockIndex: unable to disconnect block at height %i (%s)", tip->nHeight, state.ToString());
}

// Reduce validity flag and have-data flags.
// We do this after actual disconnecting, otherwise we'll end up writing the lack of data
// to disk before writing the chainstate, resulting in a failure to continue if interrupted.
// Note: If we encounter an insufficiently validated block that
// is on m_chain, it must be because we are a pruning node, and
// this block or some successor doesn't HAVE_DATA, so we were unable to
// rewind all the way. Blocks remaining on m_chain at this point
// must not have their validity reduced.
EraseBlockData(tip);

tip = tip->pprev;
}
// Make sure the queue of validation callbacks doesn't grow unboundedly.
LimitValidationInterfaceQueue();

// Occasionally flush state to disk.
if (!FlushStateToDisk(params, state, FlushStateMode::PERIODIC)) {
LogPrintf("RewindBlockIndex: unable to flush state to disk (%s)\n", state.ToString());
return false;
}
}

{
LOCK(cs_main);
if (m_chain.Tip() != nullptr) {
// We can't prune block index candidates based on our tip if we have
// no tip due to m_chain being empty!
PruneBlockIndexCandidates();

CheckBlockIndex(params.GetConsensus());

// FlushStateToDisk can possibly read ::ChainActive(). Be conservative
// and skip it here, we're about to -reindex-chainstate anyway, so
// it'll get called a bunch real soon.
BlockValidationState state;
if (!FlushStateToDisk(params, state, FlushStateMode::ALWAYS)) {
LogPrintf("RewindBlockIndex: unable to flush state to disk (%s)\n", state.ToString());
return false;
}
while (block != nullptr && block->nHeight >= segwit_height) {
if (!(block->nStatus & BLOCK_OPT_WITNESS)) {
glozow (Member) commented on Apr 19, 2021:

I did a quick grep for BLOCK_OPT_WITNESS and came across #19806 which added a "fake" BLOCK_OPT_WITNESS "so that RewindBlockIndex() doesn't zealously unwind the assumed-valid chain." I don't have much background on AssumeUTXO but is there an interaction there? Should that be cleaned up as well?

Contributor commented:

Good call, @glozow. I'll make a note to look at whether that's still necessary after the merge of this PR.

// block is insufficiently validated for a segwit client
return true;
}
block = block->pprev;
}

return true;
return false;
}

void CChainState::UnloadBlockIndex() {
7 changes: 3 additions & 4 deletions src/validation.h
@@ -722,7 +722,9 @@ class CChainState

/** Replay blocks that aren't fully applied to the database. */
bool ReplayBlocks(const CChainParams& params);
bool RewindBlockIndex(const CChainParams& params) LOCKS_EXCLUDED(cs_main);

/** Whether the chain state needs to be redownloaded due to lack of witness data */
[[nodiscard]] bool NeedsRedownload(const CChainParams& params) const EXCLUSIVE_LOCKS_REQUIRED(cs_main);
/** Ensures we have a genesis block in the block tree, possibly writing one to disk. */
bool LoadGenesisBlock(const CChainParams& chainparams);

@@ -769,9 +771,6 @@ class CChainState

bool RollforwardBlock(const CBlockIndex* pindex, CCoinsViewCache& inputs, const CChainParams& params) EXCLUSIVE_LOCKS_REQUIRED(cs_main);

//! Mark a block as not having block data
void EraseBlockData(CBlockIndex* index) EXCLUSIVE_LOCKS_REQUIRED(cs_main);

void CheckForkWarningConditions() EXCLUSIVE_LOCKS_REQUIRED(cs_main);
void InvalidChainFound(CBlockIndex* pindexNew) EXCLUSIVE_LOCKS_REQUIRED(cs_main);

33 changes: 22 additions & 11 deletions test/functional/p2p_segwit.py
@@ -1956,22 +1956,33 @@ def test_non_standard_witness(self):
def test_upgrade_after_activation(self):
"""Test the behavior of starting up a segwit-aware node after the softfork has activated."""

self.restart_node(2, extra_args=["-segwitheight={}".format(SEGWIT_HEIGHT)])
# All nodes are caught up and node 2 is a pre-segwit node that will soon upgrade.
for n in range(2):
assert_equal(self.nodes[n].getblockcount(), self.nodes[2].getblockcount())
assert softfork_active(self.nodes[n], "segwit")
assert SEGWIT_HEIGHT < self.nodes[2].getblockcount()
assert 'segwit' not in self.nodes[2].getblockchaininfo()['softforks']

# Restarting node 2 should result in a shutdown because the blockchain consists of
# insufficiently validated blocks per segwit consensus rules.
self.stop_node(2)
with self.nodes[2].assert_debug_log(expected_msgs=[
f"Witness data for blocks after height {SEGWIT_HEIGHT} requires validation. Please restart with -reindex."], timeout=10):
self.nodes[2].start([f"-segwitheight={SEGWIT_HEIGHT}"])

# As directed, the user restarts the node with -reindex
self.start_node(2, extra_args=["-reindex", f"-segwitheight={SEGWIT_HEIGHT}"])

# With the segwit consensus rules, the node is able to validate only up to SEGWIT_HEIGHT - 1
assert_equal(self.nodes[2].getblockcount(), SEGWIT_HEIGHT - 1)
self.connect_nodes(0, 2)

# We reconnect more than 100 blocks, give it plenty of time
# sync_blocks() also verifies the best block hash is the same for all nodes
self.sync_blocks(timeout=240)

# Make sure that this peer thinks segwit has activated.
assert softfork_active(self.nodes[2], 'segwit')

# Make sure this peer's blocks match those of node0.
height = self.nodes[2].getblockcount()
while height >= 0:
block_hash = self.nodes[2].getblockhash(height)
assert_equal(block_hash, self.nodes[0].getblockhash(height))
assert_equal(self.nodes[0].getblock(block_hash), self.nodes[2].getblock(block_hash))
height -= 1
Contributor commented:

So based on the removal of this test code (and the SEGWIT_HEIGHT - 1 assertion above), the implication seems to be we expect users who are upgrading from pre-segwit to remove their datadir and sync to tip from scratch? Based on the assertions here, -reindex will not bring them to tip, so what's the point of telling the user to reindex at all? Or is this behavior specific to the functional tests, and in practice -reindex would bring the upgraded node to tip? (I'll need to review the reindex code; I can't remember what to expect there.)

If we expect users to remove their blocksdir before restarting with -reindex, you might consider mentioning that in the error message you've included in init.cpp.

Contributor commented:

Reindexing will result in the node discarding the blocks without witness after segwit activation and redownloading them from peers and validating. See #21009 (comment). The reason the node doesn't sync to tip here is because in our functional tests, the nodes don't make automatic connections to each other. As soon as the nodes are connected, the upgraded node will resync from its peers.

dhruv (Contributor, Author) commented on Apr 22, 2021:

Ah, I should have updated the PR description. Updated to now say:

This PR introduces NeedsRedownload() which merely checks for insufficiently validated segwit blocks and requests that the user restarts the node with -reindex. Reindexing the block files upon restart will make the node rebuild chain state and block index from the blk*.dat files on disk. The node won't be able to index the blocks with BLOCK_OPT_WITNESS, so they will be missing from the chain and be re-downloaded, with witness data.

Sorry for the confusion.

As for the -reindex code, the call path is: src/init.cpp:ThreadImport() -> validation.cpp:CChainState::LoadExternalBlockFile() -> validation.cpp:CChainState::AcceptBlock() -> validation.cpp:ContextualCheckBlock(). The last function tries to validate the witness commitments and is unable to, thereby causing the block not to be accepted into the chain upon a restart with -reindex.

Using combine_logs.py for test/functional/p2p_segwit.py you can see lines like:

 node2 2021-04-22T14:56:56.543661Z [loadblk] [util/system.h:52] [error] ERROR: AcceptBlock: bad-witness-nonce-size, ContextualCheckBlock : invalid witness reserved value size
 node2 2021-04-22T14:56:56.544812Z [loadblk] [util/system.h:52] [error] ERROR: AcceptBlock: bad-witness-nonce-size, ContextualCheckBlock : invalid witness reserved value size

Contributor commented:

Ah, thanks all. Makes sense.

# The upgraded node should now have segwit activated
assert softfork_active(self.nodes[2], "segwit")

@subtest # type: ignore
def test_witness_sigops(self):