Skip to content

storage: fix index state truncate overflow #26051

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 19, 2025

Conversation

andrwng
Copy link
Contributor

@andrwng andrwng commented May 5, 2025

Similar to #25979, when an index contains state that leads it to overflow, it is unreliable. In this case, this could affect truncation, and result in us removing incorrect index entries.

So, this makes the truncate avoid removing entries if this case is detected, with the expectation that no entries are added that have overflowed.

Note, this bug doesn't appear to be as severe as #25979 in that:

  • suffix truncation is generally more rare
  • the overflowing index problem is generally a problem with:
    • large segments, much larger than the default allowed size
    • compacted segments, which we expect to be relatively rare to suffix truncate with tiered storage

Nonetheless, some kind of fix seems appropriate because it's difficult to reason about the overflow otherwise.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.1.x
  • v24.3.x
  • v24.2.x

Release Notes

  • None

@andrwng
Copy link
Contributor Author

andrwng commented May 5, 2025

Don't love the approach, open to alternative suggestions.

Similar to redpanda-data#25979, when an
index contains state that leads it to overflow, it is unreliable. In
this case, this could affect truncation, and result in us removing
incorrect index entries.

So, this makes the truncate avoid removing entries if this case is
detected, with the expectation that no entries are added that have
overflowed.

Note, this bug doesn't appear to be as severe as redpanda-data#25979 in that:
- suffix truncation is generally more rare
- the overflowing index problem is generally a problem with:
  - large segments, much larger than the default allowed size
  - compacted segments, which we expect to be relatively rare to
    suffix truncate with tiered storage

Nonetheless, some kind of fix seems appropriate because it's difficult
to reason about the overflow otherwise.
@andrwng andrwng force-pushed the storage-index-overflow-truncate branch from 7e18ed3 to e882d67 Compare May 5, 2025 20:18
Copy link
Contributor

@bharathv bharathv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm .. not too familiar with this code but looks correct to me .. would prefer another pair of 👓

@vbotbuildovich
Copy link
Collaborator

CI test results

test results on build#65554
test_id test_kind job_url test_status passed
rptest.tests.basic_kafka_compat_test.TxKafkaCompatTest.test_concurrent_producer_ids.metadata_quorum=COMBINED_KRAFT ducktape https://buildkite.com/redpanda/redpanda/builds/65554#0196a26e-89da-4988-9dd6-7b6076b7f916 FLAKY 20/21
rptest.tests.partition_balancer_test.PartitionBalancerTest.test_rack_awareness ducktape https://buildkite.com/redpanda/redpanda/builds/65554#0196a26a-c285-45e4-b687-b45dc9f0a009 FAIL 0/1
rptest.tests.partition_move_interruption_test.PartitionMoveInterruption.test_cancelling_partition_move.replication_factor=3.unclean_abort=True.recovery=restart_recovery.compacted=False ducktape https://buildkite.com/redpanda/redpanda/builds/65554#0196a26e-89da-4988-9dd6-7b6076b7f916 FLAKY 20/21
rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=True.mixed_versions=False.with_tiered_storage=False.with_iceberg=False.with_chunked_compaction=True.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/65554#0196a26a-c285-45e4-b687-b45dc9f0a009 FLAKY 17/18
rptest.tests.scaling_up_test.ScalingUpTest.test_scaling_up_with_recovered_topic ducktape https://buildkite.com/redpanda/redpanda/builds/65554#0196a26a-c283-427f-8df0-c0afafc17b35 FLAKY 10/21

// added to the index, unless by an older buggy version of Redpanda.
// With this in mind, either there are no entries above u32_max, or
// offset_lower_bound() isn't going to work correctly anyway, so we just
// skip removal and rely on queries to detect the overflow.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we just skip removal and rely on queries to detect the overflow.

i'm not clear on what this means or how it works. the confusion i have is something like: if the truncation request is legit, but we ignore it because of the overflow, then how do we know that a later query would not be affected by having dropped a truncation request? presumably we rely on suffix truncation for raft correctness?

Copy link
Contributor Author

@andrwng andrwng May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that if entries aren't truncated here, it's because we are attempting to truncate to something beyond the uint32 delta limit. If that's the case:

Does that answer your question? Or are you getting at that there's still an edge case where a later query returns the wrong thing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it looks ok. i'm less concerned about my question after read it again today

@andrwng andrwng merged commit 31fa920 into redpanda-data:dev May 19, 2025
19 checks passed
@vbotbuildovich
Copy link
Collaborator

/backport v25.1.x

@vbotbuildovich
Copy link
Collaborator

/backport v24.3.x

@vbotbuildovich
Copy link
Collaborator

/backport v24.2.x

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v24.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-26051-v24.2.x-295 remotes/upstream/v24.2.x
git cherry-pick -x e882d67ed6

Workflow run logs.

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v25.1.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-26051-v25.1.x-961 remotes/upstream/v25.1.x
git cherry-pick -x e882d67ed6

Workflow run logs.

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v24.3.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-26051-v24.3.x-196 remotes/upstream/v24.3.x
git cherry-pick -x e882d67ed6

Workflow run logs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants