Skip to content

Block remap for cloned blocks on device removal #17180

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 26, 2025

Conversation

amotin
Copy link
Member

@amotin amotin commented Mar 26, 2025

When after device removal we handle block pointers remap, skip blocks that might be cloned. BRTs are indexed by vdev id and offset from block pointer's DVA[0]. So if we start addressing the same block by some different DVA, we won't get the proper reference counter. As result, we might either remap the block twice, that may result in assertion during indirect mapping condense, or free it prematurely, that may result in data overwrite, or free it twice, that may result in assertion in spacemap code.

Fixes #15604

How Has This Been Tested?

Written and cloned a file on a pool of several vdevs. Removed one of vdevs. Overwritten each 128th block of the file to trigger block pointers remap. Run zdb on the pool and observed it crashing due to incorrect block reference counting. Applied the patch and observed zdb passing clean. I wonder if it may also fix some space leaks periodically reported by zdb after ztest.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

When after device removal we handle block pointers remap, skip blocks
that might be cloned.  BRTs are indexed by vdev id and offset from
block pointer's DVA[0].  So if we start addressing the same block by
some different DVA, we won't get the proper reference counter.  As
result, we might either remap the block twice, that may result in
assertion during indirect mapping condense, or free it prematurely,
that may result in data overwrite, or free it twice, that may result
in assertion in spacemap code.

Signed-off-by:  Alexander Motin <[email protected]>
Sponsored by:   iXsystems, Inc.
Fixes openzfs#15604
@amotin amotin added the Status: Code Review Needed Ready for review and testing label Mar 26, 2025
@amotin amotin requested review from behlendorf and tonyhutter March 26, 2025 15:08
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Mar 26, 2025
@behlendorf behlendorf merged commit 4abc21b into openzfs:master Mar 26, 2025
21 of 23 checks passed
@amotin amotin deleted the no_brt_remap branch March 26, 2025 23:53
@mmatuska
Copy link
Contributor

@amotin is this eligible for both 2.3 and 2.2?

@amotin
Copy link
Member Author

amotin commented Mar 27, 2025

@mmatuska Yes. Everything that has block cloning already.

tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Apr 3, 2025
When after device removal we handle block pointers remap, skip blocks
that might be cloned.  BRTs are indexed by vdev id and offset from
block pointer's DVA[0].  So if we start addressing the same block by
some different DVA, we won't get the proper reference counter.  As
result, we might either remap the block twice, that may result in
assertion during indirect mapping condense, or free it prematurely,
that may result in data overwrite, or free it twice, that may result
in assertion in spacemap code.

Reviewed-by: Ameer Hamza <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:  Alexander Motin <[email protected]>
Sponsored by:   iXsystems, Inc.
Closes openzfs#15604
Closes openzfs#17180
fuporovvStack pushed a commit to fuporovvStack/zfs that referenced this pull request Apr 11, 2025
When after device removal we handle block pointers remap, skip blocks
that might be cloned.  BRTs are indexed by vdev id and offset from
block pointer's DVA[0].  So if we start addressing the same block by
some different DVA, we won't get the proper reference counter.  As
result, we might either remap the block twice, that may result in
assertion during indirect mapping condense, or free it prematurely,
that may result in data overwrite, or free it twice, that may result
in assertion in spacemap code.

Reviewed-by: Ameer Hamza <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:  Alexander Motin <[email protected]>
Sponsored by:   iXsystems, Inc.
Closes openzfs#15604
Closes openzfs#17180
fuporovvStack pushed a commit to fuporovvStack/zfs that referenced this pull request Apr 11, 2025
When after device removal we handle block pointers remap, skip blocks
that might be cloned.  BRTs are indexed by vdev id and offset from
block pointer's DVA[0].  So if we start addressing the same block by
some different DVA, we won't get the proper reference counter.  As
result, we might either remap the block twice, that may result in
assertion during indirect mapping condense, or free it prematurely,
that may result in data overwrite, or free it twice, that may result
in assertion in spacemap code.

Reviewed-by: Ameer Hamza <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:  Alexander Motin <[email protected]>
Sponsored by:   iXsystems, Inc.
Closes openzfs#15604
Closes openzfs#17180
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Apr 16, 2025
When after device removal we handle block pointers remap, skip blocks
that might be cloned.  BRTs are indexed by vdev id and offset from
block pointer's DVA[0].  So if we start addressing the same block by
some different DVA, we won't get the proper reference counter.  As
result, we might either remap the block twice, that may result in
assertion during indirect mapping condense, or free it prematurely,
that may result in data overwrite, or free it twice, that may result
in assertion in spacemap code.

Reviewed-by: Ameer Hamza <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:  Alexander Motin <[email protected]>
Sponsored by:   iXsystems, Inc.
Closes openzfs#15604
Closes openzfs#17180
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VERIFY3(counts[index] + inner_size <= size) failed (8192 <= 4096)
5 participants