Add TXG timestamp database #16853

Open: oshogbo wants to merge 1 commit into master

oshogbo
Contributor

@oshogbo oshogbo commented Dec 11, 2024

Motivation and Context

This feature enables tracking of when TXGs are committed to disk, providing an estimated timestamp for each TXG.

With this information, it becomes possible to perform scrubs based on specific date ranges, improving the granularity of data management and recovery operations.

Description

To achieve this, we implemented a round-robin database that keeps track of time. Tracking is split into three resolutions: minutes, days, and years. We believe this split gives the best trade-off in time resolution. The database does not record the exact time of each transaction group (TXG); it only provides an estimate. It can also be used in other scenarios where dates need to be mapped to transaction groups.
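As a rough illustration of the idea (not the code in this PR), a multi-resolution round-robin database can be sketched as three fixed-size rings that each accept at most one sample per interval. All names here (`rrd_t`, `txg_db_t`, `txg_db_record`, `txg_db_lookup`), the slot count, and the interval values are hypothetical and chosen only for the example. The clock is passed in as a parameter, so the passage of time can be simulated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RRD_SLOTS 4 /* hypothetical, deliberately tiny ring size */

/* One sample: at wall-clock second `time`, `txg` had been synced. */
typedef struct {
    uint64_t time;
    uint64_t txg;
} rrd_entry_t;

/* Fixed-size ring keeping at most one sample per `interval` seconds. */
typedef struct {
    rrd_entry_t entries[RRD_SLOTS];
    uint64_t interval;
    size_t head;  /* next slot to overwrite */
    size_t count; /* number of valid slots */
} rrd_t;

/* Three resolutions, mirroring the description: minutes, days, years. */
typedef struct {
    rrd_t minutes;
    rrd_t days;
    rrd_t years;
} txg_db_t;

static void
rrd_init(rrd_t *r, uint64_t interval)
{
    assert(interval > 0);
    memset(r, 0, sizeof (*r));
    r->interval = interval;
}

static void
rrd_add(rrd_t *r, uint64_t now, uint64_t txg)
{
    if (r->count > 0) {
        size_t last = (r->head + RRD_SLOTS - 1) % RRD_SLOTS;
        /* Drop samples that arrive too soon or after a clock step back. */
        if (now < r->entries[last].time + r->interval)
            return;
    }
    r->entries[r->head].time = now;
    r->entries[r->head].txg = txg;
    r->head = (r->head + 1) % RRD_SLOTS; /* oldest slot gets overwritten */
    if (r->count < RRD_SLOTS)
        r->count++;
}

/* Newest recorded txg whose sample time is <= t, or 0 if there is none. */
static uint64_t
rrd_lookup(const rrd_t *r, uint64_t t)
{
    uint64_t best = 0;
    size_t oldest = (r->head + RRD_SLOTS - r->count) % RRD_SLOTS;

    for (size_t i = 0; i < r->count; i++) {
        const rrd_entry_t *e = &r->entries[(oldest + i) % RRD_SLOTS];
        if (e->time > t)
            break;
        best = e->txg;
    }
    return (best);
}

static void
txg_db_init(txg_db_t *db)
{
    rrd_init(&db->minutes, 60);
    rrd_init(&db->days, 86400);
    rrd_init(&db->years, 365ULL * 86400);
}

/* Offer the same sample to every ring; each applies its own interval. */
static void
txg_db_record(txg_db_t *db, uint64_t now, uint64_t txg)
{
    rrd_add(&db->minutes, now, txg);
    rrd_add(&db->days, now, txg);
    rrd_add(&db->years, now, txg);
}

/* Best estimate: the highest txg any ring places at or before time t. */
static uint64_t
txg_db_lookup(const txg_db_t *db, uint64_t t)
{
    uint64_t best = rrd_lookup(&db->minutes, t);
    uint64_t d = rrd_lookup(&db->days, t);
    uint64_t y = rrd_lookup(&db->years, t);

    if (d > best)
        best = d;
    if (y > best)
        best = y;
    return (best);
}
```

For example, recording one TXG per minute for ten minutes starting at second 1000 (TXGs 100 through 109) and then calling `txg_db_lookup(&db, 1500)` returns 108 from the minutes ring. `txg_db_lookup(&db, 1200)` returns 100 instead, because the four-slot minutes ring has already overwritten its older samples and only the coarser rings still cover that time. This is why the result is an estimate rather than an exact timestamp.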

How Has This Been Tested?

  • Create a pool
  • Write data
  • Wait some time
  • Write data
  • Wait some time
  • Try scrubbing different time ranges

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch 7 times, most recently from 2a20b11 to 364f813 Compare December 11, 2024 14:01
@amotin amotin added the Status: Code Review Needed Ready for review and testing label Dec 11, 2024
@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch from 364f813 to 891c8f2 Compare December 11, 2024 15:50
@amotin
Member

amotin commented Dec 12, 2024

It crashes on VERIFY(!dmu_objset_is_dirty(dp->dp_meta_objset, txg)).

@amotin
Member

amotin commented Dec 12, 2024

This reminds me that we recently added ddp_class_start to the new dedup table entry format so that the DDT can be pruned based on time. I wonder whether we could have saved some space if this mechanism had existed back then.

@amotin amotin added the Status: Revision Needed Changes are required for the PR to be accepted label Dec 12, 2024
@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch from 891c8f2 to ba5ee33 Compare January 31, 2025 10:31
@github-actions github-actions bot removed the Status: Revision Needed Changes are required for the PR to be accepted label Jan 31, 2025
@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch 8 times, most recently from 963a5a3 to 33a7c27 Compare January 31, 2025 14:08
@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch from 33a7c27 to 7797e3f Compare January 31, 2025 20:52
@tonyhutter
Contributor

Forgot to mention this earlier - can you add a test case to exercise zpool scrub -S|-E? Please include all weird edge cases, like invalid dates/ranges, setting timezones forward/backwards, and testing -S|-E against pools where the feature isn't enabled.

@oshogbo
Contributor Author

oshogbo commented Feb 3, 2025

> Forgot to mention this earlier - can you add a test case to exercise zpool scrub -S|-E? Please include all weird edge cases, like invalid dates/ranges, setting timezones forward/backwards, and testing -S|-E against pools where the feature isn't enabled.

Unfortunately, I don't know how to add such a test: to exercise this, we would need to wait for the rrd to be created. This will create a very long test. Do you have any suggestions?

@tonyhutter
Contributor

> This will create a very long test. Do you have any suggestions?

The test case could temporarily set the system clock forward to simulate the passage of time.

@amotin amotin added the Status: Revision Needed Changes are required for the PR to be accepted label Feb 5, 2025
@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch from 7797e3f to 307ce00 Compare March 31, 2025 18:34
@github-actions github-actions bot removed the Status: Revision Needed Changes are required for the PR to be accepted label Mar 31, 2025
@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch 4 times, most recently from eb670f4 to 9f84986 Compare March 31, 2025 19:22
@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch from 9f84986 to 5d2ae66 Compare April 11, 2025 10:20
Member

@amotin amotin left a comment

A few more comments. Also, it reliably fails the cli_root/zpool_resilver/zpool_resilver_restart and libzfs/libzfs_input tests.

@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch 2 times, most recently from 36b8920 to f987166 Compare April 14, 2025 16:13
@oshogbo
Contributor Author

oshogbo commented Apr 15, 2025

Can we re-run the tests? It seems they have timed out.
I don’t see any indication of an error, at least for now.

@amotin
Member

amotin commented Apr 15, 2025

@oshogbo Many of them actually crashed on the same assertion:

[ 6543.510793] VERIFY0(spa->spa_checkpoint_txg) failed (0 == 15)
  [ 6543.511146] PANIC at spa.c:5224:spa_ld_read_checkpoint_txg()
  [ 6543.511407] Showing stack for process 450801
  [ 6543.512011] CPU: 0 PID: 450801 Comm: zpool Kdump: loaded Tainted: P           OE     -------  ---  5.14.0-503.35.1.el9_5.x86_64 #1
  [ 6543.512296] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
  [ 6543.512736] Call Trace:
  [ 6543.513202]  <TASK>
  [ 6543.513683]  dump_stack_lvl+0x34/0x48
  [ 6543.515313]  spl_panic+0xd1/0xe9 [spl]
  [ 6543.516916]  ? allocate_cgrp_cset_links+0x89/0xa0
  [ 6543.520874]  ? spl_kmem_alloc_impl+0xb0/0xd0 [spl]
  [ 6543.521241]  ? spl_kmem_alloc_impl+0xb0/0xd0 [spl]
  [ 6543.521597]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.522182]  ? __kmalloc_node+0x4e/0x140
  [ 6543.522598]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.522856]  ? spl_kmem_alloc_impl+0xb0/0xd0 [spl]
  [ 6543.523118]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.523358]  ? __list_add+0x12/0x30 [spl]
  [ 6543.523650]  ? __dprintf+0x120/0x190 [zfs]
  [ 6543.539257]  spa_ld_read_checkpoint_txg+0x194/0x1d0 [zfs]
  [ 6543.539724]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.539858]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.539988]  ? spa_import_progress_set_notes_impl+0x103/0x200 [zfs]
  [ 6543.540410]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.540569]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.540709]  ? spa_import_progress_set_notes+0x5b/0x80 [zfs]
  [ 6543.541124]  spa_load_impl.constprop.0+0x10d/0x720 [zfs]
  [ 6543.541558]  spa_load+0x76/0x140 [zfs]
  [ 6543.542288]  spa_load_best+0x138/0x2c0 [zfs]
  [ 6543.542930]  spa_import+0x28a/0x780 [zfs]
  [ 6543.543526]  ? free_unref_page+0xf2/0x130
  [ 6543.543762]  zfs_ioc_pool_import+0x140/0x160 [zfs]
  [ 6543.544348]  zfsdev_ioctl_common+0x690/0x760 [zfs]
  [ 6543.544959]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.545230]  ? _copy_from_user+0x27/0x60
  [ 6543.545593]  zfsdev_ioctl+0x53/0xe0 [zfs]
  [ 6543.546157]  __x64_sys_ioctl+0x8a/0xc0
  [ 6543.546526]  do_syscall_64+0x5f/0xf0
  [ 6543.546826]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.547096]  ? exc_page_fault+0x62/0x150
  [ 6543.547411]  entry_SYSCALL_64_after_hwframe+0x78/0x80
  [ 6543.547966] RIP: 0033:0x7ff05670313b
  [ 6543.550135] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ad 4c 0f 00 f7 d8 64 89 01 48
  [ 6543.550491] RSP: 002b:00007ffe5bd417f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  [ 6543.550779] RAX: ffffffffffffffda RBX: 000056329756ce10 RCX: 00007ff05670313b
  [ 6543.551046] RDX: 00007ffe5bd42960 RSI: 0000000000005a02 RDI: 0000000000000003
  [ 6543.551290] RBP: 00007ffe5bd45f60 R08: 0000000000000003 R09: 0000000000000000
  [ 6543.551529] R10: 0000000010000000 R11: 0000000000000246 R12: 00007ffe5bd41960
  [ 6543.551779] R13: 000056329755c2e0 R14: 00007ffe5bd42960 R15: 00007ff050002da8
  [ 6543.552096]  </TASK>

@amotin amotin mentioned this pull request Apr 29, 2025
13 tasks
@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch 2 times, most recently from f28d6f4 to 5ced42c Compare May 12, 2025 11:09
@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch from 5ced42c to df71897 Compare May 22, 2025 07:50
@oshogbo
Contributor Author

oshogbo commented May 22, 2025

I think everything should be fixed now.

@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch from df71897 to 36e2707 Compare May 23, 2025 09:59
@oshogbo
Contributor Author

oshogbo commented May 23, 2025

I have addressed the feedback.

This feature enables tracking of when TXGs are committed to disk,
providing an estimated timestamp for each TXG.

With this information, it becomes possible to perform scrubs based
on specific date ranges, improving the granularity of
data management and recovery operations.

Signed-off-by: Mariusz Zaborski <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
@oshogbo oshogbo force-pushed the oshogbo/scrub_data_range branch from 36e2707 to 0f5ff9e Compare May 28, 2025 08:35
Labels
Status: Code Review Needed Ready for review and testing
3 participants