Skip to content

Add FreeBSD posix_fadvise support. #13958

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

macdice
Copy link

@macdice macdice commented Sep 26, 2022

As commit 320f0c6 did for Linux, connect POSIX_FADV_WILLNEED and POSIX_FADV_SEQUENTIAL up to dmu_prefetch() on FreeBSD.

[Draft only, work in progress]

Signed-off-by: Thomas Munro [email protected]

Motivation and Context

See #13694 (née #9807) for motivation and context. The present PR is for feature parity on FreeBSD.

Description

Connects posix_fadvise() to dmu_prefetch().

Quite similar to Linux, but:

  • ESPIPE case is already handled by kern_posix_fadvise() function before we get here.
  • (offset, length) has already been converted to (start, end), and if length was 0 then end is OFF_MAX. Is it OK to call dmu_prefetch() with a very large size, or should we also try to clamp it to zp->z_size? It will be clamped by max prefetch size internally anyway.
  • This is only a draft because I haven't yet figured out what, if anything, needs to be done about mapped files (I don't understand why for Linux something was done before dmu_prefetch() and not after it completes, and whether that is done for correctness or performance).

How Has This Been Tested?

I have tested on top of the main branch of FreeBSD, which most recently merged OpenZFS master a few days ago (openzfs/zfs commit c629f0b). I've tested with simple programs, and PostgreSQL which generates a WILLNEED advice in various circumstances.

I wonder if we should implement POSIX_FADV_DONTNEED to evict ranges from ARC to facilitate testing, but that'd be another PR.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@ghost ghost added the Status: Code Review Needed Ready for review and testing label Sep 28, 2022
Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking about possible integration between dmu_prefetch() and dmu_zfetch(), but that is separate. Otherwise aside one comment I'd be happy to have this (I don't now many apps using it, but good prefetch is a "must have" for a wide pools).

@behlendorf
Copy link
Contributor

@macdice should this PR be promoted from a draft? Is it ready for review?

macdice added a commit to macdice/zfs that referenced this pull request Oct 28, 2022
As commit 320f0c6 did for Linux, connect POSIX_FADV_WILLNEED
up to dmu_prefetch() on FreeBSD.

Reviewed-by: Mateusz Guzick <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Thomas Munro <[email protected]>
Closes openzfs#13958
@macdice macdice marked this pull request as ready for review October 28, 2022 13:07
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As part of this change it would be best to enable the functional/fadvise/fadvise_sequential.ksh test case on FreeBSD. It looks like it will just need to be updated to use POSIX_FADV_WILLNEED instead of POSIX_FADV_SEQUENTIAL, and to check the arcstats in the correct way for FreeBSD. Aside from that, I'd expect it to work.

macdice added a commit to macdice/zfs that referenced this pull request Oct 30, 2022
As commit 320f0c6 did for Linux, connect POSIX_FADV_WILLNEED
up to dmu_prefetch() on FreeBSD.

Reviewed-by: Mateusz Guzick <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Thomas Munro <[email protected]>
Closes openzfs#13958
macdice added a commit to macdice/zfs that referenced this pull request Oct 30, 2022
As commit 320f0c6 did for Linux, connect POSIX_FADV_WILLNEED
up to dmu_prefetch() on FreeBSD.

Reviewed-by: Mateusz Guzick <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Thomas Munro <[email protected]>
Closes openzfs#13958
macdice added a commit to macdice/zfs that referenced this pull request Oct 30, 2022
As commit 320f0c6 did for Linux, connect POSIX_FADV_WILLNEED
up to dmu_prefetch() on FreeBSD.

Reviewed-by: Mateusz Guzick <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Thomas Munro <[email protected]>
Closes openzfs#13958
macdice added a commit to macdice/zfs that referenced this pull request Oct 30, 2022
As commit 320f0c6 did for Linux, connect POSIX_FADV_WILLNEED
up to dmu_prefetch() on FreeBSD.

Reviewed-by: Mateusz Guzick <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Thomas Munro <[email protected]>
Closes openzfs#13958
@macdice
Copy link
Author

macdice commented Oct 30, 2022

Tests fixed as requested. I had another go at computing the length; I didn't like the way dmu_prefetch loops past the end limited only by the dmu_prefetch_max setting, so I figured we should really check zp->z_size here, and also it was off by one. But now I'm wondering how to read that safely.

@macdice macdice force-pushed the fadvise-freebsd branch 3 times, most recently from 81b4255 to 03f018e Compare November 13, 2022 11:42
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple nits then looks good to me.

@behlendorf
Copy link
Contributor

@macdice if you can wrap up the few remaining commends and rebase this we can get it merged.

@behlendorf
Copy link
Contributor

@macdice thanks for updating this. Can you rebase this against the latest version of the master branch and then force update the PR. This should give us a better run of the test suite to make sure everything is in order.


/* kern_posix_fadvise points to the last byte, we want one past */
if (end != OFF_MAX)
end += 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra tab removed

1.  Use the existing get_arcstats() function from libtest.shlib to read
    data_size, instead of directly accessing /proc/spl/kstat/zfs, which
    FreeBSD doesn't have.

2.  Make the regex in libtest.shlib a little stricter, because otherwise
    data_size also matches metadata_size and the test breaks, which is
    probably why it was done another way first...

3.  Instead of relying on the numerical values of POSIX_FADV_XXX macros,
    accept macro names as arguments to the file_fadvise program.  (The
    numbers happen to match on Linux and FreeBSD, but future systems may
    vary and it seems a little strange/raw to count on that.)

4.  For implementation reasons, SEQUENTIAL doesn't reach ZFS via FreeBSD
    VFS currently (perhaps something that should be investigated in
    FreeBSD).  Since on Linux we're treating SEQUENTIAL and WILLNEED the
    same, it doesn't really matter which one we use, so switch the test
    over to WILLNEED exercise the new prefetch code on both OSes the
    same way.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Thomas Munro <[email protected]>
As commit 320f0c6 did for Linux, connect POSIX_FADV_WILLNEED
up to dmu_prefetch() on FreeBSD.

Reviewed-by: Mateusz Guzik <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Thomas Munro <[email protected]>
@macdice
Copy link
Author

macdice commented Jan 28, 2023

Hmm. zloop failed with "ASSERT at cmd/ztest.c:5933:ztest_dmu_snapshot_hold()/lib/x86_64-linux-gnu/libasan.so.6(+0x45c0e)[0x7efebc67fc0e]", but when I try it locally I get various other random failures from an asan/ubsan build (on FreeBSD it fails fast, on Linux it fails after a few successful loops), and that's the case with or without my changes. Then there are some functional test failures, for example "functional/casenorm/mixed_formd_lookup (run as root) [00:00] [FAIL]" and those fails for me too, even with my change reverted. So perhaps I rebased onto a bad commit?

@behlendorf
Copy link
Contributor

Unfortunately zloop isn't entirely bulletproof and the CI only runs it on Linux so I'm not too surprised you saw failures there. There are also a couple known unreliable ZTS test cases which are listed in https://review.whamcloud.com/c/fs/lustre-release/+/49660 as exceptions, including several of the casenorm tests. So that's not a concern either. What's more problematic is that all of the builders appear to have gotten hung up running the l2arc test group. Can you take a look in to what's going on here.

Test (Linux): /usr/share/zfs/zfs-tests/tests/functional/io/setup (run as root) [00:00] [PASS]
Test (Linux): /usr/share/zfs/zfs-tests/tests/functional/io/libaio (run as root) [00:25] [PASS]
Test (Linux): /usr/share/zfs/zfs-tests/tests/functional/io/io_uring (run as root) [00:00] [SKIP]
Test (Linux): /usr/share/zfs/zfs-tests/tests/functional/io/cleanup (run as root) [00:00] [PASS]
Test: /usr/share/zfs/zfs-tests/tests/functional/l2arc/setup (run as root) [00:07] [PASS]
... <hang> ...

@mjguzik
Copy link
Contributor

mjguzik commented Apr 30, 2023

what's the status here? 👀

@amotin amotin mentioned this pull request May 26, 2025
13 tasks
@amotin amotin closed this May 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Code Review Needed Ready for review and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants