Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: allow eBPF crates to compile on host architecture #1229

Closed

Conversation

thomaseizinger
Copy link
Contributor

@thomaseizinger thomaseizinger commented Mar 24, 2025

Related: aya-rs/book#214
Resolves: #1230


This change is Reviewable

Copy link

netlify bot commented Mar 24, 2025

Deploy Preview for aya-rs-docs ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit 53c59ae
🔍 Latest deploy log https://app.netlify.com/sites/aya-rs-docs/deploys/67eb2d7c7d17f00008720f60
😎 Deploy Preview https://deploy-preview-1229--aya-rs-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@mergify mergify bot added the test A PR that improves test cases or CI label Mar 24, 2025
@thomaseizinger thomaseizinger changed the title refactor(eBPF): use target_arch for panic-handler cfg refactor: allow eBPF crates to compile on host architecture Mar 25, 2025
@thomaseizinger
Copy link
Contributor Author

@tamird I've updated this to cfg the eBPF crates on the target architecture. We can't directly get rid of default-members though because of the other integration tests. I haven't looked at the xtask code to see if it can be done.

Do you want me to look into that as well as part of this?

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tamird I've updated this to cfg the eBPF crates on the target architecture. We can't directly get rid of default-members though because of the other integration tests. I haven't looked at the xtask code to see if it can be done.

Do you want me to look into that as well as part of this?

Yeah, it would be great to get all the way to the finish line.

Also I think this code can be simplified now?

aya/clippy.sh

Lines 5 to 20 in 1db534d

# We cannot run clippy over the whole workspace at once due to feature unification. Since both
# integration-test and integration-ebpf depend on integration-common and integration-test activates
# integration-common's aya dependency, we end up trying to compile the panic handler twice: once
# from the bpf program, and again from std via aya.
#
# `-C panic=abort` because "unwinding panics are not supported without std"; integration-ebpf
# contains `#[no_std]` binaries.
#
# `-Zpanic_abort_tests` because "building tests with panic=abort is not supported without
# `-Zpanic_abort_tests`"; Cargo does this automatically when panic=abort is set via profile but we
# want to preserve unwinding at runtime - here we are just running clippy so we don't care about
# unwinding behavior.
#
# `+nightly` because "the option `Z` is only accepted on the nightly compiler".
cargo +nightly hack clippy "$@" --exclude integration-ebpf --all-targets --feature-powerset --workspace -- --deny warnings
cargo +nightly hack clippy "$@" --package integration-ebpf --all-targets --feature-powerset -- --deny warnings -C panic=abort -Zpanic_abort_tests

@mergify mergify bot added the aya-bpf This is about aya-bpf (kernel) label Mar 26, 2025
xtask/src/run.rs Outdated
profile,
])
cmd.env(AYA_BUILD_INTEGRATION_BPF, "true")
.env("RUSTFLAGS", "--cfg aya_integration_test")
Copy link
Contributor Author

@thomaseizinger thomaseizinger Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tamird Setting a custom cfg here allows us to exclude the integration tests from compiling unless we compile them via xtask. It leads to some duplication across the test crates if we want to compile them without dead-code warnings etc.

I think that could be resolved if we were to write a custom test macro (and one for test_case and one for tokio::test). I'd like to defer that to later if possible. My main interest is in removing the need for default-members for users of aya which don't necessarily have such an xtask-based setup. For them, the introduction of aya_ebpf::main_stub and moving the panic_handler is the important bit.

I'll leave it up to you whether you want to include this in here or only merge the improvements around the main-stub and panic-handler :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer excluding the tests via default_members. Disabling them via conditional compilation is too verbose - at least as implemented in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree it is pretty verbose and I don't like it either. @tamird suggested we might as well try to go the full way and try to get rid of default-members entirely in here, even though it is more difficult than for a regular aya-based eBPF workspace.

I am happy to drop these again if @tamird is fine with keeping default-members within the main repo just to keep the integration tests disabled..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with any improvements that are clearly stated. If we're not able to remove default-members, what are we accomplishing here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we agree on a way how we can make the eBPF programs compile for host architectures (dealing with the main stub and all) and we think that is worthwhile updating the template and book as well, then what this PR is about is applying those changes here which removes drift between what the book and template recommend how to define eBPF programs and how the main repo does it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(put it in the commit message where applicable)

@thomaseizinger thomaseizinger requested a review from tamird March 26, 2025 03:25
() => {
#[cfg(not(target_arch = "bpf"))]
fn main() {
panic!(r#"eBPF kernels are not designed to be executed in user-space. This main function is only a placeholder to allow the code to compile on the host system (i.e. on any system that is not `target_arch = "bpf"`). This works in tandem with the `no_main` attribute which is only applied when compiling for `target_arch = "bpf"`."#)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1229 (comment)

This is not a meaningful message like I asked. This is basically just saying "we need this because no_main only on bpf" but doesn't explain why we can't keep no_main in place when targeting not-bpf.

Copy link
Contributor Author

@thomaseizinger thomaseizinger Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please write out message that you'd like to see? 🙏

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to, but I actually don't know the reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The crate doesn't compile without a main function (for the host architecture). I guess the reasons is "binaries need an entrypoint"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're still not answering the question, or am I missing something? Why can't no_main be used on the host?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't how else to say it, the code doesn't compile without a main function:

error[E0601]: `main` function not found in crate `tcx`
  --> test/integration-ebpf/src/tcx.rs:10:2
   |
10 | }
   |  ^ consider adding a `main` function to `test/integration-ebpf/src/tcx.rs`

If I add the no_main attribute back unconditionally, then it fails at link time:

error: linking with `/nix/store/xy542s1z1x4257pwdis32lxggff0ilkk-clang-wrapper-18.1.8/bin/clang` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:/home/thomas/.cargo/bin:/run/wrappers/bin:/home/thomas/.nix-profile/bin:/nix/profile/bin:/home/thomas/.local/state/nix/profile/bin:/etc/profiles/per-user/thomas/bin:/nix/var/nix/profiles/default/bin:/run/current-system/sw/bin:/home/thomas/.zsh/plugins/powerlevel10k:/home/thomas/.zsh/plugins/zsh-nix-shell" VSLANG="1033" "/nix/store/xy542s1z1x4257pwdis32lxggff0ilkk-clang-wrapper-18.1.8/bin/clang" "-m64" "/tmp/rustcfWbebz/symbols.o" "<17 object files omitted>" "-Wl,--as-needed" "-Wl,-Bstatic" "/home/thomas/src/github.com/aya-rs/aya/target/debug/deps/{libintegration_common-7199cfc51e571138.rlib,libaya-29671cf570d77640.rlib,libonce_cell-d9c100d41321019c.rlib,libbitflags-859ae39afcf27678.rlib,libassert_matches-e66f24f8aa7afaf4.rlib,libtokio-5bb67de0f8e545f8.rlib,libsocket2-09fa8cf7b5bce950.rlib,libmio-1925bbb0f0827c81.rlib,libpin_project_lite-d0952d87a1353a36.rlib,liblibc-76a839478b7a8354.rlib,libaya_obj-5ff296ad2a1a8244.rlib,libthiserror-87232d1566e28900.rlib,libobject-89715a7346573fe7.rlib,libindexmap-9e0d9bcc9af9f37c.rlib,libhashbrown-060ab6f8299b021b.rlib,libfoldhash-e74b4bd1cf6c7986.rlib,libequivalent-4b5dae619175b0d3.rlib,liballocator_api2-0d18d2560b2240f9.rlib,libcrc32fast-3927d6bc625d2278.rlib,libcfg_if-7320f7c2ff486fd2.rlib,libmemchr-c683b493e630f757.rlib,liblog-0c18bbe7f7b3152f.rlib,libbytes-54e1836e6e796e5f.rlib,libserde-08bf56e3be229b6b.rlib,libaya_ebpf-d1e19b3d7ce546b7.rlib,libaya_ebpf_bindings-a4277a8e5542a5f7.rlib,libaya_ebpf_cty-92a90813681baf20.rlib}" "/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/{libstd-6273572f18644c87.rlib,libpanic_unwind-267e668abf74a283.rlib,libobject-ec6154ccae37a33e.rlib,libmemchr-500edd5521c440d4.rlib,libaddr2line-86d8d9428792e8ef.rlib,libgimli-10f06487503767c2.rlib,librustc_demangle-6a38424de1e5bca5.rlib,libstd_detect-de9763ea1c19dca3.rlib,libhashbrown-a7f5bb2f736d3c49.rlib,librustc_std_workspace_alloc-7e368919bdc4a44c.rlib,libminiz_oxide-376454d49910c786.rlib,libadler-fa99f5692b5dce85.rlib,libunwind-91cafdaf16f7fe40.rlib,libcfg_if-f7ee3f1ea78d9dae.rlib,liblibc-d3a35665f881365a.rlib,liballoc-715bc629a88bca60.rlib,librustc_std_workspace_core-ae70165d1278cff7.rlib,libcore-406129d0e3fbc101.rlib,libcompiler_builtins-1af05515ab19524a.rlib}" "-Wl,-Bdynamic" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/home/thomas/src/github.com/aya-rs/aya/target/debug/deps/bpf_probe_read-449d9b05f6edf685" "-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-nodefaultlibs" "-fuse-ld=/nix/store/gbhcq80nkval1l1ssjs0c8sgrw83b7gc-mold-wrapper-2.34.1/bin/mold"
  = note: some arguments are omitted. use `--verbose` to show all linker arguments
  = note: mold: error: undefined symbol: main
          >>> referenced by /nix/store/maxa3xhmxggrc5v2vc0c3pjb79hjlkp9-glibc-2.40-66/lib/Scrt1.o:(.text)
          >>>               /nix/store/maxa3xhmxggrc5v2vc0c3pjb79hjlkp9-glibc-2.40-66/lib/Scrt1.o:(_start)
          clang: error: linker command failed with exit code 1 (use -v to see invocation)


error: could not compile `integration-ebpf` (bin "bpf_probe_read") due to 1 previous error

Based on the above, it appears to me that "Binaries don't compile without a main function and therefore we need a stub when compiling for the host architecture" is a pretty solid explanation but I am open to other suggestions if you have a specific wording in mind.


/// Defines our panic handler when compiling for eBPF.
///
/// eBPF programs are not allowed to panic, meaning this handler won't actually ever be called.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect. It's not that BPF programs aren't allowed to panic, it is that there's no concept of panic in BPF.

We use a panic handler implemented as an infinite loop because the verifier disallows non-terminating loops. That guarantees that if any panics survive in the generated code, the verifier will reject the resulting program (because it would be an infinite loop).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've mentioned this in passing in other convos, but an unwinding panic handler could be implemented with special glue in the userspace loader code, where the loader would hot patch at program load time the panic code to unwind the stack and return some panic-default value (and perhaps print an error with aya-log).

Low on my priority list, but at least in theory we have the technology (tm)

Cargo.toml Outdated
"aya-tool",
"test-distro",
"test/integration-common",
# test/integration-test is omitted; including it in this list causes `cargo test` to run its
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if you run cargo test after this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It runs all tests apart from the integrartion tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please write a comment here about the omitted crates and why.

Copy link
Member

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few scattered comments, but overall I'm a bit confused about what this PR is trying to solve.
I read the associated issues and it wasn't clear what root issue was nor what the minimal fix required could be.

The setup we have is far from ideal, but it is this way due to some missing cargo features (artifact-deps, per-package-target to name a few). As far as I know, aya-template works fine - clippy, rust-analyzer etc... and aya itself has a couple of nagging warnings about multiple panic_handlers from the integration-tests, but other than that it's fine.

The UX improvements for aya-ebpf don't seem like much of an improvement to me and likely warrant further discussion, especially given they are breaking changes. I'd prefer it if we could keep that separate from solving the root issue.

Reviewed 17 of 17 files at r1, 18 of 18 files at r2, 1 of 1 files at r3, 18 of 18 files at r4, 1 of 1 files at r5, 1 of 1 files at r6, 19 of 19 files at r7, 1 of 1 files at r8.
Reviewable status: 20 of 40 files reviewed, 15 unresolved discussions (waiting on @tamird and @thomaseizinger)


ebpf/aya-ebpf/src/lib.rs line 137 at r9 (raw file):

#[macro_export]
macro_rules! main_stub {

This doesn't seem very rusty.

I wonder if it might better be written as a proc macro (we already use this heavily in aya-ebpf).

#[derive(MainFunction)
#[xdp]
fn do_something ...

Which would expand out to:

#[cfg(not(target_arch = "bpf"))]
extern crate libc; // needs to link against libc - sigh

#[cfg(not(target_arch = "bpf"))]
#[no_mangle]
pub extern "C" fn main(_argc: isize, _argv: *const *const u8) -> isize {
    unreachable!()
}

...

ebpf/aya-ebpf/src/lib.rs line 153 at r11 (raw file):

/// Including the panic handler within `aya-ebpf` means every eBPF will automatically have this panic handler defined.
#[cfg(target_arch = "bpf")]
#[panic_handler]

I'm 👎 for defining a panic handler in aya-ebpf.

As implemented, there's no way for a user to opt-out and/or bring an alternative implementation. Also I believe this would be a breaking change for anyone who has a panic_handler already defined and upgrades aya-ebpf.

You could potentially gate this behind a feature to make it opt-in.


test/integration-test/src/tests/bpf_probe_read.rs line 4 at r7 (raw file):

use integration_common::bpf_probe_read::{RESULT_BUF_LEN, TestResult};

#[cfg_attr(aya_integration_test, test_log::test)]

Is this better?
Before, I could add a test by using #[test].
Now I have 2 verbose #[cfg_attr] to add to every test.


ebpf/aya-ebpf/src/lib.rs line 137 at r4 (raw file):

#[macro_export]
macro_rules! prelude {

I gave up trying to review this commit-by-commit at this point.
This prelude macro appears to be removed later.
If this commit isn't required, then please drop it.


ebpf/aya-ebpf/src/lib.rs line 139 at r6 (raw file):

macro_rules! prelude {
    () => {
        /// Defines our panic handler when compiling for eBPF.

s/our/the//g


ebpf/aya-ebpf/src/lib.rs line 141 at r6 (raw file):

        /// Defines our panic handler when compiling for eBPF.
        ///
        /// eBPF programs are not allowed to panic, meaning this handler won't actually ever be called.

This is incorrect. The panic handler will be called if you do something that could panic.
The program will fail to pass the verifier because an infinite loop won't verify.


test/integration-test/src/tests/btf_relocations.rs line 3 at r7 (raw file):

use aya::{Btf, EbpfLoader, Endianness, maps::Array, programs::UProbe, util::KernelVersion};

#[cfg_attr(aya_integration_test, test_case::test_case("enum_signed_32", false, Some((KernelVersion::new(6, 0, 0), "https://github.com/torvalds/linux/commit/6089fb3")), -0x7AAAAAAAi32 as u64))]

Same comment here re: usability.


test/integration-ebpf/src/bpf_probe_read.rs line 1 at r11 (raw file):

#![cfg_attr(target_arch = "bpf", no_std)]

This is incorrect. This crate should be no_std whether of not you're compiling for the host target or bpf target.

xtask/src/run.rs Outdated
profile,
])
cmd.env(AYA_BUILD_INTEGRATION_BPF, "true")
.env("RUSTFLAGS", "--cfg aya_integration_test")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer excluding the tests via default_members. Disabling them via conditional compilation is too verbose - at least as implemented in this PR.

@thomaseizinger
Copy link
Contributor Author

I've left a few scattered comments, but overall I'm a bit confused about what this PR is trying to solve.

Thank you for the review @dave-tucker ! Sorry for not providing more context.

I've been working with aya in another project with a monorepo and having to define and maintain default-members was annoying (we have 20+) crates. I played around with the setup and came up with the idea of conditionally definining no-std and no-main. I figured that is an improvement worth contributing upstream so that other users of aya don't have to go through the same thing.

Instead of just updating the template and book, I figured updating the main repo to use this would also be nice so that there is no drift. But the addition of xtask and how the integration tests are laid out added more requirements as to what we need to do to get rid of default-members here.

@thomaseizinger thomaseizinger force-pushed the refactor/use-correct-cfg branch from 2e8d8a9 to 0efde5c Compare March 27, 2025 00:03
@thomaseizinger
Copy link
Contributor Author

So I've experimented with a few more combinations of cfgs and attributes:

  • If we have no_std unconditional then compiling for the host architecture doesn't work because we get conflicting implementations of the panic_handler:
error[E0152]: found duplicate lang item `panic_impl`
 --> test/integration-ebpf/src/bpf_probe_read.rs:4:1
  |
4 | aya_ebpf::panic_handler!();
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: the lang item is first defined in crate `std` (which `aya` depends on)
  = note: first definition in `std` loaded from /home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-6273572f18644c87.so, /home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-6273572f18644c87.rlib
  = note: second definition in the local crate (`bpf_probe_read`)
  = note: this error originates in the macro `aya_ebpf::panic_handler` (in Nightly builds, run with -Z macro-backtrace for more info)
  • If we have no_main unconditional then compiling for the host architecture doesn't work because the linker expects a main symbol in the final binary:
error: linking with `/nix/store/xy542s1z1x4257pwdis32lxggff0ilkk-clang-wrapper-18.1.8/bin/clang` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:/home/thomas/.cargo/bin:/run/wrappers/bin:/home/thomas/.nix-profile/bin:/nix/profile/bin:/home/thomas/.local/state/nix/profile/bin:/etc/profiles/per-user/thomas/bin:/nix/var/nix/profiles/default/bin:/run/current-system/sw/bin:/home/thomas/.zsh/plugins/powerlevel10k:/home/thomas/.zsh/plugins/zsh-nix-shell" VSLANG="1033" "/nix/store/xy542s1z1x4257pwdis32lxggff0ilkk-clang-wrapper-18.1.8/bin/clang" "-m64" "/tmp/rustcDrzSob/symbols.o" "<17 object files omitted>" "-Wl,--as-needed" "-Wl,-Bstatic" "/home/thomas/src/github.com/aya-rs/aya/target/debug/deps/{libintegration_common-7199cfc51e571138.rlib,libaya-29671cf570d77640.rlib,libonce_cell-d9c100d41321019c.rlib,libbitflags-859ae39afcf27678.rlib,libassert_matches-e66f24f8aa7afaf4.rlib,libtokio-5bb67de0f8e545f8.rlib,libsocket2-09fa8cf7b5bce950.rlib,libmio-1925bbb0f0827c81.rlib,libpin_project_lite-d0952d87a1353a36.rlib,liblibc-76a839478b7a8354.rlib,libaya_obj-5ff296ad2a1a8244.rlib,libthiserror-87232d1566e28900.rlib,libobject-89715a7346573fe7.rlib,libindexmap-9e0d9bcc9af9f37c.rlib,libhashbrown-060ab6f8299b021b.rlib,libfoldhash-e74b4bd1cf6c7986.rlib,libequivalent-4b5dae619175b0d3.rlib,liballocator_api2-0d18d2560b2240f9.rlib,libcrc32fast-3927d6bc625d2278.rlib,libcfg_if-7320f7c2ff486fd2.rlib,libmemchr-c683b493e630f757.rlib,liblog-0c18bbe7f7b3152f.rlib,libbytes-54e1836e6e796e5f.rlib,libserde-08bf56e3be229b6b.rlib,libaya_ebpf-d1e19b3d7ce546b7.rlib,libaya_ebpf_bindings-a4277a8e5542a5f7.rlib,libaya_ebpf_cty-92a90813681baf20.rlib}" "/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/{libstd-6273572f18644c87.rlib,libpanic_unwind-267e668abf74a283.rlib,libobject-ec6154ccae37a33e.rlib,libmemchr-500edd5521c440d4.rlib,libaddr2line-86d8d9428792e8ef.rlib,libgimli-10f06487503767c2.rlib,librustc_demangle-6a38424de1e5bca5.rlib,libstd_detect-de9763ea1c19dca3.rlib,libhashbrown-a7f5bb2f736d3c49.rlib,librustc_std_workspace_alloc-7e368919bdc4a44c.rlib,libminiz_oxide-376454d49910c786.rlib,libadler-fa99f5692b5dce85.rlib,libunwind-91cafdaf16f7fe40.rlib,libcfg_if-f7ee3f1ea78d9dae.rlib,liblibc-d3a35665f881365a.rlib,liballoc-715bc629a88bca60.rlib,librustc_std_workspace_core-ae70165d1278cff7.rlib,libcore-406129d0e3fbc101.rlib,libcompiler_builtins-1af05515ab19524a.rlib}" "-Wl,-Bdynamic" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/home/thomas/src/github.com/aya-rs/aya/target/debug/deps/bpf_probe_read-449d9b05f6edf685" "-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-nodefaultlibs" "-fuse-ld=/nix/store/gbhcq80nkval1l1ssjs0c8sgrw83b7gc-mold-wrapper-2.34.1/bin/mold"
  = note: some arguments are omitted. use `--verbose` to show all linker arguments
  = note: mold: error: undefined symbol: main
          >>> referenced by /nix/store/maxa3xhmxggrc5v2vc0c3pjb79hjlkp9-glibc-2.40-66/lib/Scrt1.o:(.text)
          >>>               /nix/store/maxa3xhmxggrc5v2vc0c3pjb79hjlkp9-glibc-2.40-66/lib/Scrt1.o:(_start)
          clang: error: linker command failed with exit code 1 (use -v to see invocation)

I conclude from that that we have to make no_std and no_main conditional on the target architecture if we want those crates to compile on x86 etc.

@thomaseizinger thomaseizinger force-pushed the refactor/use-correct-cfg branch from d246b9f to 940d22a Compare March 27, 2025 00:12
@thomaseizinger
Copy link
Contributor Author

@tamird @dave-tucker I removed the conditional compilation of the integration tests and tried to group the remaining changes into meaningful commits. Let me know what you think.

@dave-tucker
Copy link
Member

FYI @thomaseizinger I've just opened this #1234 which is somewhat related. This conversation here + the clippy warnings when working in this repo forced my hand.

So I've experimented with a few more combinations of cfgs and attributes:

  • If we have no_std unconditional then compiling for the host architecture doesn't work because we get conflicting implementations of the panic_handler:
error[E0152]: found duplicate lang item `panic_impl`
 --> test/integration-ebpf/src/bpf_probe_read.rs:4:1
  |
4 | aya_ebpf::panic_handler!();
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: the lang item is first defined in crate `std` (which `aya` depends on)
  = note: first definition in `std` loaded from /home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-6273572f18644c87.so, /home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-6273572f18644c87.rlib
  = note: second definition in the local crate (`bpf_probe_read`)
  = note: this error originates in the macro `aya_ebpf::panic_handler` (in Nightly builds, run with -Z macro-backtrace for more info)

You may need to try this again with my PR linked above. Full details in the commit message.

  • If we have no_main unconditional then compiling for the host architecture doesn't work because the linker expects a main symbol in the final binary:
error: linking with `/nix/store/xy542s1z1x4257pwdis32lxggff0ilkk-clang-wrapper-18.1.8/bin/clang` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:/home/thomas/.cargo/bin:/run/wrappers/bin:/home/thomas/.nix-profile/bin:/nix/profile/bin:/home/thomas/.local/state/nix/profile/bin:/etc/profiles/per-user/thomas/bin:/nix/var/nix/profiles/default/bin:/run/current-system/sw/bin:/home/thomas/.zsh/plugins/powerlevel10k:/home/thomas/.zsh/plugins/zsh-nix-shell" VSLANG="1033" "/nix/store/xy542s1z1x4257pwdis32lxggff0ilkk-clang-wrapper-18.1.8/bin/clang" "-m64" "/tmp/rustcDrzSob/symbols.o" "<17 object files omitted>" "-Wl,--as-needed" "-Wl,-Bstatic" "/home/thomas/src/github.com/aya-rs/aya/target/debug/deps/{libintegration_common-7199cfc51e571138.rlib,libaya-29671cf570d77640.rlib,libonce_cell-d9c100d41321019c.rlib,libbitflags-859ae39afcf27678.rlib,libassert_matches-e66f24f8aa7afaf4.rlib,libtokio-5bb67de0f8e545f8.rlib,libsocket2-09fa8cf7b5bce950.rlib,libmio-1925bbb0f0827c81.rlib,libpin_project_lite-d0952d87a1353a36.rlib,liblibc-76a839478b7a8354.rlib,libaya_obj-5ff296ad2a1a8244.rlib,libthiserror-87232d1566e28900.rlib,libobject-89715a7346573fe7.rlib,libindexmap-9e0d9bcc9af9f37c.rlib,libhashbrown-060ab6f8299b021b.rlib,libfoldhash-e74b4bd1cf6c7986.rlib,libequivalent-4b5dae619175b0d3.rlib,liballocator_api2-0d18d2560b2240f9.rlib,libcrc32fast-3927d6bc625d2278.rlib,libcfg_if-7320f7c2ff486fd2.rlib,libmemchr-c683b493e630f757.rlib,liblog-0c18bbe7f7b3152f.rlib,libbytes-54e1836e6e796e5f.rlib,libserde-08bf56e3be229b6b.rlib,libaya_ebpf-d1e19b3d7ce546b7.rlib,libaya_ebpf_bindings-a4277a8e5542a5f7.rlib,libaya_ebpf_cty-92a90813681baf20.rlib}" "/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/{libstd-6273572f18644c87.rlib,libpanic_unwind-267e668abf74a283.rlib,libobject-ec6154ccae37a33e.rlib,libmemchr-500edd5521c440d4.rlib,libaddr2line-86d8d9428792e8ef.rlib,libgimli-10f06487503767c2.rlib,librustc_demangle-6a38424de1e5bca5.rlib,libstd_detect-de9763ea1c19dca3.rlib,libhashbrown-a7f5bb2f736d3c49.rlib,librustc_std_workspace_alloc-7e368919bdc4a44c.rlib,libminiz_oxide-376454d49910c786.rlib,libadler-fa99f5692b5dce85.rlib,libunwind-91cafdaf16f7fe40.rlib,libcfg_if-f7ee3f1ea78d9dae.rlib,liblibc-d3a35665f881365a.rlib,liballoc-715bc629a88bca60.rlib,librustc_std_workspace_core-ae70165d1278cff7.rlib,libcore-406129d0e3fbc101.rlib,libcompiler_builtins-1af05515ab19524a.rlib}" "-Wl,-Bdynamic" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/home/thomas/src/github.com/aya-rs/aya/target/debug/deps/bpf_probe_read-449d9b05f6edf685" "-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-nodefaultlibs" "-fuse-ld=/nix/store/gbhcq80nkval1l1ssjs0c8sgrw83b7gc-mold-wrapper-2.34.1/bin/mold"
  = note: some arguments are omitted. use `--verbose` to show all linker arguments
  = note: mold: error: undefined symbol: main
          >>> referenced by /nix/store/maxa3xhmxggrc5v2vc0c3pjb79hjlkp9-glibc-2.40-66/lib/Scrt1.o:(.text)
          >>>               /nix/store/maxa3xhmxggrc5v2vc0c3pjb79hjlkp9-glibc-2.40-66/lib/Scrt1.o:(_start)
          clang: error: linker command failed with exit code 1 (use -v to see invocation)

I conclude from that that we have to make no_std and no_main conditional on the target architecture if we want those crates to compile on x86 etc.

Your conclusion is somewhat correct. The linker expects a main symbol when building for a *-gnu target (glibc).
Given these crates target the bpfel-unknown-none and bpfeb-unknown-none targets only them being #[no_std] and #[no_main] is absolutely the correct call. We don't ever intend for them to work on *-gnu targets.

If you need them to compile - to some definition of the word since they would never run on that target - then the onus is on you to work within these constraints (see my comments above on for adding a main symbol to a no_std, no_main binary #1229 (review)).

Coming back to default_members, I understand your frustration in working with a big repo with lots of crates.
However, that's something that would be better fixed at the Cargo level IMHO instead of us carrying hacks.
Have you discussed this with the cargo maintainers? The same problem exists not just for eBPF, but for WASM crates too.

@thomaseizinger
Copy link
Contributor Author

thomaseizinger commented Mar 27, 2025

If you need them to compile - to some definition of the word since they would never run on that target - then the onus is on you to work within these constraints (see my comments above on for adding a main symbol to a no_std, no_main binary #1229 (review)).

Coming back to default_members, I understand your frustration in working with a big repo with lots of crates. However, that's something that would be better fixed at the Cargo level IMHO instead of us carrying hacks. Have you discussed this with the cargo maintainers? The same problem exists not just for eBPF, but for WASM crates too.

I agree fully, it is a trade-off. The question is, do you want to leverage hacks to make the tools of today work well with the workspace or do you want to stick what is strictly correct in terms of how the crates are defined?

My conclusion was: I'd rather use a few hacks because it is more important to me that other people in my team can come to the repo and use cargo the way they are used to and don't have to learn about eBPF and cargo's limitiations now. I also consider the use of default-members a hack. However, it is a hack that everyone adding new crates needs to learn where as the definition of the eBPF crates is more localised and thus only the person working on that needs to learn about.

You seem to be leaning on the other side of this trade-off for the aya repo which is completely fine. Do you feel the same way about what is recommended in the book and template? Do you want to leave those as is or shall we change them to use conditional compilation instead of default-members?

If we leave is as, I am happy to close this and the other PRs, there is nothing to do here then :)

Copy link

mergify bot commented Mar 27, 2025

@thomaseizinger, this pull request is now in conflict and requires a rebase.

@mergify mergify bot added the needs-rebase label Mar 27, 2025
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please rebase?

Being able to compile the eBPF crates for the host architecture is
useful because it allows `cargo build` and `cargo test` to "just work".

To achieve that, we feature-gate `no_std` and `no_main` on the bpf
target architecture.
@thomaseizinger thomaseizinger force-pushed the refactor/use-correct-cfg branch from 940d22a to 53c59ae Compare April 1, 2025 00:04
@mergify mergify bot removed the needs-rebase label Apr 1, 2025
@thomaseizinger
Copy link
Contributor Author

@tamird Rebased! I've dropped the panic-handler macro in favor of changing the cfg on the extern crate definition. Without that, cargo will still complain about duplicate panic-handlers when cargo check --workspace is used (or the crates get added to default-members).

@@ -133,6 +133,16 @@ pub fn check_bounds_signed(value: i64, lower: i64, upper: i64) -> bool {
}
}

#[macro_export]
macro_rules! main_stub {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am open to changing this to something else other than a macro by example if we can come up with something elegant.

AFAIK, eBPF progams can have multiple entry points so we cannot really attach it to another proc-macro, at least not in the general case.

Leaning onto the idea from @dave-tucker, we could define a aya_ebpf::main macro that people can add to their entrypoint:

#[xdp]
#[aya_ebpf::main]
pub fn handle(ctx: XdpContext) -> u32 {
	xdp_action::XDP_PASS
}

It is not very elegant though because the handler itself has nothing to do with the generated main stub, i.e. putting the attribute on any other function would also work.

Given that we already use nightly, we could try and build something on top of #![feature(custom_inner_attributes)]?

Using that, we could perhaps lump the no_std, no_main and main stub together so that users only have to define one macro, like:

#![aya_ebpf::program]

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 22 of 22 files at r16, 20 of 20 files at r17, all commit messages.
Reviewable status: 22 of 41 files reviewed, 13 unresolved discussions (waiting on @dave-tucker and @thomaseizinger)


test/integration-ebpf/src/bpf_probe_read.rs line 1 at r11 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

This is incorrect. This crate should be no_std whether of not you're compiling for the host target or bpf target.

Agreed. I don't understand why no_std got caught up here.


-- commits line 2 at r16:
this commit messages needs to include rationale.

https://cbea.ms/git-commit/


-- commits line 10 at r17:
why is feature gating no_std required?

also, please enumerate the excluded crates both here and in the cargo.toml and explain the reason.

Cargo.toml Outdated
"aya-tool",
"test-distro",
"test/integration-common",
# test/integration-test is omitted; including it in this list causes `cargo test` to run its
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please write a comment here about the omitted crates and why.

xtask/src/run.rs Outdated
profile,
])
cmd.env(AYA_BUILD_INTEGRATION_BPF, "true")
.env("RUSTFLAGS", "--cfg aya_integration_test")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(put it in the commit message where applicable)

@thomaseizinger
Copy link
Contributor Author

thomaseizinger commented Apr 2, 2025

Agreed. I don't understand why no_std got caught up here.

If I don't make no_std conditional on target_arch = "bpf", compiling for the host architecture fails with:

❯ cargo check
    Checking integration-ebpf v0.1.0 (/home/thomas/src/github.com/aya-rs/aya/test/integration-ebpf)
error: `#[panic_handler]` function required, but not found

error: unwinding panics are not supported without std
  |
  = help: using nightly cargo, use -Zbuild-std with panic="abort" to avoid unwinding
  = note: since the core library is usually precompiled with panic="unwind", rebuilding your crate with panic="abort" may not be enough to fix the problem

error: could not compile `integration-ebpf` (bin "bpf_probe_read") due to 2 previous errors

If I switch the extern crate ebpf_panic back to cfg(not(test)), I get:

❯ cargo check
    Checking integration-ebpf v0.1.0 (/home/thomas/src/github.com/aya-rs/aya/test/integration-ebpf)
error: unwinding panics are not supported without std
  |
  = help: using nightly cargo, use -Zbuild-std with panic="abort" to avoid unwinding
  = note: since the core library is usually precompiled with panic="unwind", rebuilding your crate with panic="abort" may not be enough to fix the problem

error: could not compile `integration-ebpf` (bin "bpf_probe_read") due to 1 previous error

also, please enumerate the excluded crates both here and in the cargo.toml and explain the reason.

No (new) crates are excluded, quite the opposite. We now include more crates in default-members because they no longer need special treatment!

@thomaseizinger
Copy link
Contributor Author

I'd prefer if we can hash this out verbally first before I keep iterating on the code. This has already taken much longer than I originally anticipated and at the moment, it doesn't feel like we are getting any closer to getting this merged.

As far as my experience is concerned, the combination of feature-gated no_std, no_main and panic-handler is required in combination with a main stub in order to make these crates compile on the host architecture. I am open to other ideas if you have any.

  1. Is this generally acceptable or do you want to retain unconditional no_std and no_main?
  2. Is a macro-by-example main stub acceptable? I don't have the capacity to contribute a proc-macro, sorry.

@alessandrod
Copy link
Collaborator

catching up with this PR just now, I'd be aginst any of this ending up in the book or template. I agree that it's valuable to be able to run on host architecture (for testing I presume?), but it's a niche case and we shouldn't be imposing extra cognitive burden on people who just want to write eBPF programs (not targeting host), especially to those just getting started who are the #1 consumers of the book and template.

Also it's not clear to me why we'd want the main stub to be in aya proper? Can't it be in the calling code/some other crate?

@thomaseizinger
Copy link
Contributor Author

catching up with this PR just now, I'd be aginst any of this ending up in the book or template. I agree that it's valuable to be able to run on host architecture (for testing I presume?), but it's a niche case and we shouldn't be imposing extra cognitive burden on people who just want to write eBPF programs (not targeting host), especially to those just getting started who are the #1 consumers of the book and template.

Thank you for clearly stating your position!

Coming from a mono-repo project where I integrated eBPF with aya, I had the opposite experience and would have appreciated more guidance by the book / template as to how set things up without interrupting the cargo-based workflow for the remaining crates. I've found the cognitive burden of compile-errors displayed by rust-analyzer as well as the inconsistency in how cargo clippy worked to be more confusing (due to default-members, I no longer saw clippy warnings for my eBPF code). I didn't know about default-members at first and so it took me a while until I actually understand how all of those pieces fit together and how I needed to adjust my code to not need any of that.

As I've stated elsewhere, I think that localising the added complexity to the eBPF crate itself (with conditional compilation) is better than requiring every other developer on the same project to learn why we are suddenly using default-members, why cargo clippy --workspace no longer works etc. The consequences are more far reaching than you initially think. Not being able to use --workspace means I'd have to adjust our CI to no longer use it. We build for multiple platforms where MacOS and Windows already only compile a select set of crates. The Linux build however uses --workspace to make sure we really run all tests on at least one platform. If somebody adds a crate now and forgets to update default-members, buggy code could slip through CI. To prevent that, I'd now have to add an additional lint based on cargo metadata that ensures default-members contains every crate but the specific eBPF one. Whilst writing that, I reached the point where I started looking for another solution because the added complexity started to be get out of hand.

Anyway, I am just repeating myself at this point. Thank you for all the input on the PR :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aya-bpf This is about aya-bpf (kernel) test A PR that improves test cases or CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add panic_handler to aya-ebpf
4 participants