-
Notifications
You must be signed in to change notification settings - Fork 318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: allow eBPF crates to compile on host architecture #1229
refactor: allow eBPF crates to compile on host architecture #1229
Conversation
✅ Deploy Preview for aya-rs-docs ready!Built without sensitive environment variables
To edit notification comments on pull requests, go to your Netlify site configuration. |
target_arch
for panic-handler cfg
@tamird I've updated this to cfg the eBPF crates on the target architecture. We can't directly get rid of Do you want me to look into that as well as part of this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tamird I've updated this to cfg the eBPF crates on the target architecture. We can't directly get rid of
default-members
though because of the other integration tests. I haven't looked at the xtask code to see if it can be done.Do you want me to look into that as well as part of this?
Yeah, it would be great to get all the way to the finish line.
Also I think this code can be simplified now?
Lines 5 to 20 in 1db534d
# We cannot run clippy over the whole workspace at once due to feature unification. Since both | |
# integration-test and integration-ebpf depend on integration-common and integration-test activates | |
# integration-common's aya dependency, we end up trying to compile the panic handler twice: once | |
# from the bpf program, and again from std via aya. | |
# | |
# `-C panic=abort` because "unwinding panics are not supported without std"; integration-ebpf | |
# contains `#[no_std]` binaries. | |
# | |
# `-Zpanic_abort_tests` because "building tests with panic=abort is not supported without | |
# `-Zpanic_abort_tests`"; Cargo does this automatically when panic=abort is set via profile but we | |
# want to preserve unwinding at runtime - here we are just running clippy so we don't care about | |
# unwinding behavior. | |
# | |
# `+nightly` because "the option `Z` is only accepted on the nightly compiler". | |
cargo +nightly hack clippy "$@" --exclude integration-ebpf --all-targets --feature-powerset --workspace -- --deny warnings | |
cargo +nightly hack clippy "$@" --package integration-ebpf --all-targets --feature-powerset -- --deny warnings -C panic=abort -Zpanic_abort_tests |
xtask/src/run.rs
Outdated
profile, | ||
]) | ||
cmd.env(AYA_BUILD_INTEGRATION_BPF, "true") | ||
.env("RUSTFLAGS", "--cfg aya_integration_test") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tamird Setting a custom cfg here allows us to exclude the integration tests from compiling unless we compile them via xtask. It leads to some duplication across the test crates if we want to compile them without dead-code warnings etc.
I think that could be resolved if we were to write a custom test
macro (and one for test_case
and one for tokio::test
). I'd like to defer that to later if possible. My main interest is in removing the need for default-members
for users of aya which don't necessarily have such an xtask-based setup. For them, the introduction of aya_ebpf::main_stub
and moving the panic_handler
is the important bit.
I'll leave it up to you whether you want to include this in here or only merge the improvements around the main-stub and panic-handler :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer excluding the tests via default_members
. Disabling them via conditional compilation is too verbose - at least as implemented in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree it is pretty verbose and I don't like it either. @tamird suggested we might as well try to go the full way and try to get rid of default-members
entirely in here, even though it is more difficult than for a regular aya-based eBPF workspace.
I am happy to drop these again if @tamird is fine with keeping default-members
within the main repo just to keep the integration tests disabled..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine with any improvements that are clearly stated. If we're not able to remove default-members
, what are we accomplishing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we agree on a way how we can make the eBPF programs compile for host architectures (dealing with the main stub and all) and we think that is worthwhile updating the template and book as well, then what this PR is about is applying those changes here which removes drift between what the book and template recommend how to define eBPF programs and how the main repo does it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(put it in the commit message where applicable)
() => { | ||
#[cfg(not(target_arch = "bpf"))] | ||
fn main() { | ||
panic!(r#"eBPF kernels are not designed to be executed in user-space. This main function is only a placeholder to allow the code to compile on the host system (i.e. on any system that is not `target_arch = "bpf"`). This works in tandem with the `no_main` attribute which is only applied when compiling for `target_arch = "bpf"`."#) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #1229 (comment)
This is not a meaningful message like I asked. This is basically just saying "we need this because no_main only on bpf" but doesn't explain why we can't keep no_main in place when targeting not-bpf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please write out message that you'd like to see? 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd love to, but I actually don't know the reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The crate doesn't compile without a main function (for the host architecture). I guess the reasons is "binaries need an entrypoint"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're still not answering the question, or am I missing something? Why can't no_main
be used on the host?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't how else to say it, the code doesn't compile without a main
function:
error[E0601]: `main` function not found in crate `tcx`
--> test/integration-ebpf/src/tcx.rs:10:2
|
10 | }
| ^ consider adding a `main` function to `test/integration-ebpf/src/tcx.rs`
If I add the no_main
attribute back unconditionally, then it fails at link time:
error: linking with `/nix/store/xy542s1z1x4257pwdis32lxggff0ilkk-clang-wrapper-18.1.8/bin/clang` failed: exit status: 1
|
= note: LC_ALL="C" PATH="/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:/home/thomas/.cargo/bin:/run/wrappers/bin:/home/thomas/.nix-profile/bin:/nix/profile/bin:/home/thomas/.local/state/nix/profile/bin:/etc/profiles/per-user/thomas/bin:/nix/var/nix/profiles/default/bin:/run/current-system/sw/bin:/home/thomas/.zsh/plugins/powerlevel10k:/home/thomas/.zsh/plugins/zsh-nix-shell" VSLANG="1033" "/nix/store/xy542s1z1x4257pwdis32lxggff0ilkk-clang-wrapper-18.1.8/bin/clang" "-m64" "/tmp/rustcfWbebz/symbols.o" "<17 object files omitted>" "-Wl,--as-needed" "-Wl,-Bstatic" "/home/thomas/src/github.com/aya-rs/aya/target/debug/deps/{libintegration_common-7199cfc51e571138.rlib,libaya-29671cf570d77640.rlib,libonce_cell-d9c100d41321019c.rlib,libbitflags-859ae39afcf27678.rlib,libassert_matches-e66f24f8aa7afaf4.rlib,libtokio-5bb67de0f8e545f8.rlib,libsocket2-09fa8cf7b5bce950.rlib,libmio-1925bbb0f0827c81.rlib,libpin_project_lite-d0952d87a1353a36.rlib,liblibc-76a839478b7a8354.rlib,libaya_obj-5ff296ad2a1a8244.rlib,libthiserror-87232d1566e28900.rlib,libobject-89715a7346573fe7.rlib,libindexmap-9e0d9bcc9af9f37c.rlib,libhashbrown-060ab6f8299b021b.rlib,libfoldhash-e74b4bd1cf6c7986.rlib,libequivalent-4b5dae619175b0d3.rlib,liballocator_api2-0d18d2560b2240f9.rlib,libcrc32fast-3927d6bc625d2278.rlib,libcfg_if-7320f7c2ff486fd2.rlib,libmemchr-c683b493e630f757.rlib,liblog-0c18bbe7f7b3152f.rlib,libbytes-54e1836e6e796e5f.rlib,libserde-08bf56e3be229b6b.rlib,libaya_ebpf-d1e19b3d7ce546b7.rlib,libaya_ebpf_bindings-a4277a8e5542a5f7.rlib,libaya_ebpf_cty-92a90813681baf20.rlib}" "/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/{libstd-6273572f18644c87.rlib,libpanic_unwind-267e668abf74a283.rlib,libobject-ec6154ccae37a33e.rlib,libmemchr-500edd5521c440d4.rlib,libaddr2line-86d8d9428792e8ef.rlib,libgimli-10f06487503767c2.rlib,librustc_demangle-6a38424de1e5bca5.rlib,libstd_detect-de9763ea1c19dca3.rlib,libhashbrown-a7f5bb2f736d3c49.rlib,librustc_std_workspace_alloc-7e368919bdc4a44c.rlib,libminiz_oxide-376454d49910c786.rlib,libadler-fa99f5692b5dce85.rlib,libunwind-91cafdaf16f7fe40.rlib,libcfg_if-f7ee3f1ea78d9dae.rlib,liblibc-d3a35665f881365a.rlib,liballoc-715bc629a88bca60.rlib,librustc_std_workspace_core-ae70165d1278cff7.rlib,libcore-406129d0e3fbc101.rlib,libcompiler_builtins-1af05515ab19524a.rlib}" "-Wl,-Bdynamic" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "/home/thomas/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/home/thomas/src/github.com/aya-rs/aya/target/debug/deps/bpf_probe_read-449d9b05f6edf685" "-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-nodefaultlibs" "-fuse-ld=/nix/store/gbhcq80nkval1l1ssjs0c8sgrw83b7gc-mold-wrapper-2.34.1/bin/mold"
= note: some arguments are omitted. use `--verbose` to show all linker arguments
= note: mold: error: undefined symbol: main
>>> referenced by /nix/store/maxa3xhmxggrc5v2vc0c3pjb79hjlkp9-glibc-2.40-66/lib/Scrt1.o:(.text)
>>> /nix/store/maxa3xhmxggrc5v2vc0c3pjb79hjlkp9-glibc-2.40-66/lib/Scrt1.o:(_start)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
error: could not compile `integration-ebpf` (bin "bpf_probe_read") due to 1 previous error
Based on the above, it appears to me that "Binaries don't compile without a main function and therefore we need a stub when compiling for the host architecture" is a pretty solid explanation but I am open to other suggestions if you have a specific wording in mind.
ebpf/aya-ebpf/src/lib.rs
Outdated
|
||
/// Defines our panic handler when compiling for eBPF. | ||
/// | ||
/// eBPF programs are not allowed to panic, meaning this handler won't actually ever be called. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incorrect. It's not that BPF programs aren't allowed to panic, it is that there's no concept of panic in BPF.
We use a panic handler implemented as an infinite loop because the verifier disallows non-terminating loops. That guarantees that if any panics survive in the generated code, the verifier will reject the resulting program (because it would be an infinite loop).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've mentioned this in passing in other convos, but an unwinding panic handler could be implemented with special glue in the userspace loader code, where the loader would hot patch at program load time the panic code to unwind the stack and return some panic-default value (and perhaps print an error with aya-log).
Low on my priority list, but at least in theory we have the technology (tm)
Cargo.toml
Outdated
"aya-tool", | ||
"test-distro", | ||
"test/integration-common", | ||
# test/integration-test is omitted; including it in this list causes `cargo test` to run its |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if you run cargo test after this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It runs all tests apart from the integrartion tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please write a comment here about the omitted crates and why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a few scattered comments, but overall I'm a bit confused about what this PR is trying to solve.
I read the associated issues and it wasn't clear what root issue was nor what the minimal fix required could be.
The setup we have is far from ideal, but it is this way due to some missing cargo features (artifact-deps, per-package-target to name a few). As far as I know, aya-template works fine - clippy, rust-analyzer etc... and aya itself has a couple of nagging warnings about multiple panic_handlers from the integration-tests, but other than that it's fine.
The UX improvements for aya-ebpf don't seem like much of an improvement to me and likely warrant further discussion, especially given they are breaking changes. I'd prefer it if we could keep that separate from solving the root issue.
Reviewed 17 of 17 files at r1, 18 of 18 files at r2, 1 of 1 files at r3, 18 of 18 files at r4, 1 of 1 files at r5, 1 of 1 files at r6, 19 of 19 files at r7, 1 of 1 files at r8.
Reviewable status: 20 of 40 files reviewed, 15 unresolved discussions (waiting on @tamird and @thomaseizinger)
ebpf/aya-ebpf/src/lib.rs
line 137 at r9 (raw file):
#[macro_export] macro_rules! main_stub {
This doesn't seem very rusty.
I wonder if it might better be written as a proc macro (we already use this heavily in aya-ebpf).
#[derive(MainFunction)
#[xdp]
fn do_something ...
Which would expand out to:
#[cfg(not(target_arch = "bpf"))]
extern crate libc; // needs to link against libc - sigh
#[cfg(not(target_arch = "bpf"))]
#[no_mangle]
pub extern "C" fn main(_argc: isize, _argv: *const *const u8) -> isize {
unreachable!()
}
...
ebpf/aya-ebpf/src/lib.rs
line 153 at r11 (raw file):
/// Including the panic handler within `aya-ebpf` means every eBPF will automatically have this panic handler defined. #[cfg(target_arch = "bpf")] #[panic_handler]
I'm 👎 for defining a panic handler in aya-ebpf.
As implemented, there's no way for a user to opt-out and/or bring an alternative implementation. Also I believe this would be a breaking change for anyone who has a panic_handler already defined and upgrades aya-ebpf.
You could potentially gate this behind a feature to make it opt-in.
test/integration-test/src/tests/bpf_probe_read.rs
line 4 at r7 (raw file):
use integration_common::bpf_probe_read::{RESULT_BUF_LEN, TestResult}; #[cfg_attr(aya_integration_test, test_log::test)]
Is this better?
Before, I could add a test by using #[test]
.
Now I have 2 verbose #[cfg_attr]
to add to every test.
ebpf/aya-ebpf/src/lib.rs
line 137 at r4 (raw file):
#[macro_export] macro_rules! prelude {
I gave up trying to review this commit-by-commit at this point.
This prelude
macro appears to be removed later.
If this commit isn't required, then please drop it.
ebpf/aya-ebpf/src/lib.rs
line 139 at r6 (raw file):
macro_rules! prelude { () => { /// Defines our panic handler when compiling for eBPF.
s/our/the//g
ebpf/aya-ebpf/src/lib.rs
line 141 at r6 (raw file):
/// Defines our panic handler when compiling for eBPF. /// /// eBPF programs are not allowed to panic, meaning this handler won't actually ever be called.
This is incorrect. The panic handler will be called if you do something that could panic.
The program will fail to pass the verifier because an infinite loop won't verify.
test/integration-test/src/tests/btf_relocations.rs
line 3 at r7 (raw file):
use aya::{Btf, EbpfLoader, Endianness, maps::Array, programs::UProbe, util::KernelVersion}; #[cfg_attr(aya_integration_test, test_case::test_case("enum_signed_32", false, Some((KernelVersion::new(6, 0, 0), "https://github.com/torvalds/linux/commit/6089fb3")), -0x7AAAAAAAi32 as u64))]
Same comment here re: usability.
test/integration-ebpf/src/bpf_probe_read.rs
line 1 at r11 (raw file):
#![cfg_attr(target_arch = "bpf", no_std)]
This is incorrect. This crate should be no_std
whether of not you're compiling for the host target or bpf target.
xtask/src/run.rs
Outdated
profile, | ||
]) | ||
cmd.env(AYA_BUILD_INTEGRATION_BPF, "true") | ||
.env("RUSTFLAGS", "--cfg aya_integration_test") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer excluding the tests via default_members
. Disabling them via conditional compilation is too verbose - at least as implemented in this PR.
Thank you for the review @dave-tucker ! Sorry for not providing more context. I've been working with aya in another project with a monorepo and having to define and maintain Instead of just updating the template and book, I figured updating the main repo to use this would also be nice so that there is no drift. But the addition of xtask and how the integration tests are laid out added more requirements as to what we need to do to get rid of |
2e8d8a9
to
0efde5c
Compare
So I've experimented with a few more combinations of
I conclude from that that we have to make |
d246b9f
to
940d22a
Compare
@tamird @dave-tucker I removed the conditional compilation of the integration tests and tried to group the remaining changes into meaningful commits. Let me know what you think. |
FYI @thomaseizinger I've just opened this #1234 which is somewhat related. This conversation here + the clippy warnings when working in this repo forced my hand.
You may need to try this again with my PR linked above. Full details in the commit message.
Your conclusion is somewhat correct. The linker expects a If you need them to compile - to some definition of the word since they would never run on that target - then the onus is on you to work within these constraints (see my comments above on for adding a Coming back to |
I agree fully, it is a trade-off. The question is, do you want to leverage hacks to make the tools of today work well with the workspace or do you want to stick what is strictly correct in terms of how the crates are defined? My conclusion was: I'd rather use a few hacks because it is more important to me that other people in my team can come to the repo and use cargo the way they are used to and don't have to learn about eBPF and cargo's limitiations now. I also consider the use of You seem to be leaning on the other side of this trade-off for the aya repo which is completely fine. Do you feel the same way about what is recommended in the book and template? Do you want to leave those as is or shall we change them to use conditional compilation instead of default-members? If we leave is as, I am happy to close this and the other PRs, there is nothing to do here then :) |
@thomaseizinger, this pull request is now in conflict and requires a rebase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please rebase?
Being able to compile the eBPF crates for the host architecture is useful because it allows `cargo build` and `cargo test` to "just work". To achieve that, we feature-gate `no_std` and `no_main` on the bpf target architecture.
940d22a
to
53c59ae
Compare
@tamird Rebased! I've dropped the panic-handler macro in favor of changing the |
@@ -133,6 +133,16 @@ pub fn check_bounds_signed(value: i64, lower: i64, upper: i64) -> bool { | |||
} | |||
} | |||
|
|||
#[macro_export] | |||
macro_rules! main_stub { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am open to changing this to something else other than a macro by example if we can come up with something elegant.
AFAIK, eBPF progams can have multiple entry points so we cannot really attach it to another proc-macro, at least not in the general case.
Leaning onto the idea from @dave-tucker, we could define a aya_ebpf::main
macro that people can add to their entrypoint:
#[xdp]
#[aya_ebpf::main]
pub fn handle(ctx: XdpContext) -> u32 {
xdp_action::XDP_PASS
}
It is not very elegant though because the handler itself has nothing to do with the generated main
stub, i.e. putting the attribute on any other function would also work.
Given that we already use nightly, we could try and build something on top of #![feature(custom_inner_attributes)]
?
Using that, we could perhaps lump the no_std
, no_main
and main
stub together so that users only have to define one macro, like:
#![aya_ebpf::program]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 22 of 22 files at r16, 20 of 20 files at r17, all commit messages.
Reviewable status: 22 of 41 files reviewed, 13 unresolved discussions (waiting on @dave-tucker and @thomaseizinger)
test/integration-ebpf/src/bpf_probe_read.rs
line 1 at r11 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
This is incorrect. This crate should be
no_std
whether of not you're compiling for the host target or bpf target.
Agreed. I don't understand why no_std
got caught up here.
-- commits
line 2 at r16:
this commit messages needs to include rationale.
-- commits
line 10 at r17:
why is feature gating no_std required?
also, please enumerate the excluded crates both here and in the cargo.toml and explain the reason.
Cargo.toml
Outdated
"aya-tool", | ||
"test-distro", | ||
"test/integration-common", | ||
# test/integration-test is omitted; including it in this list causes `cargo test` to run its |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please write a comment here about the omitted crates and why.
xtask/src/run.rs
Outdated
profile, | ||
]) | ||
cmd.env(AYA_BUILD_INTEGRATION_BPF, "true") | ||
.env("RUSTFLAGS", "--cfg aya_integration_test") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(put it in the commit message where applicable)
If I don't make
If I switch the
No (new) crates are excluded, quite the opposite. We now include more crates in |
I'd prefer if we can hash this out verbally first before I keep iterating on the code. This has already taken much longer than I originally anticipated and at the moment, it doesn't feel like we are getting any closer to getting this merged. As far as my experience is concerned, the combination of feature-gated
|
catching up with this PR just now, I'd be aginst any of this ending up in the book or template. I agree that it's valuable to be able to run on host architecture (for testing I presume?), but it's a niche case and we shouldn't be imposing extra cognitive burden on people who just want to write eBPF programs (not targeting host), especially to those just getting started who are the #1 consumers of the book and template. Also it's not clear to me why we'd want the main stub to be in aya proper? Can't it be in the calling code/some other crate? |
Thank you for clearly stating your position! Coming from a mono-repo project where I integrated eBPF with aya, I had the opposite experience and would have appreciated more guidance by the book / template as to how set things up without interrupting the As I've stated elsewhere, I think that localising the added complexity to the eBPF crate itself (with conditional compilation) is better than requiring every other developer on the same project to learn why we are suddenly using Anyway, I am just repeating myself at this point. Thank you for all the input on the PR :) |
Related: aya-rs/book#214
Resolves: #1230
This change is