Skip to content

aya: introduce uprobe location ElfSymOffset #1258

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

SergejIsbrecht
Copy link

@SergejIsbrecht SergejIsbrecht commented Apr 23, 2025

As discussed in #1198 I inserted a new enum value to specify the uprobe location analogous to https://github.com/libbpf/libbpf/blob/master/src/elf.c#L259C1-L270C2

Regarding testing. I did not see any unit-tests, just integration-tests. As I see it AbsoluteOffset is not tested either. Some help would be appreciated. Maybe something like https://github.com/vadorovsky/aya-examples/blob/main/uprobe/uprobe/src/main.rs ?

Fixes: #1198


This change is Reviewable

Transforms symbols's virtual address to file offset as required by the
Linux Kernel.
Reference: https://github.com/libbpf/libbpf/blob/master/src/elf.c#L259C1-L270C2

Fixes: aya-rs#1198
Copy link

netlify bot commented Apr 23, 2025

Deploy Preview for aya-rs-docs ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit c942278
🔍 Latest deploy log https://app.netlify.com/sites/aya-rs-docs/deploys/6808dfcc1bd3460008e1fb54
😎 Deploy Preview https://deploy-preview-1258--aya-rs-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@mergify mergify bot added the aya This is about aya (userspace) label Apr 23, 2025
@tamird tamird requested review from ajwerner and Copilot and removed request for ajwerner April 23, 2025 12:42
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new enum variant, ElfSymOffset, for specifying uprobe attachment locations by translating a symbol’s virtual address into a file offset. The changes include updating the AbsoluteOffset documentation, adding the ElfSymOffset variant with detailed documentation and an associated computation in the matching logic, and providing additional context on ELF symbol mapping.

/// for shared libs) into file offset, which is what kernel is expecting for
/// uprobe/uretprobe attachment.
/// See Documentation/trace/uprobetracer.rst for more details. This is done
/// by looking up symbol's containing section's header and using iter's
Copy link
Preview

Copilot AI Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible typo: 'iter's' might be a misspelling; consider replacing it with 'its' for clarity.

Suggested change
/// by looking up symbol's containing section's header and using iter's
/// by looking up symbol's containing section's header and using its

Copilot uses AI. Check for mistakes.

/// ## References:
/// * [libbpf](https://github.com/libbpf/libbpf/blob/master/src/elf.c#L259C1-L270C2)
/// * [StackOverflow](https://stackoverflow.com/questions/79478284/uprobe-symbol-adress-mapping-offset)
ElfSymOffset {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

I'm wary of explicitly exposing ELF details in the API. What is the advantage of adding this variant, vs letting the caller do the math and use Absolute? The caller must already know about ELF in order to pass correct values for st_value, sh_addr and sh_offset, so presumably they also know how to do the math?

Assuming we decide this has to be in aya (which atm I'm a bit skeptical about), I think it should probably be an utility that does the math and returns UprobeAttachLocation::Absolute?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Thank you for looking into it.

I'm wary of explicitly exposing ELF details in the API. What is the advantage of adding this variant, vs letting the caller do the math and use Absolute?

There is no direct advantage, except for making it really visible to the user, that by AbsoluteOffset the file offset is meant. I had quite a hard time to debug it, because I had a wrong assumption. I didn't know the kernel was expecting an absolute file-offset. Instead I gave the kernel the virtual address, which is quite silly in retrospect. By providing an analogous implementation to libbpf, we can make sure the caller will understand the difference.

The caller must already know about ELF in order to pass correct values for st_value, sh_addr and sh_offset, so presumably they also know how to do the math?

I think the the real problem is to actually understand what values you need to do the math in order to convert the virtual address to an absolute offset. There is basically no documentation regarding this whatsoever. I even asked in SO (https://stackoverflow.com/questions/79478284/uprobe-symbol-adress-mapping-offset), but didn't get an answer right away. As already set, my intention was to make this obvious to the user.

I think it should probably be an utility that does the math and returns UprobeAttachLocation::Absolute?

I guess this would be fine as well, but not that visible to the user.

Do you mean something like the From for u64, but with a tuple?

struct ElfSym {
    st_value: u64,
    sh_addr: u64,
    sh_offset: u64,
}

impl From<ElfSym> for UProbeAttachLocation<'static> {
    fn from(sym: ElfSym) -> Self {
        Self::AbsoluteOffset(sym.st_value - sym.sh_addr + sym.sh_offset)
    }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option that is more discoverable would be a constructor like this:

impl UProbeAttachLocation<'static> {
    /// ...
    pub fn from_symbol_address(
        instruction_address: u64,
        section_address: u64,
        section_offset: u64,
    )  -> Self {
        Self::AbsoluteOffset(instruction_address - section_address + section_offset)
    }
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no direct advantage, except for making it really visible to the user, that by AbsoluteOffset the file offset is meant. I had quite a hard time to debug it, because I had a wrong assumption. I didn't know the kernel was expecting an absolute file-offset. Instead I gave the kernel the virtual address, which is quite silly in retrospect.

Sorry about that! Sounds like this could be fixed by better docs?

By providing an analogous implementation to libbpf, we can make sure the caller will understand the difference.

Libbpf has a lot of APIs that I wouldn't want in aya, and this seems like one of them. If you look at UProbeAttachLocation today, it doesn't "leak" any ELF concepts. SymbolOffset takes a name and a number, which is pretty straightforward for "this function called X" at offset Y. AbsoluteOffset takes a number that does not require knowing anything about ELF itself.

What is your use case exactly? What are you trying to attach to?

Copy link
Author

@SergejIsbrecht SergejIsbrecht Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajwerner

Another option that is more discoverable would be a constructor like this:

Looks good to me. I will change it

@alessandrod

Sorry about that! Sounds like this could be fixed by better docs?

Yes, I guess extending the docs for AbsoluteOffset or better yet UProbe#attach. I think the problem was, that we just passed in u64 and that's it. To be fair, we are using the 0.13.1 version, which looks like this:

https://github.com/aya-rs/aya/blob/aya-v0.13.1/aya/src/programs/uprobe.rs#L81

    pub fn attach<T: AsRef<Path>>(
        &mut self,
        fn_name: Option<&str>,
        offset: u64,
        target: T,
        pid: Option<pid_t>,
    ) -> Result<UProbeLinkId, ProgramError> {

What is your use case exactly? What are you trying to attach to?

We are indexing a lot of shared objects on an android device and save the extracted offset to a database to not scrap the file offset every time

        let link_id = jni_program
            .attach(
                None,
                offset.unwrap(),
                "/apex/com.android.art/lib64/libart.so",
                None,
            )
            .map_err(EbpfError::ProgramError)?;
        jni_program
            .take_link(link_id)
            .map_err(EbpfError::ProgramError)

https://github.com/amosproj/amos2024ws03-android-zero-instrumentation/blob/f9f4986acdb518a11d604f07c700ed7e7cb96e07/rust/backend/daemon/src/features/jni_references.rs#L136-L147

I though we could change the API to something, which would require the client to acknowledge there is a difference between an virtual address and the file offset required by the Linux kernel. In retrospect introducing ElfSymOffset might not fix this problem, because it is still possible to just pass in an u64 without thinking twice whether the offset is actually the right one.

What about extending the attach doc and provide a utility function from_symbol_address, just like @ajwerner suggested?

/cc @fhilgers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I look back at it, there’s no symbol to speak of here, so maybe UProbeAttachLocatoin::from_virtual_address or UProbeAttachLocation::from_address?

@@ -55,8 +55,52 @@ pub enum UProbeAttachLocation<'a> {
/// The location of the target function in the target object file, offset by
/// the given number of bytes.
SymbolOffset(&'a str, u64),
/// The offset in the target object file, in bytes.
/// The **file** offset in the target object file, in bytes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeating the word file twice in the sentence is redundant. If you want to bold **file** later in the sentence, that's okay. Furthermore if you want to change the sentence to The target object **file** offset, in bytes. that's fine. Feel free to a following note if you feel like there's need for more clarity. Perhaps:

///
/// Note that this is not the virtual address of the instruction as it
/// would be loaded by the loader; it is the offset in the loaded object file.

/// ## References:
/// * [libbpf](https://github.com/libbpf/libbpf/blob/master/src/elf.c#L259C1-L270C2)
/// * [StackOverflow](https://stackoverflow.com/questions/79478284/uprobe-symbol-adress-mapping-offset)
ElfSymOffset {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option that is more discoverable would be a constructor like this:

impl UProbeAttachLocation<'static> {
    /// ...
    pub fn from_symbol_address(
        instruction_address: u64,
        section_address: u64,
        section_offset: u64,
    )  -> Self {
        Self::AbsoluteOffset(instruction_address - section_address + section_offset)
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aya This is about aya (userspace)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

uprobe offset for dynamic shared library not adjusted for ELF file .text section offset
3 participants