
How to safely use the middle API when the signature isn't known until runtime. #54


Open
vext01 opened this issue Apr 22, 2022 · 3 comments

vext01 commented Apr 22, 2022

Hi,

I've recently needed to use the libffi crate to allow an interpreter to make native calls to external functions whose signatures are not known until runtime.

The examples in the repo demonstrate using statically-known signatures (and thus static arrays of arguments). The examples are of limited utility, because if you knew the signatures you wanted to call ahead of time, you'd just call an extern-declared symbol directly without libffi. As far as I know, libffi is intended for scenarios where the type signatures are dynamic.

Once you start dealing with calls with dynamic types, it becomes unclear how best to use the crate safely. For example, my first attempt looked a bit like:

use libffi::middle::*;
use std::ffi::c_void;

// Pretend that this function is from an external dynamic C library.
fn f(a: u64, b: u64, c:u64) {
    println!("a={}, b={}, c={}", a, b, c);
}

fn call_ext(addr: *mut c_void, num_args: usize) {
    let mut args = Vec::new();
    let mut builder = Builder::new();
    for i in 0..num_args {
        args.push(arg(&i)); // Bad!
        builder = builder.arg(Type::u64());
    }
    builder = builder.res(Type::void());
    let cif = builder.into_cif();
    unsafe {cif.call::<()>(CodePtr(addr), &args)};
}

fn main() {
    call_ext(&f as *const _ as *mut c_void, 3);
}

This program is UB because arg() calls Arg::new(), which looks like:

    pub fn new<T>(r: &T) -> Self {
        Arg(r as *const T as *mut c_void)
    }

So it stashes a raw pointer to its argument, but in the case of my example above:

  • The same memory is re-used for &i on each iteration of the loop, so we'd unintentionally be doing the call with 3 identical arguments.
  • The storage for i dies at the end of each loop iteration, so the Arg stores a dangling pointer. Using the resulting pointers invokes UB.

(Another gotcha here is that we've inadvertently made a vector of usize args, since we didn't add explicit type annotations. But that's another story.)
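The dangling-pointer problem described above can be modelled without libffi. The sketch below uses a hypothetical `BorrowedArg` type (not part of the crate) that stores the same raw pointer as the quoted `Arg::new` but also records the lifetime of the borrow. It is only an illustration of the storage rule, under the assumption that all arguments are `u64`; reading the values back stands in for the foreign call.

```rust
use std::ffi::c_void;
use std::marker::PhantomData;

/// Hypothetical stand-in for `libffi::middle::Arg` that remembers the borrow.
/// The `Arg::new` quoted above stores only the raw pointer.
struct BorrowedArg<'a> {
    ptr: *mut c_void,
    _borrow: PhantomData<&'a ()>,
}

impl<'a> BorrowedArg<'a> {
    fn new<T>(r: &'a T) -> Self {
        BorrowedArg {
            ptr: r as *const T as *mut c_void,
            _borrow: PhantomData,
        }
    }
}

/// Reads every argument back as a `u64`, standing in for the foreign call.
///
/// # Safety
/// Every element of `args` must have been built from a live `&u64`.
unsafe fn read_back(args: &[BorrowedArg<'_>]) -> Vec<u64> {
    args.iter()
        .map(|a| unsafe { *(a.ptr as *const u64) })
        .collect()
}

fn build_and_read(num_args: u64) -> Vec<u64> {
    // The storage is created first and outlives every pointer derived from it.
    let values: Vec<u64> = (0..num_args).collect();
    let args: Vec<BorrowedArg<'_>> = values.iter().map(BorrowedArg::new).collect();
    // Writing `args.push(BorrowedArg::new(&i))` inside a `for i in ..` loop
    // is rejected by the compiler here: `i` does not live long enough.
    // SAFETY: every element of `args` points at a live `u64` inside `values`.
    unsafe { read_back(&args) }
}

fn main() {
    println!("{:?}", build_and_read(3));
}
```

With the lifetime attached, the loop from the first attempt fails to compile instead of producing dangling pointers at runtime.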

My next attempt was to do something like:

use libffi::middle::*;
use std::ffi::c_void;

// Pretend that this function is from an external dynamic C library.
fn f(a: u64, b: u64, c:u64) {
    println!("a={}, b={}, c={}", a, b, c);
}

fn call_ext(addr: *mut c_void, num_args: usize) {
    let mut builder = Builder::new();
    let mut args: Vec<u64> = Vec::new(); // <-----------------
    for i in 0..num_args {
        args.push(u64::try_from(i).unwrap());
        builder = builder.arg(Type::u64());
    }
    let cif = builder.into_cif();
    let ffi_args = args.iter().map(|a| arg(a)).collect::<Vec<Arg>>(); // <-----------------
    unsafe {cif.call::<()>(CodePtr(addr), &ffi_args)};
}

fn main() {
    call_ext(&f as *const _ as *mut c_void, 3);
    println!("Hello, world!");
}

This solves the above problems by:

  • ensuring that the storage for each argument is a distinct memory address
  • ensuring that the storage out-lives the ffi call.

But I'm still not certain that this is correct. The example assumes that the backing storage of the vector holding the arguments is not moved: a guarantee I don't think we have(?).
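On the question of whether the backing storage can move: a `Vec` keeps its elements in a heap buffer, and moving the `Vec` value itself copies only its pointer, capacity and length. The buffer is relocated only by operations that reallocate, such as a `push` past capacity. The std-only sketch below checks this; the function names are made up for illustration, and it assumes nothing mutates the vector between taking the pointers and using them.

```rust
/// Returns the heap address of the buffer before and after moving the Vec.
fn buffer_addr_across_move() -> (usize, usize) {
    let args: Vec<u64> = vec![10, 20, 30];
    let before = args.as_ptr() as usize;
    let moved = args; // moves the (pointer, capacity, length) triple only
    let after = moved.as_ptr() as usize;
    (before, after)
}

/// Takes raw pointers to each element and reads through them,
/// standing in for what a foreign call would do with its argument array.
fn sum_through_pointers() -> u64 {
    let args: Vec<u64> = vec![10, 20, 30];
    let ptrs: Vec<*const u64> = args.iter().map(|a| a as *const u64).collect();
    // `args` is not pushed to, truncated or dropped before the reads below,
    // so its buffer cannot have been reallocated or freed.
    // SAFETY: each pointer refers to a live element of `args`.
    let sum: u64 = ptrs.iter().map(|&p| unsafe { *p }).sum();
    sum
}

fn main() {
    let (before, after) = buffer_addr_across_move();
    println!("same buffer after move: {}", before == after);
    println!("sum: {}", sum_through_pointers());
}
```

Because raw pointers carry no borrow, the compiler will not stop a later `args.push(..)` from invalidating them; keeping the vector untouched until the call returns is a manual obligation in this pattern.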

My next attempt would be to take a slice from the argument vector. By taking a slice, Rust cannot move the vector's backing storage (if it wanted to, the program would (hopefully?) not compile). But having shown this to colleagues, they have concerns about pointer aliasing rules.

So my question is: what is the correct and safe way to use libffi::middle to create dynamically-typed calls to external functions?

The only solution I can see is to manually manage an unmovable chunk of memory with malloc() (or using something like the alloc crate). There has to be a better way.
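One possible middle ground between a plain vector and hand-managed `malloc()` memory is to give each argument its own `Box`. This is a sketch under assumptions, not the crate authors' recommendation: the `Value` enum and `as_arg_ptr` are hypothetical names for an interpreter's value type. A boxed payload keeps its heap address even when the outer vector grows and reallocates, and the arguments may have different types.

```rust
use std::ffi::c_void;

/// Hypothetical interpreter value; each payload lives in its own heap box.
enum Value {
    U64(Box<u64>),
    F64(Box<f64>),
}

impl Value {
    /// Address of the payload, in the untyped form an argument array expects.
    fn as_arg_ptr(&self) -> *mut c_void {
        match self {
            Value::U64(b) => &**b as *const u64 as *mut c_void,
            Value::F64(b) => &**b as *const f64 as *mut c_void,
        }
    }
}

/// Checks that a payload address survives growth of the outer Vec.
fn first_payload_is_stable() -> bool {
    let mut values = vec![Value::U64(Box::new(7))];
    let before = values[0].as_arg_ptr();
    // Push enough elements to force the outer Vec to reallocate.
    for i in 0..1000 {
        values.push(Value::F64(Box::new(i as f64)));
    }
    let after = values[0].as_arg_ptr();
    // SAFETY: the first box is still alive and holds a `u64`.
    before == after && unsafe { *(after as *const u64) } == 7
}

fn main() {
    println!("payload address stable: {}", first_payload_is_stable());
}
```

The cost is one allocation per argument, which may or may not matter for an interpreter's call path.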

(FWIW, all of the programs I've shown so far seg-fault, although when I used the approach from the latter example in my interpreter, it did work, but perhaps by chance).

(Side question: You can't use libffi::high to do dynamic calls, can you? You'd need the ability to dynamically create a Rust type signature as far as I can see, and if you could do that, you wouldn't need libffi.)

Thanks!

vext01 changed the title from "How to safely use the middle API when the signature isn't know until runtime." to "How to safely use the middle API when the signature isn't known until runtime." Apr 22, 2022
Author

vext01 commented Apr 22, 2022

call_ext(&f as *const _ as *mut c_void, 3);

should be:

call_ext(f as *const _ as *mut c_void, 3);

i.e. f not &f.
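The difference between `f` and `&f` can be checked with the standard library alone. In Rust, a function's name denotes a zero-sized function item, so `&f` is a reference to that zero-sized value rather than the code address, while `f as *const c_void` yields the address of the code. The sketch below uses a made-up function `add3` and declares it `extern "C"`; that declaration is an assumption added here, on the grounds that a call made through a C FFI layer generally expects the C calling convention rather than Rust's unspecified one.

```rust
use std::ffi::c_void;

// Hypothetical callee, declared with the C ABI.
extern "C" fn add3(a: u64, b: u64, c: u64) -> u64 {
    a + b + c
}

/// Round-trips the function's code address through `*const c_void`.
fn call_via_code_address() -> u64 {
    let code = add3 as *const c_void;
    // SAFETY: `code` was produced from `add3`, so the signature matches.
    let back: extern "C" fn(u64, u64, u64) -> u64 =
        unsafe { std::mem::transmute(code) };
    back(1, 2, 3)
}

fn main() {
    // A function item is zero-sized, so `&add3` cannot carry the code address.
    println!("size of fn item: {}", std::mem::size_of_val(&add3));
    println!("sum via code address: {}", call_via_code_address());
}
```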

Question still stands though.

@yorickpeterse
Collaborator

I can't really answer this unfortunately, as I don't use the high or middle APIs myself. Instead I'm using the low-level APIs, as I found it easier to bind to from my language. With that said, I think it's going to be difficult to have an API that doesn't require any unsafe code, as the very idea of libffi is unsafe times 100. Maybe @tov can answer your question, but he's only active sporadically so it may take some time to get an answer.

Author

vext01 commented Apr 22, 2022

Just to clarify, my question is not about whether the API can be safe. libffi is obviously rather unsafe by definition.

My question is more about how the authors of the crate intend their API be used for dynamically-typed calls without invoking UB.
