Skip to content

Unsafe fields #3458

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 24 commits into
base: master
Choose a base branch
from
Open

Unsafe fields #3458

wants to merge 24 commits into from

Conversation

jhpratt
Copy link
Member

@jhpratt jhpratt commented Jul 13, 2023


// Unsafe field initialization requires an `unsafe` block.
// Safety: `unsafe_field` is odd.
let mut foo = unsafe {
Copy link
Contributor

@juntyr juntyr Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure I like that the entire struct expression is now inside an unsafe block (though I’m not sure what a better syntax would be). If the safety invariant requires the entire struct, this makes sense. However, if it is more specific, a larger unsafe block is too broad as it also allows unsafe code usage to initialise the safe fields, which should get their own unsafe blocks. Though perhaps this could just be linted against, e.g. don’t use unsafe expressions in a struct initialiser without putting them inside nested unsafe blocks.

Copy link
Member Author

@jhpratt jhpratt Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative is to do this:

let mut foo = Foo {
    safe_field: 0,
    unsafe_field: unsafe { 1 },
};

That feels worse to me.

Presumably the safety invariant being promised is an invariant within the struct, even if it is not strictly required that that be the case.

Ideally we'd also have partial initialization, such that it would be possible to do

let mut foo = Foo { safe_field: 0 };
unsafe { foo.unsafe_field = 1; }

but I expect that's quite a bit farther away.

Copy link
Contributor

@juntyr juntyr Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that just unsafe in one field initialiser expression is insufficient as it means something very different. I do like however that it clearly shows which field the unsafety applies to.

It reminds me a bit of unsafe blocks in unsafe functions (#2585). Ideally, the fact that struct init is unsafe would not allow unsafe field init expressions. I doubt that special-casing struct inits to not propagate outer unsafe blocks would be backwards compatible, so a new lint would be the only avenue in this direction.

Some new syntax like this

let mut foo = unsafe Foo {
    safe_field: 0,
    unsafe unsafe_field: 1,
};

would communicate intent better but looks quite unnatural to me.

Perhaps in the future there might be explicit safe blocks to reassert a safe context within an unsafe block, so that it could be written as follows:

let mut foo = unsafe {
    Foo {
        safe_field: safe { /* some more complex expr here */ },
        unsafe_field: safe { /* some more complex expr here */ },
    }
};

which could be encouraged with clippy lints. Though for now just moving safe non-trivial struct field initialisers into variables that are initialised outside the unsafe block.

Overall, I think going with the original syntax of

let mut foo = unsafe {
    Foo {
        safe_field: 2,
        unsafe_field: 1,
    }
};

looks like a good and intuitive solution, though I still think that a lint against using unsafe expressions inside the struct expression without a nested unsafe block would still help.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this?

let mut foo = unsafe Foo {
    safe_field: 0,
    unsafe_field: 1,
};

unsafe would be followed immediately by a struct expression, and doing so would only indicate that unsafe fields may be initialized. It would not introduce an unsafe context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it - the syntax is still close enough to the unsafe block syntax (except when initialising tuple structs) that it's relatively intuitive what the unsafety applies to, but coupled strongly to the struct name so that it also makes sense why no unsafe context is introduced.

Copy link
Member

@scottmcm scottmcm Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's something nice about

let mut foo = unsafe Foo {
    safe_field: 0,
    unsafe_field: 1,
};

but to me that would go with unsafe struct Foo, like how unsafe trait Bar leads to unsafe impl Bar.

Spitballing: if per-field safety is really needed, then maybe

let mut foo = Foo {
    safe_field: 0,
    unsafe unsafe_field: 1,
};

though maybe the answer is the same as for unsafe { unsafe_fn_call(complicated_expr) }: if you don't want the expr in the block, use a let.

After all, there's always field shorthand available, so you can do

let safe_field = unsafe { complex to construct };
let unsafe_field = easy and safe to construct;
unsafe { Foo { safe_field, unsafe_field } }

Perhaps the more important factor would be the workflow for the errors the programmer gets when changing a (internal and thus not semver break) field to unsafe. Anything at the struct level wouldn't give new "hey, you need unsafe here" if you already had another field that was unsafe.

Of course, if the safety is at the struct level (not the field level) then that doesn't come up.


Hmm, the "you added an additional requirement to something that's already in an unsafe block and there's no way to help you make sure you handed that" is a pre-existing problem that needs tooling for that everywhere, so maybe it's not something this RFC needs to think about.

If we had unsafe(aligned(a), nonzero(b)) { ... } so that tooling could help ensure that people thought about everything declared in the safety section, then we'd just have unsafe(initialized(ptr, len)) { ... } so that it acknowledged the discharge of the obligation for the type invariant, and the "I need 100 unsafe blocks" problem goes away.

@algesten
Copy link

By introducing unsafe fields, Rust can improve the situation where a field that is otherwise safe is used as a safety invariant.

I think the motivation could point out who would benefit from this.

I assume it's library authors a making unsafe fields a "reminder to self" about upholding some invariant as opposed to say expecting a unsafe field in an API surface. I.e we still expect unsafe set_len rather than pub unsafe len, right?

@programmerjake
Copy link
Member

this seems closely related to mut(self) fields #3323 which should probably be mentioned.

@jhpratt
Copy link
Member Author

jhpratt commented Jul 13, 2023

I assume it's library authors a making unsafe fields a "reminder to self" about upholding some invariant as opposed to say expecting a unsafe field in an API surface. I.e we still expect unsafe set_len rather than pub unsafe len, right?

Not necessarily. It is entirely reasonable that Vec.len could be exposed. Whether it actually is exposed is a decision solely for T-libs-api. Likewise with the inner fields of the various nonzero types. I'd rather not tie down who this is intended for, as it truly is intended for everyone. I know I've written a binary that had fields relied upon in unsafe code — there was just no way to make it actually unsafe.

this seems closely related to mut(self) fields #3323 which should probably be mentioned.

Related, sure, but beyond the mention in unresolved questions, I'm not sure how it could be mentioned. Pretty much the only overlap is what's considered a "mutable access", which I didn't feel necessary to be copy-pasted.

@jsgf
Copy link

jsgf commented Jul 13, 2023

How does this look with functional struct update?

Foo {
// Stuff
.. unsafe { other }
}

?
Or does the whole initializer need to be unsafe?

Edit: or I guess it doesn't need unsafe if the source had been initialized with unsafe.

@jhpratt
Copy link
Member Author

jhpratt commented Jul 13, 2023

@jsgf Great question. I don't have an immediate answer, though I believe the mechanism currently in the compiler would require the entire initializer to be unsafe.

@mo8it
Copy link

mo8it commented Jul 13, 2023

I love it! This is much better than getters and setters, both for library authors and users.

typed-builder would have to adjust, but I would love to implement this feature there :)

@idanarye (the main author) Maybe you have some input regarding the builder pattern?

@djc
Copy link
Contributor

djc commented Jul 13, 2023

The implicit notion that only mutation is unsafe (and reading is not) seems tricky. Do you have reasoning to prove that we'll never need fields that are unsafe to read? Maybe there should be an alternative syntax proposal (like unsafe(mut) or mut(unsafe)) that makes this more obvious/explicit?

@idanarye
Copy link

@mo8it I don't want to spam the comments here with a discussion about typed builder, so I've opened a discussion in my repository instead: idanarye/rust-typed-builder#103

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Jul 13, 2023
@idanarye
Copy link

Regarding the RFC itself - I think you are trying to solve a visibility issue with the safety mechanism, which is the wrong tool for the job. You gave Vec as an example, and it looks like you want to grant public read access to its len field (so that you don't have to use len() as a method? Let's ignore the question if that's big enough an improvement to justify such a feature). To do so, you are willing to provide public write access to it as well but make that access unsafe.

But why?

I mean, it's obvious why you don't want to give regular write access to len. But why give any access at all? Even if you require an unsafe block, what good will come from letting external users modify the len field without going through a method that upholds the invariant? As long as we are devising a brand new feature, wouldn't it make more sense to add a feature that gives public read access without any write access at all (other than private access from inside the defining module, of course)?

I'm aware of the set_len method that grants such access, but this is an explicit decision to give such access, with a fully documented method. Not a side-effect of wanting to provide a non-method-call read access to the field.


Another thing - conceptually it never makes sense to define only one field as unsafe. The invariant is a property of the struct as a whole. If this is unsafe:

let mut vec = Vec::new();
vec.len = 4; // UNSAFE!!!

The why not this?

let mut vec = Vec::from([1, 2, 3, 4]);
vec.buf = RawVec::new(); // perfectly safe apparently

Yes, buf is not publicly exposed at all, but inside the module len will need unsafe block to modify and buf won't, even though the invariant is about both of them, together, and how they interact with each other.

Whatever the semantics of unsafe fields will be - conceptually it makes more sense to put the unsafe on the struct itself. If there are fields that are not part of the invariant, they should not be part of the struct - because unsafety should be contained as much as possible and not contaminated with unrelated data. The only reason putting unsafe only on len seems to make case in your example is that the goal - as I've said before - is about visibility, not about safety.

## "Mutable use" in the compiler

The concept of a "mutable use" [already exists][mutating use] within the compiler. This catches all
situations that are relevant here, including `ptr::addr_of_mut!`, `&mut`, and direct assignment to a
Copy link
Contributor

@Jules-Bertholet Jules-Bertholet Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One could argue that ptr::addr_of_mut! on an unsafe field need not be unsafe, because writes through the pointer are unsafe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider this in conjunction with the examples present in the RFC.

fn make_zero(p: *mut u8) {
    unsafe { *p = 0; }
}

let p = ptr::addr_of_mut!(foo.unsafe_field);
make_zero(p);

Ignoring thread-safety, which can be easily achieved with locks, no single step appears wrong. make_zero does not do anything wrong — assigning zero to an arbitrary *mut u8 is fine. Passing a pointer to the method is naturally okay. Yet it still results in unsoundness, as foo.unsafe_field must be odd.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"assigning zero to an arbitrary *mut u8 is fine" what‽ No it is not fine‽ An arbitrary *mut u8 could be null, dangling, aliased... fn make_zero is unsound.

Copy link
Member Author

@jhpratt jhpratt Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True — I typed that far too quickly and without thinking. Regardless, it's not immediately obvious to me that ptr::addr_of_mut! should be allowed safely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not fully convinced either way, but I fear it would just be confusing to require unsafe for an operation that can't lead to unsoundness, especially as addr_of!(struct.field) as *mut _ would do the same thing with no unsafe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

especially as addr_of!(struct.field) as *mut _ would do the same thing

While you can do that, it's undefined behavior to actually mutate the resulting mut pointer. I've just confirmed this with miri.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's UB under Stacked Borrows, but MIRIFLAGS=-Zmiri-tree-borrows accepts it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's surprising. I'm not familiar with tree borrows, admittedly.

Copy link

@CodesInChaos CodesInChaos Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ptr::from_mut(&mut struct).wrapping_offset(offset_of!(Struct, field)) should achieve the same thing as addr_of_mut!, and I doubt you'd want to make offset_of unsafe for unsafe fields. And even without offset_of, you could use ptr::from_mut(&mut struct).wrapping_offset(addr_of!(stuct.Field) - ptr::from_ref(&struct)). Both of these should be safe in every borrowing model.

@burdges
Copy link

burdges commented Jul 13, 2023

Vec::len should not do this even if this feature exists, because Vec::set_len is better pedagogically.

static muts require unsafe blocks for both writing and reading. An unsafe field would likely be some similar construction, so unsafe for both writing and reading. An UnsafeCell already hits those requirements, but any variants should set their auto-traits.

Ain't clear this proposal handles auto-traits correctly, even if some use case exists. If you need this, then define your own type which provides this. We've the inner-builder or whatever deref polymorphism pattern, which comes up far more in practice, and can simulate this of desired.

pub struct ThingBuilder { ... }

impl ThingBuilder {
    fn build(self) -> Thing {
        ...
        Thing { ..., inner, self }
    }
}

pub struct Thing {
    ... 
    inner: ThingBuilder,
}

impl Deref for Thing {
    type Target = ThingBuilder;
    fn deref(&self) -> ThingBuilder {
        &self.inner
    }
}

// We stop mutating ThingBuilder once we create a Thing, so Thing: !DerefMut,
// but Thing: Deref<Target=ThingBuilder> to make reading & replicating the
// builder config easy.

@jhpratt
Copy link
Member Author

jhpratt commented Jul 14, 2023

Do you have reasoning to prove that we'll never need fields that are unsafe to read?

How could a field of a struct be unsafe to read?


You gave Vec as an example, and it looks like you want to grant public read access to its len field (so that you don't have to use len() as a method?

Within the module it's defined in (as the field is private), it is currently safe to assign any value, despite the fact that it can lead to undefined behavior. Said another way, the current behavior is inherently unsound.

Nothing in the RFC so much as hints at Vec.len being made public, nor is an RFC an appropriate place to make a change like that. It is an example of a field that should be unsafe to avoid unsoundness and nothing more.

Another thing - conceptually it never makes sense to define only one field as unsafe.

I never claimed that was the case.

The why not this?

let mut vec = Vec::from([1, 2, 3, 4]);
vec.buf = RawVec::new(); // perfectly safe apparently

Inclusion of one example does not mean that everything not included is forbidden. There is simply no reason to repeat the same thing for every field. Of course buf would also be unsafe. I used Vec.len as the example because it's a clear, obvious example where its safety invariants are publicly documented.

The invariant is a property of the struct as a whole.

If there are fields that are not part of the invariant, they should not be part of the struct

For Vec, yes, but only because all fields of the strict interact with all other fields.

I have real world code where this is not the case. The flags field is for whether other fields are initialized. Note that some fields have niche value optimization, and as such don't interact with other fields in any way. Are you asserting that month: Option<Month> and other similar fields should be in a separate struct solely because it has niche value optimization? That appears to a logical conclusion as a result of what you've said.


static muts require unsafe blocks for both writing and reading. An unsafe field would likely be some similar construction, so unsafe for both writing and reading.

static mut requires unsafe due to inherent data races between threads. Unsafe fields have no such issue.

Ain't clear this proposal handles auto-traits correctly, even if some use case exists.

I don't follow. What problems do you see with handling auto traits? The type of the field is unchanged, so there would be no impact on auto traits.

@@ -0,0 +1,137 @@
- Feature Name: `unsafe_fields`
Copy link
Member

@scottmcm scottmcm Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The appears to be missing the https://github.com/rust-lang/rfcs/blob/master/0000-template.md#rationale-and-alternatives section. Please add one, with a bunch of subsections for various "here's why I picked this way over some other way" decisions you made.

For example, that's a great place to talk about unsafe structs vs unsafe fields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A Rationale and Alternatives section is now taking shape!

@scottmcm
Copy link
Member

scottmcm commented Jul 14, 2023

(@jhpratt I'd love for you to steal something like the text in this post for the RFC)

I assume it's library authors a making unsafe fields a "reminder to self" about upholding some invariant as opposed to say expecting a unsafe field in an API surface.

A huge incentive for it, to me at least, is helping avoid misunderstandings about the model.

For example, a 2020 PACMPL paper contains the following statement:

To check how prevalent unsafe is used as a documentation feature, our queries gathered data for unsafe traits and unsafe functions with safe implementations.

set_len being an unsafe fn is of course not merely a "documentation feature". It's an absolutely critical part of the soundness of Vec. (And the paper does talk about "invariants that are potentially critical for upholding Rust’s safety guarantees" (emphasis added), so I don't think their analysis is incorrect, but I still find the phrasing curious.)

Thus a huge win of this would be to avoid the (from the same paper)

After all, the compiler does not force developers to declare such functions as unsafe -- in contrast to other unsafe features.

Having the body of set_len do something the compiler recognizes as unsafe is a big help to people understanding the soundness of Vec, and in general to people adding new features inside existing unsafety privacy boundaries.

Especially combined with other accepted work like https://rust-lang.github.io/rfcs/2316-safe-unsafe-trait-methods.html we could start even doing things like clippy lints for "why is this unsafe when it doesn't do anything unsafe? Should one of the types involved be marked unsafe?"

If we don't have to link to tootsie pops as often because changing things that are relied on by unsafe code is itself unsafe, I'd consider that a big win.


As another way to look at this, it's weird that when I'm writing a method on by type with a safety invariant that I can do Self { a, b } and it's totally "safe", whereas if I call Self::new_unchecked(a, b) I need an unsafe block and tidy nags me to write a safety comment.

Tidy should nag about a safety comment for the constructor too, so that I'm not disincentivized to use the other, correctly-marked-unsafe function when writing things.

@burdges
Copy link

burdges commented Jul 14, 2023

We do not need unsafe per se when maintaining a safety invariant within its defining visibility boundary aka module:

"Because it relies on invariants of a struct field, this unsafe code does more than pollute a whole function: it pollutes a whole module. Generally, the only bullet-proof way to limit the scope of unsafe code is at the module boundary with privacy."

In other words, rust does not have unsafe types because you must enforce invariants at module boundaries anyways. Also various discussions in https://github.com/rust-lang/unsafe-code-guidelines clarify this point.

A method like Vec::set_len must be unsafe due to being public. An extern fn is unsafe because it points outside the module. etc.

Anyways..

If I understand, you want this type:

pub UnsafeInvariant<T>(T);
impl<T> UnsafeInvariant<T> {
    fn new(t: T) -> UnsafeInvariant<T> { UnsafeInvariant(t) }
    unsafe fn get(&self) -> &T { &self.0 }
    unsafe fn get_mut(&mut self) -> &mut T { &mut self.0 }
}

It's similar to UnsafeCell but propagates all auto-traits normally, based upon your comment above.

You still enforce invariants by visibility but UnsafeInvariant could provide the documentation for which you propose unsafe fields. In practice, I suspect you'd be better off like this:

    fn invariant_get(&self) -> &T { &self.0 }
    fn invariant_get_mut(&mut self) -> &mut T { &mut self.0 }

Why? All those unsafe blocks you'll write risk other mistakes, so ideally they should not exist if they merely maintain some invarant. Instead, you want a safe but distinctively named accessor method, which flags that you maintain the invariant.

Anecdotally, this type winds up being much more common:

struct HideMut<T>(T);
impl<T> Deref for HideMut<T> {
    type Target = T;
    fn deref(&self) -> &T { &self.0 }
}
impl<T> HideMut<T> {
    fn new(t: T) -> HideMut<T> { HideMut(t) }
    unsafe fn get_mut(&mut self) -> &mut T { &mut self.0 }
}

And UnsafeCell remains more common than both of course.

In fact, if you wrap the HideMut declaration inside some macro_rules! use_hide_mut then HideMut becomes module local, so the local module can access pub foo: HideMut<Foo> fields freely, but the outside world has only immutable access, even if given a &mut for the containing struct. This is really the common pattern.

We made this a local type for visibility modifiers, so a language level construct helps here. Also conversely, if you do not require visibility modifiers then simple types like UnsafeInvariant suffice, no language change necessary.

I suppose one might imagine pub(positive_visibility) unsafe(negative_visibility) mut(positive_visibility) field: type, except this still cannot capture when mutation becomes unsafe but reading remains safe. Yet, visibility control types like UnsafeInvariant and HideMut work fine.

```rust
fn change_unsafe_field(foo: &mut Foo) {
// Safety: An odd number plus two remains an odd number.
unsafe { foo.unsafe_field += 2; }
Copy link
Member

@scottmcm scottmcm Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: include a section talking about how both of the following are reasonable types:

/// This type has a *correctness* invariant that it holds an even number,
/// but it's not something on which `unsafe` code is allowed to rely.
pub struct Even(u32);

impl Even {
    pub fn new(x: u32) -> Option<Self>;
    pub fn new_i_promise(x: u32) -> Self;
    pub fn get_value(&self) -> u32;
}
/// # Safety
///
/// The value of this type must always be an even number.
/// (`unsafe` code is allowed to rely on that fact.)
pub unsafe struct Even(u32);

impl Even {
    pub fn new(x: u32) -> Option<Self>;
    pub unsafe fn new_unchecked(x: u32) -> Self;
    pub fn get_value(&self) -> u32;
}

and it's up to the type author to decide which is appropriate for the expected uses of the type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the RFC to include a discussion of when to use unsafe fields and when not to use unsafe fields.

@scottmcm
Copy link
Member

scottmcm commented Jul 14, 2023

EDIT: see below; it looks like the thing I was worried about here is probably impossible for other reasons.

I do thing that "safe to read; unsafe to modify" is the 99%+ case, and should certainly be the default, but

How could a field of a struct be unsafe to read?

Well, the field in AtomicPtr is unsafe to read, because it could be a race, for example. https://github.com/rust-lang/rust/blob/7a5814f922f85370e773f2001886b8f57002811c/library/core/src/sync/atomic.rs#L176

Or the value field of a ShardedLock in crossbeam https://docs.rs/crossbeam-utils/0.8.11/src/crossbeam_utils/sync/sharded_lock.rs.html#81

So perhaps some nuance for !Freeze could make sense? I'm not sure what the semver implications of that would be, though.

@Jules-Bertholet
Copy link
Contributor

Well, the field in AtomicPtr is unsafe to read, because it could be a race

I don't think that is correct? The unsafe operation is dereferencing the pointer returned by UnsafeCell::get(), accessing the field can't lead to UB on its own.

@scottmcm
Copy link
Member

scottmcm commented Jul 14, 2023

accessing the field can't lead to UB on its own

Ah, I guess an access can't actually read an UnsafeCell (without ownership) because it's never Copy.

So I think the PlaceMention is always ok for everything, and a read would be unsafe for an UnsafeCell, but you can't actually do a read of an UnsafeCell directly in Rust.

@jhpratt
Copy link
Member Author

jhpratt commented Jul 14, 2023

We do not need unsafe per se when maintaining a safety invariant within its defining visibility boundary aka module

rust does not have unsafe types because you must enforce invariants at module boundaries anyways

The nomicon describes current behavior; using it as an argument against this RFC is counter to the purpose of the RFC. It's circular reasoning at best.

if they merely maintain some invarant.

The invariants are "merely" there for soundness. If the invariant is violated, the result is undefined behavior. That's far more serious than you make it sound.

Yet, visibility control types like UnsafeInvariant and HideMut work fine.

I have never seen anyone write code like this in practice. The standard library and my own code (in time) is included in this. That is a significant argument in favor of something better.


@scottmcm I'll definitely include parts of that into the RFC. Also reading that blog post now — I'd never seen it before.

@burdges
Copy link

burdges commented Jul 14, 2023

If the invariant is violated, the result is undefined behavior.

That is a significant argument in favor of something better.

This RFC is not better because memory safety also helps when writing unsafe code.

The unsafe keyword is not simply a marker to tell you where to read more carefully. Its a marker of where safety rules must be violated.

In other words, we always convert regular code invariants into memory safety assurance, but these regular code invariants have exactly the same risks as other regular code, including their own memory safety concerns. This RFC confuses the memory safety consumed in maintaining the regular code invariant with the actually unsafe options the code requires.

In the past, unsafe fn bodies were unsafe blocks, but rust changed this to reduce the unsafe code surface area. This RFC is a mistake because it increases the unsafe code surface area with no benefits, given the same cautions can be maintained in other ways, like by variable naming, etc.

UnsafeInvariant not being used is evidence this feature is not required. UnsafeInvariant would make sense if you wanted to split the regular code invariant across distant visibility boundaries. In practice, unsafe fns always sufficed, or indeed proved more nuanced than UnsafeInvariant.

Anyways..

I think this discussion belongs in https://github.com/rust-lang/unsafe-code-guidelines where at least some people think formally about the unsafe code boundary.

@jhpratt
Copy link
Member Author

jhpratt commented Jul 14, 2023

In other words, we always convert regular code invariants into memory safety assurance, but these regular code invariants have exactly the same risks as other regular code, including their own memory safety concerns. This RFC confuses the memory safety consumed in maintaining the regular code invariant with the actually unsafe options the code requires.

This is your fundamental misunderstanding.

Other code in Vec relies on the invariants of Vec.len in ways that leads to undefined behavior if the invariants are broken. Fields like Vec.len are not "regular code invariants" — they are tightly coupled to whether the code is sound or not. It is a soundness invariant. You cannot possibly claim otherwise.

given the same cautions can be maintained in other ways, like by variable naming, etc.

Frankly, it's thinking like this that led to the creation of Rust. Thread safety can be maintained if everyone is super careful, but we all know how that works out. Likewise with a million other things. Programmers can not be relied upon to do the right thing. We have to force them to do it by leveraging the compiler wherever possible.

I think this discussion belongs in https://github.com/rust-lang/unsafe-code-guidelines where at least some people think formally about the unsafe code boundary.

I have no idea why you think the discussion belongs there, particularly as you're the one that initiated it here.

@burdges
Copy link

burdges commented Jul 14, 2023

Other code in Vec relies on the invariants of Vec.len in ways that leads to undefined behavior if the invariants are broken.

Yes, but this does not make altering Vec.len within the Vec module an unsafe operation. That's not how safety works.

What happens if I've unsafe code which relies upon an invariant between the values in a slice, so your unsafe field is a &[*mut Foo] or &[usize]? I want memory safety around the regular code invariants in how I maintain this slice. Yes, those regular code invariants control memory safety in how the module get used, hence privacy. It's clearly worse if I've many more unsafe blocks merely to access this slice, which now might intermix with some real unsafe code blocks for the *mut Foos or even violate slice invariants.

We often have this "catch your tail" phenomenon in formalalisms.

Frankly, it's thinking like this that led to the creation of Rust

No. There is a formal model from the rustbelt project about how unsafe works, which gets discussed in the unsafe code guidlines repo. We should only expand what falls under unsafe code if the formal model says so. This RFC confuses those really important formal models with mere "read this" markers.

In brief, you don't have a formal conception of when unsafe types should be used. It's unlikely one exists. That makes this change a regression towards the C days.

@jhpratt
Copy link
Member Author

jhpratt commented Jul 14, 2023

Yes, but this does not make altering Vec.len within the Vec module an unsafe operation. That's not how safety works.

It actually is. If something can result in undefined behavior, it is unsafe. Where that happens is wholly irrelevant. You're using circular logic regarding the UCG, which is by definition written about Rust as it currently is. You seem to be claiming that library undefined behavior doesn't exist, which is bizarre.

I'll leave it at that as there's clearly no convincing you that the mere concept of an unsafe field has merit. I won't be responding further to anything along those lines.

@tgross35
Copy link
Contributor

Could you add another motivating example to the RFC? It seems like the vec.len example isn't super convincing because changing anything in Vec/RawVec breaks the model down and is "unsafe", by definition: that is protected via the private/public module interface. This makes me think that the resolution would be something like unsafe modules or #[unsafe] types rather than fine grained per-field control, so I'm curious to see what motivates this specific solution.

@jhpratt
Copy link
Member Author

jhpratt commented Jul 15, 2023

@tgross35 I linked this earlier in the thread. Is that sufficient for you? It demonstrates that field-level safety is appropriate, as some of the fields are perfectly safe to set as they have no interaction with any other field. Yet at the same time, all fields are one logical unit that should not be split into separate structs.

@tgross35
Copy link
Contributor

tgross35 commented Jul 15, 2023

Thank you for the link, I meant specifically to add an example to the RFC document (which perhaps you planned to do anyway).

Even with that example, it does seem to me that it would be more correct to nest flags into a separate type with the other items it has the invariant with. To me, it illustrates the concept better; a field alone is never unsafe but rather it is unsafe in the context of other fields, and it seems like this is a suitable level for abstraction to a separate type. Also:

  • If a struct of 12 fields has 8 marked unsafe, it isn't immediately clear how they may be related. Are they all tied together via an invariant? Are 6 tied together and the remaining 2 associated somehow? This becomes immediately clear by separating things that are related into separate types, and I think that behavior should be encouraged
  • For the example Parsed struct, if sufficiently many fields are related to flags then it seems like it would make sense to mark the entire struct #[upholds_unsafe_invariant] or something like that. Maybe a few fields aren't related to the invariant, but commenting // SAFETY: not related to the invariant is easy enough to make this clear (and forces you to make the change if they do become related to the invariant for some reason)

Niche optimization was mentioned as well as a reason for not wanting to split off types, but I think this is more applicable to all separations of logic, and not just those with safety concerns. I don't wish to derail this conversation, but the thought has crossed my mind before about how Rust could potentially use a #[flatten] attribute for nested struct fields that tells the compiler to merge the child struct's fields into the parent's and rearrange for best possible size. (yes - possible method duplication, but this would be meant for one-off structs).

I think that investigating something like that may be more widely useful than using niche optimization as a justification for why a single flat struct is the correct solution. (edit: I brought this up for some discussion on Zulip https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/.60.23.5Bflatten.5D.60.20struct.20field.20attribute)

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Mar 4, 2025
…ethercote

Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: rust-lang#132922
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Mar 4, 2025
…ethercote

Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: rust-lang#132922
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Mar 5, 2025
…ethercote

Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: rust-lang#132922
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Mar 5, 2025
…ethercote

Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: rust-lang#132922
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Mar 5, 2025
…ethercote

Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: rust-lang#132922
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Mar 5, 2025
…ethercote

Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: rust-lang#132922
compiler-errors added a commit to compiler-errors/rust that referenced this pull request Mar 6, 2025
…ethercote

Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: rust-lang#132922
Noratrieb added a commit to Noratrieb/rust that referenced this pull request Mar 6, 2025
…ethercote

Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: rust-lang#132922
compiler-errors added a commit to compiler-errors/rust that referenced this pull request Mar 6, 2025
…ethercote

Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: rust-lang#132922
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 7, 2025
Rollup merge of rust-lang#137808 - jswrenn:droppy-unsafe-fields, r=nnethercote

Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: rust-lang#132922
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Mar 7, 2025
Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: #132922
lnicola pushed a commit to lnicola/rust-analyzer that referenced this pull request Mar 10, 2025
Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: #132922
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this pull request Mar 14, 2025
Instead, we adopt the position that introducing an `unsafe` field
itself carries a safety invariant: that if you assign an invariant
to that field weaker than what the field's destructor requires,
you must ensure that field is in a droppable state in your
destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this pull request Mar 14, 2025
…ethercote

Do not require that unsafe fields lack drop glue

Instead, we adopt the position that introducing an `unsafe` field itself carries a safety invariant: that if you assign an invariant to that field weaker than what the field's destructor requires, you must ensure that field is in a droppable state in your destructor.

See:
- rust-lang/rfcs#3458 (comment)
- https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/502113897

Tracking Issue: rust-lang#132922
prescribe its terminology. Documentation of the unsafe fields tooling should reflect broader
consensus, once that consensus is reached.

# Future possibilities
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this RFC is written, and as unstably implemented, unsafe fields are unsafe to read or take shared references to. However, it seems to me that there are a lot of use cases where what is potentially unsound is only mutating or moving out of the field:

  • Vec.len can be safely read (not that anyone is proposing making it a public unsafe field, as already discussed).
  • I looked for places to apply unsafe fields in my code, and the two cases I found have these shapes:
    • in struct Foo { a: A, b: B }, a must be dropped before b. Thus, both fields must not have &mut or move-out, but & is fine.
    • Subsetting wrapper types, where the only restriction is that the field never contains a value not meeting the restriction.

I scanned the discussion but did not see any mention of supporting an unsafe-to-modify-only variety. Is that something that could be a future possibility? Or has it been considered and rejected? (One could argue that it is pointless because, in such cases, one can always provide a fn field(&self) -> &FieldType accessor — but that prevents borrow splitting with &mut access to other fields, so it’s not quite a complete solution.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I argue the same thing here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this RFC is written, and as unstably implemented, unsafe fields are unsafe to read or take shared references to.

That is true. Since this RFC is the first major extension of field safety tooling since 1.0, we take a syntactically and semantically conservative approach: The keyword for denoting an invariant is simply unsafe, and the rule is simply that the compiler should require unsafe for any use that could violate an invariant.

However, it seems to me that there are a lot of use cases where what is potentially unsound is only mutating or moving out of the field

It is true that there are a lot of use-cases in which mutation is specifically problematic, but there are also a lot of use-cases for which other uses are problematic, too. Lately, I've been reviewing more of the latter than the former!

In any case, this RFC is aimed at folks who want to be very diligent about their use of fields that carry invariants. I've found that it's no so hard to write "// SAFETY: Reading this field cannot violate its invariant." once to define a safe getter, and I've appreciated the prompt to think critically about whether what I'm doing really is safe.

I scanned the discussion but did not see any mention of supporting an unsafe-to-modify-only variety.

Whoops, I thought I had written about this somewhere. An unsafe-to-modify-only variety is difficult to support well due to the possibility of interior mutation.

Is that something that could be a future possibility?

Perhaps! If Rust has more information about the kind of invariant the field has, its safety analysis can be more refined and tailored to the restrictions imposed by that invariant. We might also pair additional syntactic information with facts known about the type, like whether or not it contains UnsafeCell.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see also: #3458 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've found that it's no so hard to write "// SAFETY: Reading this field cannot violate its invariant." once to define a safe getter, and I've appreciated the prompt to think critically about whether what I'm doing really is safe.

That (and the thing about interior mutability being a hazard) makes sense to me. However, getters impede borrow splitting (unless and until we get some kind of subsetting type for the &self parameter) so making reading unsafe makes some programs harder to express (while others, that need to be wary of interior mutability, easier), so I think it would be good if that were mentioned explicitly as a tradeoff.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@moulins It's impossible to determine whether or not a field can be written through an & reference without examining its API. Whether or not the type is Freeze isn't enough. For example, &Mutex<T> is both Freeze and writable through an &. Even if Freeze were enough, the SemVer particulars of Freeze are still up for discussion. It's a complex extension with many considerations.

Copy link
Contributor

@Jules-Bertholet Jules-Bertholet Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, because of the moving-out problem, the wrapper struct solution can enable better scoped unsafe annotations than native unsafe-to-read fields even when the value is actually unsafe to read.

Compare these two examples (for simplicity, we assume that as_bytes and drop on a non-UTF8 String are guaranteed safe):

// With unsafe-to-read fields

mod string_stuff {
    /// Safe to call even if `input` is not valid UTF-8
    pub fn process_sketchy_string(input: String) {
        if core::str::from_utf8(input.as_bytes()).is_ok() {
            dbg!(input);
        } else {
            println!("not valid utf8");
        }
    }
}

struct Foo {
    unsafe maybe_invalid_utf8: String,
}

fn do_thing(foo: Foo) {
    // SAFETY: we immediately pass to `process_sketchy_string()`,
    // which is documented safe to call even when `s` is invalid UTF-8
    let s = unsafe { foo.maybe_invalid_utf8 };
    process_sketchy_string(s);
}

fn just_drop_the_string(foo: Foo) {
    // SAFETY: we immediately drop the `String` without reading its contents
    let s = unsafe { foo.maybe_invalid_utf8 };
    drop(s);
}
// With unsafe-to-write fields + wrapper struct

mod string_stuff {
    pub fn process_sketchy_string(input: UnsafeReadWrapper<String>) {
        let string: Option<String> = {
            // SAFETY: we immediately check for UTF-8 validity
            // before doing anything else with the string
            let input = unsafe { input.into_inner() };
            core::str::from_utf8(input.as_bytes())
                .is_ok()
                .then_some(input)
        };
        if let Some(string) = string {
            dbg!(string);
        } else {
            println!("not valid utf8");
        }
    }
}

struct Foo {
    unsafe maybe_invalid_utf8: UnsafeReadWrapper<String>,
}

fn do_thing(foo: Foo) {
    let s = foo.maybe_invalid_utf8;
    process_sketchy_string(s);
}

fn just_drop_the_string(foo: Foo) {
    let s = foo.maybe_invalid_utf8;
    drop(s);
}

With the wrapper type solution, unsafe annotations are required inside process_sketchy_string(), where the dangerous stuff is actually happening. do_thing() and just_drop_the_string() do not need to use unsafe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see also: #3458 (comment)

Would be good if the RFC would contain this example, since the question is bound to come up again :)

With the wrapper type solution, unsafe annotations are required inside process_sketchy_string(), where the dangerous stuff is actually happening. do_thing() and just_drop_the_string() do not need to use unsafe.

This only works if you have a type that exactly matches the invariant you care for. Presumably UnsafeReadWrapper would just say "could be anything, just satisfies the language invariant"; therefore, any case where the actual invariant on the unsafe field is stronger than that should require unsafe in the caller to ensure that the invariant upheld by the field implies the precondition of the function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RFC now documents the alternatives of making moving safe, and copying safe: cf0516b

Copy link
Contributor

@Jules-Bertholet Jules-Bertholet Apr 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only works if you have a type that exactly matches the invariant you care for. Presumably UnsafeReadWrapper would just say "could be anything, just satisfies the language invariant"; therefore, any case where the actual invariant on the unsafe field is stronger than that should require unsafe in the caller to ensure that the invariant upheld by the field implies the precondition of the function.

Even if the call to process_sketchy_string() needs to be unsafe because UnsafeReadWrapper’s requirements are too weak, it’s still a win over the unsafe-to-read example.

pub struct KeepAlive<T> {
/// # Safety
///
/// `T` may not be accessed (read or written) via this `Arc`.
Copy link
Member

@RalfJung RalfJung Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a rather odd example -- why would I even have this field then? It's also an example of a "subtractive" invariant.

There are more convincing examples involving less artificial, and "additive", invariants. For instance,

    /// Must contain an even number.
  unsafe r: &'static Cell<i32>,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your example is appropriate for the next heading ("Field Copies are Safe"), but not this one, since &T is unconditionally Copy. The example here should be one where moving the field (and invalidating the original memory) should not be safe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of "move" and "copy" as the same thing for a Copy type, but I take your point.

Still, I think the example here is not convincing at all. If that's the best we got, we should just allow moves of unsafe fields instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disagree. We should keep unsafe moves for three reasons:

  • Over and over again, we've thought "can't we just make X safe" and the answer has been "no". I am not confident that this time is different; that this is "the best we got".
  • There are cases, such as the example here, where making moves unsafe really does prevent programmer mistakes earlier than if we made moves safe. Isn't that the point of this RFC?
  • There is pedagogic value in having a simple rule — "uses of unsafe fields are unsafe" — than having one with niche carve-outs, particularly as odd as "moves are safe, but only if they invalidate the original memory".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A truly destructive move leaves the field inaccessible. The field has to be re-assigned a new value before the entire outer type is even considered fully initialized again. That re-assignment is unsafe, and that unsafety is where you have to check the invariant. So I am quite confident that there's no way a move can cause issues.

Having a field that nobody is allowed to read or write is not the purpose of this RFC and is not covered by the motivation stated for this RFC. What bug would be prevented by this? When would one even have such a field?

Copy link
Contributor

@Jules-Bertholet Jules-Bertholet Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are 4 examples demonstrating all the nobs I mentioned in my earlier comment (#3458 (comment)).

Example 1: UnalignedRef

pub struct UnalignedRef<'a, T> {
    /// # Safety
    ///
    /// `ptr` is a shared reference to a valid-but-unaligned instance of `T`.
    unsafe(mut) ptr: *const T,
    _lifetime: PhantomData<&'a T>,
}

It should be unsafe to get &mut to UnalignedRef.ptr, because you could use that &mut to swap the pointer with a different one that doesn’t point to a valid T.

However, taking & to the field is fine, you can’t do anything dangerous with that in safe code.

Also, if you copy out the value of ptr, there is nothing dangerous you can do in safe code with that value, even if you take an &mut to it. The safety invariant concerns the field, not the value.

Therefore, we mark the field as unsafe to mutate.

Example 2: NonZeroCell

pub struct NonZeroCell {
    /// # Safety
    ///
    /// Contents never equal zero.
    unsafe cell: Cell<u32>,
}

It should be unsafe to get &mut to UnalignedRef.cell, because you could use that &mut to set the field equal to zero.

In addition, & also should be unsafe, because you could set the cell to zero with that as well.

However, moving the Cell out of the struct is fine (as long as you don't move a different Cell back in to replace it). You can set that moved-out Cell to 0 all you like, no danger will result. Again, the safety invariant concerns the field, not the value.

Therefore, we mark the field as unsafe to mutate or get a shared reference to.

Example 3: MaybeInvalidString

struct MaybeInvalidString {
    /// # Safety
    ///
    /// May contain invalid UTF-8.
    /// But must still point to a valid allocation
    /// with `length` bytes fully initialized.
    unsafe(mut) maybe_invalid: Unsafe<String>,
}

Any reference to the inner String is unsafe, as an API could assume that its invariants are satisfied. Taking the String out of its MaybeInvalidString wrapper will not mitigate this danger. As there is a safety danger regarding the value itself, we wrap it in Unsafe.

However, an arbitrary Unsafe<String> value has no library safety invariants at all, but we still assume that maybe_invalid points to a valid allocation and has length bytes fully initialized. Therefore, we can’t let people swap in arbitrary Unsafe<String>s into the field willy-nilly. For this reason, we mark the field as unsafe to mutate also.

Example 4: UniqueArc

/// An `Arc` that provides exclusive access to its referent.
///
/// A `UniqueArc` may have any number of `KeepAlive` handles which ensure that
/// the inner value is not dropped. These handles only control dropping, and do
/// not provide read or write access to the value.
pub struct UniqueArc<T: 'static> {
    /// # Safety
    ///
    /// While this `UniqueArc` exists, the value pointed by this `arc` may not
    /// be accessed (read or written) other than via this `UniqueArc`.
    unsafe arc: Arc<UnsafeCell<T>>,
}

/// Keeps the parent [`UniqueArc`] alive without providing read or write access
/// to its value.
pub struct KeepAlive<T> {
    /// # Safety
    ///
    /// `T` may not be accessed (read or written) via this `Arc`.
    #[expect(unused)]
    arc: Unsafe<Arc<UnsafeCell<T>>>,
}

UniqueArc.arc

Any sort of reference to this field is dangerous, because it could be used to clone the Arc, and then use the clone to access the inner value, violating the field’s invariant. Therefore, we mark the field as unsafe to mutate or get a shared reference to.

However, if you move the Arc out of the field, you destroy the containing UniqueArc, so its safety invariants are no longer relevant. The safety invariant concerns the field, not the value.

KeepAlive.arc

Any access to the inner Arc inside this KeepAlive is dangerous, because it could be used to access the inner T. However, because the arc field is entirely unused, there are no safety restrictions on its contents at all. Therefore, we mark the value unsafe (by wrapping it in Unsafe), but not the field.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the examples. It might also be helpful to see what some use-sites of these fields looks like. Would you agree that these are the drawbacks of the approach:

  • the onus is on the user to correctly deduce what combination of unsafe/unsafe(mut)/Unsafe they need, and the cost of getting this wrong can be that invariants are violatable in unsafe code
  • the safety proofs required to call Unsafe's constructor and accessors depends on program dataflow

Copy link
Contributor

@Jules-Bertholet Jules-Bertholet Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the onus is on the user to correctly deduce what combination of unsafe/unsafe(mut)/Unsafe they need, and the cost of getting this wrong can be that invariants are violatable in unsafe code

On the other hand, by forcing you to think deeply about exactly what your safety invariants are, this choice could also help you not violate them. And, as I have said before, false-positive unsafe also has a cost. Taken to the extreme: if everything is unsafe, then nothing is.

the safety proofs required to call Unsafe's constructor and accessors depends on program dataflow

Yes for the accessors, but this is also true of the unsafe-to-move-out-of fields in the current iteration of this RFC. No for the constructor, Unsafe’s constructor is safe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might also be helpful to see what some use-sites of these fields looks like.

Example 1: UnalignedRef

pub struct UnalignedRef<'a, T> {
    /// # Safety
    ///
    /// `ptr` is a shared reference to a valid-but-unaligned instance of `T`.
    unsafe(mut) ptr: *const T,
    _lifetime: PhantomData<&'a T>,
}

// Usage:

fn main() {
    let array: [u8; 4] = [0; 4];
    let unaligned_u32: *const u32 = (&raw const array).cast();

    // SAFETY: `unaligned_u32` points to 4 initialized bytes
    let mut unaligned_ref = unsafe {
        UnalignedRef {
            ptr: unaligned_u32,
            _lifetime: PhantomData,
        }
    };

    let _ref = &unaligned_ref.ptr;
    // SAFETY: `_mut_ref` is entirely unused
    let _mut_ref = unsafe { &mut unaligned_ref.ptr };

    let _ptr = &raw const unaligned_ref.ptr;
    let _mut_ptr = &raw mut unaligned_ref.ptr;

    let copy_out = unaligned_ref.ptr;

    // SAFETY: doesn't modify the value at all
    unsafe {
        unaligned_ref.ptr = copy_out;
    }
}

Example 2: NonZeroCell

pub struct NonZeroCell {
    /// # Safety
    ///
    /// Contents never equal zero.
    unsafe cell: Cell<u32>,
}

// Usage:

fn main() {
    // SAFETY: `cell` is not 0
    let mut nonzero_cell = unsafe {
        NonZeroCell {
            cell: Cell::new(42),
        }
    };

    // SAFETY: `_ref` is entirely unused
    let _ref =  unsafe {  &nonzero_cell.cell };
    // SAFETY: `_mut_ref` is entirely unused
    let _mut_ref = unsafe {  &mut nonzero_cell.cell };

    let _ptr = &raw const nonzero_cell.cell;
    let _mut_ptr = &raw mut nonzero_cell.cell;

    let move_out = nonzero_cell.cell;

    // SAFETY: doesn't modify the value at all
    unsafe {
        nonzero_cell.cell = move_out;
    }
}

Definition of Unsafe

/// A wrapper for a `T` that may have broken safety invariants.
///
/// Validity invariants must still be upheld.
/// Additionally, the destructor is not supressed,
/// so the minimal invariants required by the destructor
/// must still be upheld. (Use `Unsafe<ManuallyDrop<...>>`
/// if you can't provide this.)
#[repr(transparent)]
#[derive(Default)]
pub struct Unsafe<T: ?Sized>(T);

impl<T> Unsafe<T> {
    pub fn new(val: T) -> Self {
        Self(val)
    }

    pub fn set(&mut self, val: T) {
        self.0 = val;
    }

    /// # Safety
    ///
    /// `T` may have broken safety invariants
    pub unsafe fn into_inner(self) -> T {
        self.0
    }
}

impl<T: ?Sized> Unsafe<T> {
    /// # Safety
    ///
    /// `T` may have broken safety invariants
    pub unsafe fn get(&self) -> &T {
        &self.0
    }

    /// # Safety
    ///
    /// `T` may have broken safety invariants
    pub unsafe fn get_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

impl<T> Clone for Unsafe<T>
where
    T: Copy,
{
    fn clone(&self) -> Self {
        *self
    }
}

impl<T> Copy for Unsafe<T> where T: Copy {}

Example 3: MaybeInvalidString

struct MaybeInvalidString {
    /// # Safety
    ///
    /// May contain invalid UTF-8.
    /// But must still point to a valid allocation
    /// with `length` bytes fully initialized.
    unsafe(mut) maybe_invalid: Unsafe<String>,
}

/// Returns a `String` that is invalid UTF-8.
fn make_an_invalid_string() -> Unsafe<String> {
    let mut invalid = Unsafe::new("hello".to_owned());
    // SAFETY: we turn the String into invalid UTF-8,
    // but it's wrapped in an `Unsafe` so this is fine
    unsafe {
        let bytes_mut = invalid.get_mut().as_mut_str().as_bytes_mut();
        bytes_mut[0] = 0xFF;
        bytes_mut[1] = 0xFF;
    }
    invalid
}

fn main() {
    // SAFETY: `maybe_invalid` points to a valid allocation
    // and is fully initialized up to its length
    // (though it is not valid UTF-8
    let mut string = unsafe {
        MaybeInvalidString {
            maybe_invalid: make_an_invalid_string(),
        }
    };

    let ref_ = &string.maybe_invalid;
    // SAFETY: `_inner_ref` is entirely unused
    let _inner_ref = unsafe {  ref_.get() };
    // SAFETY: `mut_ref` is not used to swap out the value
    let mut_ref = unsafe {  &mut string.maybe_invalid };
    // SAFETY: `_inner_mut_ref` is entirely unused
    let _inner_mut_ref = unsafe {  mut_ref.get_mut() };

    let _ptr = &raw const string.maybe_invalid;
    let _mut_ptr = &raw mut string.maybe_invalid;

    let move_out = string.maybe_invalid;

    // SAFETY: doesn't modify the value at all
    unsafe {
        string.maybe_invalid = move_out;
    }

    let move_out_again = string.maybe_invalid;
    // SAFETY: `_inner_ref` is entirely unused
    let _inner_ref = unsafe {  move_out_again.get() }; 
}

Example 4: UniqueArc

/// An `Arc` that provides exclusive access to its referent.
///
/// A `UniqueArc` may have any number of `KeepAlive` handles which ensure that
/// the inner value is not dropped. These handles only control dropping, and do
/// not provide read or write access to the value.
pub struct UniqueArc<T: 'static> {
    /// # Safety
    ///
    /// While this `UniqueArc` exists, the value pointed by this `arc` may not
    /// be accessed (read or written) other than via this `UniqueArc`.
    unsafe arc: Arc<UnsafeCell<T>>,
}

/// Keeps the parent [`UniqueArc`] alive without providing read or write access
/// to its value.
pub struct KeepAlive<T> {
    /// # Safety
    ///
    /// `T` may not be accessed (read or written) via this `Arc`.
    #[expect(unused)]
    arc: Unsafe<Arc<UnsafeCell<T>>>,
}


fn main() {
    // SAFETY: The `Arc` is newly created,
    // so there are no other clones of it around
    let unique_arc: UniqueArc<i32> = unsafe {
        UniqueArc {
            arc: Arc::new(UnsafeCell::new(42)),
        }
    };

    let keep_alive = KeepAlive {
        // SAFETY: we immediately put the cloned `Arc` in a `KeepAlive`,
        // where it will be inaccessible
        arc: Unsafe::new(unsafe { unique_arc.arc.clone() }),
    };

    let keep_alive_arc = keep_alive.arc;

    // By moving out of `arc`, we destroy the `UniqueArc`¸
    // and are no longer bound by its uniqueness requirement
    let no_longer_unique_arc = unique_arc.arc;
    let cloned_arc = no_longer_unique_arc.clone();

    drop(keep_alive_arc);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to think about it:

  • An unsafe(mut) field strengthens the requirements on the field contents. (Not every value of the type that is normally safe to get an &mut to, is allowed.)
  • An unsafe field strengthens them further. (Not every value of the type that is normally safe to get an & to, is allowed.)
  • An Unsafe<...> wrapper type weakens the requirements compared to the baseline for the type. (Some values are accepted even if it is not normally safe to get an & to them.)
  • unsafe(mut) Unsafe<...> (combining unsafe fields and wrapper type) can weaken, strengthen, or both at once.

Comment on lines 429 to 430
/// So long as `T` is owned by `UniqueArc`, `T` may not be accessed (read or
/// written) other than via this `UniqueArc`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is T owned by UniqueArc? Without defining that, this invariant is insufficiently defined and I don't know how to make sense of it.

Copy link
Contributor

@joshlf joshlf Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this safety comment could be slightly reworded. IIUC the intention is that so long as a u: UniqueArc instance exists, then u.arc's referent may not be accessed other than via u. Once u has been dropped, then it's possible that the last Arc to be dropped will be inside of a KeepAlive, at which point that KeepAlive will destruct the inner T (which is an access of T). That's why the "so long as u exists" condition is necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be correct to rephrase this as "While the UniqueArc exists, the value pointed by this Arc must not be accessed (read or written) through any other pointer"? Given the API, I imagine this could also just be strengthened to "this must be the only Arc pointing to that location".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be correct to rephrase this as "While the UniqueArc exists, the value pointed by this Arc must not be accessed (read or written) through any other pointer"?

Yes, I think that's correct. I've made that edit: 322c4a8

Given the API, I imagine this could also just be strengthened to "this must be the only Arc pointing to that location".

Not quite, since KeepAlive also holds an Arc to the same location.


// SAFETY: `UniqueArc<T>` has the same aliasing and synchronization properties
// as `&mut T`, and so it is `Sync` if `&mut T` is `Sync`.
unsafe impl<T> Sync for UniqueArc<T> where &'static mut T: Sync {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this unnecessarily restricts T: 'static due to the &'static mut T

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. a5cc19f


// SAFETY: `UniqueArc<T>` has the same aliasing and synchronization properties
// as `&mut T`, and so it is `Sync` if `&mut T` is `Sync`.
unsafe impl<T> Sync for UniqueArc<T> where for<'a> &'a mut T: Sync {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, thinking about it a bit more, being Sync is enough to send a &UniqueArc to another thread, which allows you to create a new KeepAlive on the other thread, which can then outlive the original UniqueArc, dropping the T on the other thread, so that unsafe impl Sync is unsound. you need T: Send + Sync. however, you can unsafe impl<T: Send> Send for UniqueArc<T> afaict

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just remove all the explicit Sync/Send impls and rely on the automatic impls from Arc for simplicity?

Comment on lines +959 to +961
breaking existing code. Over an edition boundary, safe reads of `unsafe` fields could be permitted
by rewriting existing `unsafe` fields to wrap their field type in a tooling-aware `Unsafe` wrapper
type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not convinced that changing the type of the field over an edition is viable. (However, I agree that safe-to-& unsafe fields is something that can be left for later.)

Comment on lines +871 to +876
- `unsafe(init,&mut,&,read)` (everything is unsafe)
- `unsafe(init,&mut,&)` (everything except reading unsafe)
- `unsafe(init,&mut)` (everything except reading and `&`-referencing unsafe)
- etc.

Besides the unclear semantics of an unparameterized `unsafe()`, this design has the disadvantage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I conceptualized it in my proposal at #3458 (comment): the argument to unsafe denotes the weakest reference type that is unsafe to obtain. So unsafe() makes & and &mut references unsafe, while unsafe(mut) only makes &mut unsafe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Rust ever gets partial borrows, the syntax for that could be adopted to extend this feature as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-lang-radar Items that are on lang's radar and will need eventual work or consideration. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unsafe fields