Skip to content

Simplify lightweight clones, including into closures and async blocks #3680

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 36 commits into
base: master
Choose a base branch
from
Open
Changes from 7 commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
cbc1941
Simplify lightweight clones, including closures and async blocks
joshtriplett Jul 28, 2024
396cc14
Add RFC number for cross-reference to RFC 3678
joshtriplett Aug 20, 2024
07679f1
Clarify wording
joshtriplett Aug 20, 2024
f3b9668
Add future possibility of `Use` on small arrays
joshtriplett Aug 20, 2024
8ad5862
Expand on evaluation of use cases in future work
joshtriplett Aug 20, 2024
6fa409a
Record `ToOwned` possibility as an unresolved question
joshtriplett Aug 20, 2024
fd42e86
Suggest a lint for expensive `Use` implementations
joshtriplett Aug 20, 2024
64ed357
RFC 3680
joshtriplett Aug 20, 2024
7146131
Add RFC number to RFC PR link
joshtriplett Aug 20, 2024
5b85390
Add alternative for calling `Clone::clone` directly
joshtriplett Aug 20, 2024
5bae2bf
Use `Clone::clone` directly; eliminate `Use::do_use`
joshtriplett Aug 20, 2024
969836b
Add an alternative for general `Clone::clone` elision
joshtriplett Aug 20, 2024
823d892
Add alternative/rationale explaining why automatic clone doesn't suffice
joshtriplett Aug 20, 2024
7d41d87
Add the possibility of using a method while still having elision
joshtriplett Aug 20, 2024
de930d8
Alternative: constrain the elision rules and disallow future optimiza…
joshtriplett Aug 20, 2024
da366a0
Fix typo (missing backquote)
joshtriplett Aug 20, 2024
7f2da74
Remove fallback to borrowing, and add an unresolved question
joshtriplett Aug 20, 2024
4cc8920
Add alternative for adding this to bindings rather than types
joshtriplett Aug 20, 2024
dc8745b
Add possibility of direct `Copy` (per nikomatsakis)
joshtriplett Aug 21, 2024
122cf55
Add possibility of closures/blocks moving things if dead
joshtriplett Aug 21, 2024
ab000b9
Rework motivation to make the RFC more neutral towards future RFCs
joshtriplett Aug 21, 2024
7825d31
Future possibility: lints to help catch large array copies
joshtriplett Aug 21, 2024
3ef0e56
Add more general explorations of Copy/Use
joshtriplett Aug 21, 2024
ca82408
Aligns with other proposals for precise capture syntax
joshtriplett Aug 21, 2024
10671f4
Closures/blocks move if statically dead
joshtriplett Aug 21, 2024
80745d6
Clarify "statically dead" to "statically dead after the closure/block"
joshtriplett Aug 21, 2024
de7d33b
Superseding `move` closures?
joshtriplett Aug 21, 2024
a130f07
Unresolved question: `#[marker]` trait?
joshtriplett Aug 21, 2024
1fc1d6a
Add alternative of floating `x.use` out of closures
joshtriplett Aug 21, 2024
b7b8474
Unresolved questions: remove a resolved question
joshtriplett Aug 21, 2024
84a5614
Add unresolved question for clone elision
joshtriplett Aug 21, 2024
16d8f54
Expand mention of `Copy` handling in alternatives
joshtriplett Aug 21, 2024
d9faf4c
Clarify an elision case
joshtriplett Aug 21, 2024
856093b
We may want a derive(Use)
joshtriplett Aug 21, 2024
9d5166d
Future work: Smarter suggestions for missing bounds
joshtriplett Aug 21, 2024
041a840
Mention alternate keywords
joshtriplett Aug 21, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
379 changes: 379 additions & 0 deletions text/0000-use.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,379 @@
- Feature Name: `use`
- Start Date: 2024-07-20
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

Provide a feature to simplify performing lightweight clones (such as of
`Arc`/`Rc`), particularly cloning them into closures or async blocks, while
still keeping such cloning visible and explicit.

# Motivation
[motivation]: #motivation

A very common source of friction in asynchronous or multithreaded Rust
programming is having to clone various `Arc<T>` reference-counted objects into
an async block or task. This is particularly common when spawning a closure as
a thread, or spawning an async block as a task. Common patterns for doing so
include:

```rust
// Use new names throughout the block
let new_x = x.clone();
let new_y = y.clone();
spawn(async move {
func1(new_x).await;
func2(new_y).await;
});

// Introduce a scope to perform the clones in
{
let x = x.clone();
let y = y.clone();
spawn(async move {
func1(x).await;
func2(y).await;
});
}

// Introduce a scope to perform the clones in, inside the call
spawn({
let x = x.clone();
let y = y.clone();
async move {
func1(x).await;
func2(y).await;
}
});
```

All of these patterns introduce noise every time the program wants to spawn a
thread or task, or otherwise clone an object into a closure or async block.
Feedback on Rust regularly brings up this friction, seeking a simpler solution.

In addition, Rust developers trying to avoid heavyweight clones will sometimes
suggest eschewing invocations of `obj.clone()` in favor of writing
`Arc::clone(&obj)` explicitly, to mark the call explicitly as a lightweight
clone, at the cost of syntactic salt. This RFC proposes a syntax that can
*only* make a lightweight clone, while still using a simple postfix syntax.

In some cases, people ask for fully *automatic* cloning, requiring no visible
indication at the point of the clone. However, Rust has long attempted to keep
user-provided code visible, such as by not providing copy constructors. Rust
users regularly provide feedback on this point as well, asking for clones to
not become implicit, or otherwise confirming that they appreciate the absence
of copy constructors.

This RFC proposes solutions to *minimize* the syntactic weight of
lightweight-cloning objects, particularly cloning objects into a closure or
async block, while still keeping an indication of this operation.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

When working with objects that support lightweight cloning, such as `Rc` or
`Arc`, you can get an additional clone (a new "use") of the object by invoking
`.use`:

```rust
let obj: Arc<LargeComplexObject> = new_large_complex_object();
some_function(obj.use); // Pass a separate use of the object to `some_function`
obj.method(); // The object is still owned afterwards
```

If you want to create a closure or async block that captures new uses of such
objects, you can put the `use` keyword on the closure or async block, similar
to the `move` keyword.

```rust
let obj: Arc<LargeComplexObject> = new_large_complex_object();
let map: Arc<AnotherObject> = new_mapping();
std::thread::spawn(use || map.insert(42, func(obj)));
task::spawn(async use { op(map, obj).await });
another_func(obj, map);
```

(Note that `use` and `move` are mutually exclusive.)

`.use` supports chaining, so in particular it works when calling a method that
would otherwise consume `self`:

```rust
obj.use.consume();
obj.method();
```

Calling `x.use` requires that the type of `x` implement the `Use` trait. This
trait identifies types whose clone implementation is lightweight, such as
reference-counted types.

Various types in the standard library implement `Use`. You can implement this
trait for your own types, if they meet the requirements for being lightweight
to clone. (See the [reference-level explanation][reference-level-explanation]
for the requirements.)

```rust
impl Use for MyType {}
```

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

## The `Use` trait

```rust
/// Trait for objects whose clone impl is lightweight (e.g. reference-counted)
///
/// Cloning an object implementing this trait should in general:
/// - be O(1) (constant) time regardless of the amount of data managed by the object,
/// - not require a memory allocation,
/// - not require copying more than roughly 64 bytes (a typical cache line size),
/// - not block,
/// - not have any semantic side effects (e.g. allocating a file descriptor), and
/// - not have overhead larger than a couple of atomic operations.
trait Use: Clone {
/// Add a use of the object, and return the new object
///
/// Note that implementers of the trait cannot override this method; it
/// will always call `Clone::clone`.
pub impl(crate) do_use(&self) -> Self {
Clone::clone(self)
}
}
```

Note that while this uses the `impl(crate)` syntax proposed in RFC 3678, this
is not a dependency for stabilization; we can use any mechanism that prevents
implementation of `Use::do_use()` in impl blocks outside the standard library.

This trait should be implemented for anything in the standard library that
meets these criteria. Some notable types in the standard library that implement
`Use`:

- `std::sync::Arc`
- `std::sync::Weak`
- `std::rc::Rc`
- `std::rc::Weak`
- `std::sync::mpsc::Sender` and `std::sync::mpsc::SyncSender`
- Tuples of types whose components all implement `Use`
- `Option<T>` where `T: Use`
- `Result<T, E>` where `T: Use` and `E: Use`

Some notable types that implement `Clone` but should *not* implement `Use`:
arrays, `String`, `Box`, `Vec`, `HashMap`, and `BTreeMap`.

We may want to add a clippy or rustc lint (e.g. `expensive_use`) for
implementations of `Use` on an excessively large type, or a type whose `clone`
implementation seems to be obviously breaking the intended constraints on the
`Use` trait. Such a lint would be best-effort only, and could always be marked
as `allow` by a crate, but could help to discourage such implementations.

We may want to add a clippy or rustc lint for calls to `.clone()` that could
use `.use` instead. This would help the remaining calls to `.clone()` stand out
as "expensive" clones. (Such a lint would need to take MSRV into account before
making such a suggestion; clippy already has such a mechanism and rustc may
gain one in the future.)

## The implementation and optimization of `.use`

An expression `x.use`, where `x` has type `T`, requires that `T: Use`. However,
`x.use` does not always invoke `Use::do_use(x)`; in some cases the compiler can
optimize away a use.

If `x` is statically known to be dead, the compiler will move `x` rather than
using it. This allows functions to write `x.use` without concern for whether
it's the last usage of `x` (e.g. when passing `x.use` to a series of
functions). Much like a trailing comma, this allows every usage to be
symmetric, making it easy to compare uses or add more uses afterwards. (This
optimization also means we should generally not lint on a final use of `x.use`,
such as we currently do with the clippy lint `redundant_clone`.)

If `x` is not statically known to be dead, but *all* of the following
conditions are met, the compiler *may* elide an `x.use` and use `&x` instead:
- The compiler can statically see that `x` outlives the result of `x.use`,
- The compiler can statically see that the result of `x.use` is only accessed
via shared reference (including methods with `&self`, dereferences via
`Deref`, other invocations of `.use`, `use ||` closures, or `async use`
blocks). Effectively, these conditions mean that the user could theoretically
have refactored the code to use `&x` rather than `x.use`.

An example of these elisions:

```rust
fn f(obj: Arc<Object>) {
g(obj.use); // `g` takes `Arc<Object>`
h(use || obj.method()); // `Object::method` here takes `&self`
}

fn main() {
let obj = Arc::new(Object::new());
f(obj.use);
f(obj.use);
f(obj.use);
g(obj.use);
}
```

If every invocation of `.use` or `use ||` here resulted in a call to
`.do_use()`, this program would call `.do_use()` 10 times (and have 11 `Arc`s
to drop); however, the compiler can elide all the uses in `main` and have `f`
work with the original Arc, resulting in only 6 calls to `.do_use()` (and only
7 `Arc`s to drop). When an object is used repeatedly, such as in a loop, this
can result in many elisions over the course of the program, and lower
contention for atomics.

If a user has an unusual `Use` type for which they wish to avoid these
potential elisions, they can call `.do_use()` directly. (In a future edition of
Rust, `Use` will be in the prelude, allowing calls to `.do_use()` without
importing anything.)

At any time, we could potentially instrument the compiler to detect the number
of elided calls to `.do_use()` in a crater run, to demonstrate the value of
this optimization.

## `use ||` closures and `async use` blocks

A closure can be written as `use |args| ...`, and an async block can be written
as `async use { ... }`, analogous to the use of `move`. (Note that `use` and
`move` are mutually exclusive.)

For any object referenced within the closure or block that a `move`
closure/block would move, `use` will `.use` that object and have the closure
own the new use of the object. If any object referenced within the closure or
block does not implement `Use` (including generic types whose bounds do not
require `Use`), the closure or block will attempt to borrow that object instead
(as it would do without `move`/`use`). If that borrow results in a
borrow-checker error, the compiler will report the error stating that *either*
the object must implement `Use` or it must outlive the closure/block.

Note in particular that this allows the same closure to `use` an `Arc` and
borrow a large array. Without the fallback to attempting a borrow, it would not
be possible to borrow a large array.

# Drawbacks
[drawbacks]: #drawbacks

This adds language surface area.

While this still makes lightweight clones visible, it makes them *less*
visible. (See "Rationale and alternatives".)

Users may misuse this by implementing `Use` for a type that doesn't meet the
requirements. Not all of the requirements can be checked by the compiler. While
this is still *safe*, it may result in types that violate user's expectations.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

We could do nothing, and require people to continue calling `.clone()` and
introducing new bindings for closures and async blocks.

There are myriad names we could use for the trait, method, invocation syntax,
and closure / async block syntax. An ideal name for this needs to convey the
semantic of taking an additional reference to an object, without copying or
duplicating the object. Other names proposed for this mechanism include
`Claim`.

We could use a name that directly references referencing counting; for
instance, `AddRef`/`add_ref`. However, that would be confusing for applications
using this with objects managed by something other than reference counting,
such as RCU or hazard pointers, or with (small) objects being copied.

Rather than using the special syntax `.use`, we could use an ordinary trait
method and invoke that method directly (e.g. `.claim()`). The syntax provided
for closures and async blocks could likewise always invoke that method. This
would not be compatible with adding smarter semantics such as eliding uses,
however. In addition, `.use` can be used in all editions of Rust (because `use`
is a reserved keyword), while a new trait with a method like `.claim()` would
require an import in existing editions and could only become part of the
prelude in a future edition.

Rather than having a single keyword in `use ||` or `async use`, we could
require naming every individual object being used (e.g. `use(x, y) ||`).
There's precedent for this kind of explicit capture syntax in other languages
(and having it as an available *option* is in the
[future possibilities][future-possibilities] section). However, requiring a
list of every object used adds overhead to closures and async blocks throughout
a program. This RFC proposes that the single keyword `use` suffices to indicate
that lightweight cloning will take place.

Rather than having `Use` act as `Clone` and go from `&self` to `Self`, we could
translate through a trait like `ToOwned`. This would allow using `Use` when
owned values have a different type, such as `&MyType -> SmartPtr<MyType>`.
However, this would also add complexity to the common case.

We could omit the elision optimizations, and have `.use` *always* call
`.do_use()` unconditionally. This would be slightly simpler, but would add
unnecessary overhead, and would encourage users to micro-optimize their code by
arranging to omit calls to `.use`. The elision optimizations encourage users to
always call `.use`.

In the elision optimizations, we could potentially allow the last usage of the
result of `x.use` to take ownership of it, and the compiler could insert a
`.use` at that point. However, at that point the elision would not have
resulted in any fewer calls to `.use`, so this does not seem worth the extra
complexity.

# Prior art
[prior-art]: #prior-art

Many languages have built-in reference counting, and automatically manage
reference counts when copying or passing around objects. `.use` provides a
simple and efficient way for Rust to manage reference counts.

Some languages have copy constructors, allowing arbitrary code to run when
copying an object. Such languages can manage reference-counted smart pointers
implicitly.

Rust already has other uses of postfix syntax, notably `.await` and `?`. In
particular, `.await` provides precedent for the use of `.keyword`.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

Are there use cases for using something like `ToOwned` rather than always going
from `&self` to `Self`? Would any smart pointers need that?

# Future possibilities
[future-possibilities]: #future-possibilities

We could implement `Use` for small arrays (e.g. arrays smaller than 64 bytes).

We could allow `.use` in struct construction without duplicating the name of a
field. For instance:

```rust
let s = SomeStruct { field, field2.use, field3.use };
/// This expands to:
let s = SomeStruct {
field: field,
field2: field2.use,
field3: field3.use,
};
```

We could extend the `use` syntax on closures and async blocks to support
naming specific objects to use:

```rust
use(x) || { ... }

async use(x) { ... }
```

We could further extend the `use` syntax to support an explicit capture list of
named expressions:

```rust
use(x = x.method(), y) || { ... }

// `..` additionally captures everything that a plain `use` would
async use(x = &obj, move y, ..)
```

We could consider providing a syntax to make invocations like `func(a.use,
b.use, c.use)` less verbose. In many cases, such a function could accept a
reference and call `.use` itself if needed, but we should evaluate whether
there are use cases not covered by that.