Commit bb00466

race: Relax compare_exchange success ordering from AcqRel to Release.
[cherry-pick once_cell 2a707eedb459369687ffdb2183ee82fabaa5d97a]

See the analogous change in rust-lang/rust#131746 and the discussion in matklad/once_cell#220.

What is the effect of this change? Not much, because before we ever execute the `compare_exchange`, we do a load with `Ordering::Acquire`; the `compare_exchange` is in the `#[cold]` path already. Thus, this just mostly clarifies our expectations. See the non-doc comment added under the module's doc comment for the reasoning.

How does this change the code gen? Consider this analogous example:

```diff
 #[no_mangle]
 fn foo1(y: &mut i32) -> bool {
-    let r = X.compare_exchange(0, 1, Ordering::AcqRel, Ordering::Acquire).is_ok();
+    let r = X.compare_exchange(0, 1, Ordering::Release, Ordering::Acquire).is_ok();
     r
 }
```

On x86_64, there is no change. Here is the generated code before and after:

```
foo1:
        mov     rcx, qword ptr [rip + example::X::h9e1b81da80078af7@GOTPCREL]
        mov     edx, 1
        xor     eax, eax
        lock    cmpxchg dword ptr [rcx], edx
        sete    al
        ret

example::X::h9e1b81da80078af7:
        .zero   4
```

On AArch64, regardless of whether atomics are outlined or not, there is no change. Here is the generated code with inlined atomics:

```
foo1:
        adrp    x8, :got:example::X::h40b04fb69d714de3
        ldr     x8, [x8, :got_lo12:example::X::h40b04fb69d714de3]
.LBB0_1:
        ldaxr   w9, [x8]
        cbnz    w9, .LBB0_4
        mov     w0, #1
        stlxr   w9, w0, [x8]
        cbnz    w9, .LBB0_1
        ret
.LBB0_4:
        mov     w0, wzr
        clrex
        ret

example::X::h40b04fb69d714de3:
        .zero   4
```

For 32-bit ARMv7, with inlined atomics, the resulting diff in the object code is:

```diff
@@ -10,14 +10,13 @@
         mov     r0, #1
         strex   r2, r0, [r1]
         cmp     r2, #0
-        beq     .LBB0_5
+        bxeq    lr
         ldrex   r0, [r1]
         cmp     r0, #0
         beq     .LBB0_2
 .LBB0_4:
-        mov     r0, #0
         clrex
-.LBB0_5:
+        mov     r0, #0
         dmb     ish
         bx      lr
 .LCPI0_0:
@@ -54,4 +53,3 @@
 example::X::h47e2038445e1c648:
         .zero   4
```
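The `foo1` example in the commit message does not define `X`, so it cannot be compiled as quoted. A minimal sketch for experimenting with the two orderings follows. It assumes a 32-bit atomic (guessed from the `.zero 4` in the listings) and passes the atomic by reference instead of using a static; the function names `cas_acqrel` and `cas_release` are hypothetical. A failure ordering stronger than the success ordering should be accepted by Rust 1.64 and later.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// The "before" form: AcqRel on success, Acquire on failure.
/// Returns true if this call changed the value from 0 to 1.
pub fn cas_acqrel(x: &AtomicU32) -> bool {
    x.compare_exchange(0, 1, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}

/// The "after" form: Release on success, Acquire on failure.
/// Returns true if this call changed the value from 0 to 1.
pub fn cas_release(x: &AtomicU32) -> bool {
    x.compare_exchange(0, 1, Ordering::Release, Ordering::Acquire)
        .is_ok()
}
```

Both functions have the same observable result on a single thread: the first call on a zeroed atomic succeeds and any later call fails. They differ only in the ordering guarantees requested on the success path.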
1 parent eda1128 commit bb00466

1 file changed: +9 -1 lines changed


src/polyfill/once_cell/race.rs

@@ -19,6 +19,14 @@
 //! `Acquire` and `Release` have very little performance overhead on most
 //! architectures versus `Relaxed`.
 
+// The "atomic orderings" section of the documentation above promises
+// "happens-before" semantics. This drives the choice of orderings in the uses
+// of `compare_exchange` below. On success, the value was zero/null, so there
+// was nothing to acquire (there is never any `Ordering::Release` store of 0).
+// On failure, the value was nonzero, so it was initialized previously (perhaps
+// on another thread) using `Ordering::Release`, so we must use
+// `Ordering::Acquire` to ensure that store "happens-before" this load.
+
 use core::sync::atomic;
 
 use atomic::{AtomicUsize, Ordering};
@@ -102,7 +110,7 @@ impl OnceNonZeroUsize {
         let mut val = f().get();
         let exchange = self
             .inner
-            .compare_exchange(0, val, Ordering::AcqRel, Ordering::Acquire);
+            .compare_exchange(0, val, Ordering::Release, Ordering::Acquire);
         if let Err(old) = exchange {
             val = old;
         }
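
To make the reasoning in the added comment concrete, here is a minimal, simplified sketch of the pattern the patched code follows. `RaceCell` and its method names are hypothetical stand-ins, not the actual `OnceNonZeroUsize` implementation; only the `compare_exchange` call mirrors the diff. It assumes Rust 1.64 or later, where the failure ordering may be stronger than the success ordering.

```rust
use std::num::NonZeroUsize;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for the patched type; 0 means "not yet initialized".
pub struct RaceCell {
    inner: AtomicUsize,
}

impl RaceCell {
    pub const fn new() -> Self {
        Self { inner: AtomicUsize::new(0) }
    }

    /// Fast path: an `Acquire` load that pairs with the `Release` store
    /// made by whichever thread won the initialization race.
    pub fn get(&self) -> Option<NonZeroUsize> {
        NonZeroUsize::new(self.inner.load(Ordering::Acquire))
    }

    pub fn get_or_init(&self, f: impl FnOnce() -> NonZeroUsize) -> NonZeroUsize {
        match self.get() {
            Some(v) => v,
            None => self.init(f),
        }
    }

    /// Slow path, reached only when the fast-path load saw 0.
    #[cold]
    fn init(&self, f: impl FnOnce() -> NonZeroUsize) -> NonZeroUsize {
        let mut val = f().get();
        // Success: the old value was 0, which no thread ever stores with
        // `Release`, so there is nothing to acquire; `Release` publishes `val`.
        // Failure: the old value is nonzero and was stored with `Release`
        // by the winner, so `Acquire` orders that store before this load.
        let exchange = self
            .inner
            .compare_exchange(0, val, Ordering::Release, Ordering::Acquire);
        if let Err(old) = exchange {
            val = old;
        }
        NonZeroUsize::new(val).expect("nonzero on both the success and failure paths")
    }
}
```

In this sketch, the first caller to complete the `compare_exchange` fixes the value, and every later caller gets that same value back even if its own closure produced a different one.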
