Skip to content

runtime: canonical 48-bit address bug in lfstack #49405

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Sonicadvance1 opened this issue Nov 6, 2021 · 31 comments
Open

runtime: canonical 48-bit address bug in lfstack #49405

Sonicadvance1 opened this issue Nov 6, 2021 · 31 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@Sonicadvance1
Copy link

What version of Go are you using (go version)?

go version go1.16.2 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/ryanh/.cache/go-build"
GOENV="/home/ryanh/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/ryanh/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/ryanh/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go-1.16"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.16/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.2"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1471403295=/tmp/go-build -gno-record-gcc-switches"

What did you do?

If lfstack gets any pointer on amd64 platforms where the top bit is set then an assert is thrown.
In most cases this won't occur since the kernel will consume the top bit of VA, leaving the userspace with 47bits of VA.
This is a logic bug around address canonicalization.

Check is here: https://github.com/golang/go/blob/master/src/runtime/lfstack.go#L28
Pointer sign extension is here: https://github.com/golang/go/blob/master/src/runtime/lfstack_64bit.go#L52

I don't have a basic application or environment for reproducing this. Easy to see the logic bug.

What did you expect to see?

While it is correct that you must canonicalize the pointer by sign extending it on 48bit systems (I'm not even going to bring up LA57). The check inside of lfstack is incorrect since it assumes the sign extension will never occur. Thus assuming it will never receive a true 48-bit pointer.
This assumption is broken in environments where you get 48-bit pointers with the high bit set.
This is purely a simple logic bug, it shouldn't assert in this case.

What did you see instead?

An assert on 48-bit address with high bit set.

Sonicadvance1 added a commit to Sonicadvance1/FEX that referenced this issue Nov 6, 2021
Only a partial fix for FEX-Emu#1330, still needs preemption disabled to work.

On x86-64 hosts the Linux kernel resides in the top bit of VA which
isn't mapped in to userspace.
This means that userspace will never receive pointers living with that
top bit set unless you're running a 57bit VA host.

This results in userspace pointers never needing to do the sign
extending pointer canonicalization. But additionally some applications
actually don't understand the pointer canonicalization.
This results in bugs like: golang/go#49405
Now if you're running on a 57bit VA host, this will end up behaving like
FEX but it seems like no one in golang land has really messed with 57bit
VA yet.

In AArch64, when configured with a 48bit VA, the userspace gets the full
48bit VA space and on EL mode switch has the full address range change
to the kernel's 48bit VA.
This means that we will /very/ likely allocate pointers in the high
48bit space since Linux currently allocates top-down.

So behave more like x86-64, hide the top 128TB of memory space from the
guest before boot.

Testing: Took the M1Max 15ms to 21ms allocate the top 128TB.
@ianlancetaylor ianlancetaylor changed the title AMD64: Canonical 48-bit address bug in lfstack runtime: canonical 48-bit address bug in lfstack Nov 7, 2021
@ianlancetaylor
Copy link
Contributor

Thanks. Note that this code is deeply system dependent. Can the problem you mention actually happen on any existing system? If not, it's not worth worrying about.

@Sonicadvance1
Copy link
Author

Currently the only real concern is if anyone tries running on Intel's latest server platforms with LA57 enabled.
Which doesn't pass the userspace a 48-bit (or higher) pointer unless explicitly asked.
Otherwise it would likely just be toy environments attempting a 48-bit VA space like AArch64 platforms. Switching memory mappings on CPL change.

@thanm thanm added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 8, 2021
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@mknyszek mknyszek moved this to Triage Backlog in Go Compiler / Runtime Jul 15, 2022
@seankhliao seankhliao added this to the Unplanned milestone Aug 20, 2022
@assistcontrol
Copy link

This bug hit FreeBSD, where it caused all Go packages to fail. We have spent a long time without any Go packages being available.

We found this issue yesterday and by working around it we have packages again.

So, it's not theoretical.

@assistcontrol
Copy link

I should mention: our workaround required changing a sysctl, so it's not something we can account for in the ports tree. This will affect end-users too.

@randall77
Copy link
Contributor

@assistcontrol

Can you explain what your addresses look like that are triggering this assert?
You are on amd64, right?

@assistcontrol
Copy link

@randall77 Yeah, this occurred on amd64 builders (jailed on an amd64 host).

You asked a good question, but I don't know how to answer that. How can I get you that information?

@dbaio
Copy link

dbaio commented Apr 13, 2025

Hi @randall77
On FreeBSD (developer branch), we are noticing this issue on newer CPUs that have the LA57 feature.

Building Go cmd/dist using /home/dbaio/ports/lang/go121/work/go-freebsd-amd64-bootstrap. (go1.20 freebsd/amd64)
cmd/dist
runtime: lfstack.push invalid packing: node=0x6d66afc44d8d40 cnt=0x1 packed=0x66afc44d8d400001 -> node=0x66afc44d8d40
fatal error: lfstack.push

runtime stack:
runtime.throw({0xa3a2b1?, 0xaffb50e700000000?})
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/panic.go:1047 +0x5d fp=0x84c48df00 sp=0x84c48ded0 pc=0x43457d
runtime.(*lfstack).push(0x2?, 0x84c48dfb0?)
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/lfstack.go:29 +0x125 fp=0x84c48df40 sp=0x84c48df00 pc=0x40bbc5
runtime.(*spanSetBlockAlloc).free(...)
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/mspanset.go:322
runtime.(*spanSet).reset(0xe828b0)
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/mspanset.go:264 +0x87 fp=0x84c48df70 sp=0x84c48df40 pc=0x42f727
runtime.finishsweep_m()
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/mgcsweep.go:260 +0x9b fp=0x84c48dfb0 sp=0x84c48df70 pc=0x423b1b
runtime.gcStart.func1()
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/mgc.go:668 +0x17 fp=0x84c48dfc0 sp=0x84c48dfb0 pc=0x45ec17
runtime.systemstack()
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/asm_amd64.s:496 +0x49 fp=0x84c48dfc8 sp=0x84c48dfc0 pc=0x463d69

Full build log is here: https://people.freebsd.org/~dbaio/go/go121-gohan06-logs.txt

I can run some tests. Just guide me, please.

Details about the CPU:

CPU: Intel(R) Xeon(R) Silver 4314 CPU @ 2.40GHz (2400.00-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x606a6  Family=0x6  Model=0x6a  Stepping=6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffefbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0xf3bfbfff<FSGSBASE,TSCADJ,SGX,BMI1,HLE,AVX2,FDPEXC,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,NFPUSG,PQE,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,AVX512IFMA,CLFLUSHOPT,CLWB,PROCTRACE,AVX512CD,SHA,AVX512BW,AVX512VL>
  Structured Extended Features2=0x40417f5e<AVX512VBMI,UMIP,PKU,OSPKE,AVX512VBMI2,GFNI,VAES,VPCLMULQDQ,AVX512VNNI,AVX512BITALG,TME,AVX512VPOPCNTDQ,LA57,RDPID,SGXLC>
  Structured Extended Features3=0xbc040412<FSRM,MD_CLEAR,PCONFIG,IBPB,STIBP,L1DFL,ARCH_CAP,SSBD>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  IA32_ARCH_CAPS=0x6bdeb<RDCL_NO,IBRS_ALL,SKIP_L1DFL_VME,MDS_NO,TSX_CTRL,TAA_NO>
  AMD Extended Feature Extensions ID EBX=0x200<WBNOINVD>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics

When building on another box (same FreeBSD version) that doesn't have LA57, it works just fine.

@randall77
Copy link
Contributor

The obvious experiment is to change this line

addrBits = 48
from 48 to 57. That almost certainly will fix your problem. The question is, is that safe?
It is reducing the tag space from 19 to 10 bits. I don't think that's a showstopper (some of the other archs have that number) but it is worrying, as those tags are there to defend against the ABA ambiguity in our lock-free linked list. A tag collision that results in not detecting an ABA situation would be really bad.

@randall77
Copy link
Contributor

Is there any way to mark a binary to tell the OS that we don't want LA57?

@arrowd
Copy link

arrowd commented Apr 14, 2025

On FreeBSD it is possible with elfctl -e +la48 /path/to/binary

@prattmic
Copy link
Member

prattmic commented Apr 14, 2025

Just to clarify: on Linux, I believe mmap will only return an address above 1<<47 if you pass a hint address above 1<<47 to mmap. I assume this is not true on FreeBSD? That mmap with hint address of 0 can return a higher address unless the elfctl -e +la48 is set?

@bsdjhb
Copy link

bsdjhb commented Apr 14, 2025

Just to clarify: on Linux, I believe mmap will only return an address above 1<<47 if you pass a hint address above 1<<47 to mmap. I assume this is not true on FreeBSD? That mmap with hint address of 0 can return a higher address unless the elfctl -e +la48 is set?

Correct, if LA57 is enabled, then mmap(NULL, ...) can return addresses above 1 << 47. The main thread stack is also likely to be above 1 << 47 as well since main thread stacks are typically allocated at the "top" of the user address space.

@randall77
Copy link
Contributor

Does elfctl -e +la48 just set this bit?

https://github.com/freebsd/freebsd-src/blob/6fbd1bed6e7bf880a6cc579b06bdc6476983613a/sys/sys/elf_common.h#L825

Can we just do that? Probably just add that constant to the writes on both sides of this if statement:

if *flagRace {

@bsdjhb
Copy link

bsdjhb commented Apr 14, 2025

Does elfctl -e +la48 just set this bit?

https://github.com/freebsd/freebsd-src/blob/6fbd1bed6e7bf880a6cc579b06bdc6476983613a/sys/sys/elf_common.h#L825

Can we just do that? Probably just add that constant to the writes on both sides of this if statement:

go/src/cmd/link/internal/ld/elf.go

Line 798 in 80bff42

if *flagRace {

Yes, you could just set the bit there. Note that this bit is amd64-only. We have not implemented a control for RISC-V. Are you doing anything comparable to disable LA57 on other OS's such as Linux currently?

@randall77
Copy link
Contributor

Yes, you could just set the bit there. Note that this bit is amd64-only.

That's fine.

We have not implemented a control for RISC-V.

We are currently assuming 56 bit (userspace) pointers for riscv. See #54104.

Are you doing anything comparable to disable LA57 on other OS's such as Linux currently?

I don't believe so. Linux is much better about never breaking existing binaries, so any la57 support would require explicit opt-in by the binary.
https://docs.kernel.org/arch/x86/x86_64/5level-paging.html#user-space-and-large-virtual-address-space

@randall77
Copy link
Contributor

The one thing I'm not sure about is how to handle external linking. Is there a command-line flag to the linker to set this bit?
I guess we could shell out to elfctl, but that requires assuming the existence of another tool.

@emaste
Copy link

emaste commented Apr 14, 2025

Linux is much better about never breaking existing binaries, so any la57 support would require explicit opt-in by the binary.

Just to make sure I understand correctly, this is only the case for x86 (i.e., #54104 is the same issue on Linux on RISC-V)?

Is there a command-line flag to the linker to set this bit? I guess we could shell out to elfctl, but that requires assuming the existence of another tool.

There is no cmdline flag. elfctl will exist on any FreeBSD build host, but of course that doesn't help with cross-compiling. I assume go just uses the existing system linker or a cross-linker from e.g. binutils or LLVM?

These flags are just bits in an NT_FREEBSD_FEATURE_CTL ELF note, which comes from /usr/lib/crt1.o (and friends). Note, elfctl requires that the note already exist.

@randall77
Copy link
Contributor

Just to make sure I understand correctly, this is only the case for x86 (i.e., #54104 is the same issue on Linux on RISC-V)?

Yes, this bug is only about x86. Go supports 56-bit pointers on riscv, since the fix for #54104 went in a few years ago.

I assume go just uses the existing system linker or a cross-linker from e.g. binutils or LLVM?

I believe so. I'm not sure the exact search path. I'm sure you can override it with some environment variables.

@kostikbel
Copy link

Or if go developers insist on requiring LA48 on amd64, something like this might work:

diff --git a/src/cmd/link/internal/ld/elf.go b/src/cmd/link/internal/ld/elf.go
index 6ff1d94383..2cebb0ab39 100644
--- a/src/cmd/link/internal/ld/elf.go
+++ b/src/cmd/link/internal/ld/elf.go
@@ -753,6 +753,7 @@ const (
 	ELF_NOTE_FREEBSD_FEATURE_CTL_TAG   = 4
 	ELF_NOTE_FREEBSD_VERSION           = 1203000 // 12.3-RELEASE
 	ELF_NOTE_FREEBSD_FCTL_ASLR_DISABLE = 0x1
+	ELF_NOTE_FREEBSD_FCTL_LA48         = 0x10
 )
 
 const ELF_NOTE_FREEBSD_NAME = "FreeBSD\x00"
@@ -797,9 +798,10 @@ func elfwritefreebsdsig(out *OutBuf) int {
 	out.WriteString(ELF_NOTE_FREEBSD_NAME)
 	if *flagRace {
 		// The race detector can't handle ASLR, turn the ASLR off when compiling with -race.
-		out.Write32(ELF_NOTE_FREEBSD_FCTL_ASLR_DISABLE)
+		out.Write32(ELF_NOTE_FREEBSD_FCTL_ASLR_DISABLE |
+		    ELF_NOTE_FREEBSD_FCTL_LA48)
 	} else {
-		out.Write32(0)
+		out.Write32(ELF_NOTE_FREEBSD_FCTL_LA48)
 	}
 
 	return int(sh.Size)

@randall77
Copy link
Contributor

Yes, that is what I was suggesting here.
That only works for internal linking though. If we're using the system-provided linker that code isn't executed.

@emaste
Copy link

emaste commented Apr 14, 2025

Does go link the system C startup objects when using the system linker? The system linker doesn't have any knowledge of the note, it just passes through from the csu bits.

@kostikbel
Copy link

Yes, that is what I was suggesting here. That only works for internal linking though. If we're using the system-provided linker that code isn't executed.

But then the build already rely on the execution of the external program (ld), and can rely on the elfctl presence and usability as well.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/665475 mentions this issue: cmd/link: add la48 to freebsd note section

@ayang64
Copy link
Member

ayang64 commented Apr 15, 2025

confirmed:

before patch:

ayan@kiwi:/tmp/hello$ go version
go version devel go1.25-ba7b8ca336 Mon Apr 14 15:10:49 2025 -0700 freebsd/amd64
ayan@kiwi:/tmp/hello$ elfctl -l ./hello.local
Known features are:
noaslr          Disable ASLR
noprotmax       Disable implicit PROT_MAX
nostackgap      Disable stack gap
wxneeded        Requires W+X mappings
la48            amd64: Limit user VA to 48bit
File './hello.local' features:
noaslr          'Disable ASLR' is unset.
noprotmax       'Disable implicit PROT_MAX' is unset.
nostackgap      'Disable stack gap' is unset.
wxneeded        'Requires W+X mappings' is unset.
la48            'amd64: Limit user VA to 48bit' is unset.

after patch:

$ elfctl -l ./hello.local
Known features are:
noaslr          Disable ASLR
noprotmax       Disable implicit PROT_MAX
nostackgap      Disable stack gap
wxneeded        Requires W+X mappings
la48            amd64: Limit user VA to 48bit
File './hello.local' features:
noaslr          'Disable ASLR' is unset.
noprotmax       'Disable implicit PROT_MAX' is unset.
nostackgap      'Disable stack gap' is unset.
wxneeded        'Requires W+X mappings' is unset.
la48            'amd64: Limit user VA to 48bit' is set.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/665815 mentions this issue: runtime: align taggable pointers more so we can use low bits for tag

@randall77
Copy link
Contributor

I found a few extra bits in the cushions of my couch, so we may be ok here. (See CL above.)

If that CL works out, then we can just bump to 57 address bits on freebsd.

The only hesitation I have is that that CL is nontrivial enough that backporting is dicey. If we want to backport something here (and I suspect we do not, but I'm willing to hear arguments), the la48 bit CL might be safer.

@kostikbel
Copy link

I am still curious why it is fine to use 56/57 bits on RISC-V and PPC AIX, while the tags are considered too short on LA57 amd64.

@assistcontrol
Copy link

Now that Go downloads and builds toolchains as needed, FreeBSD is strongly considering providing only the most recent release, rather than multiple versions.

Not backporting the CL is really all the justification I'd need to drop the old versions, so I wouldn't do any difficult backporting work if it were just for FreeBSD's benefit.

@randall77
Copy link
Contributor

I am still curious why it is fine to use 56/57 bits on RISC-V and PPC AIX, while the tags are considered too short on LA57 amd64.

Different levels of risk tolerance?

We also give second-class ports a fair amount of leeway in deciding what is best for them. That said, I think I would have pushed back on those changes had I seen them. Water under the bridge.

More generally, though, I don't think anyone has an idea about how big a tag space would be "safe". I have the feeling that we probably want more bits for large (in the # of processors sense) machines. Those machines tend to be amd64/arm64. But even there I'm not relying on any concrete arguments, just intuition.
My hesitation comes down to the fact that if we get a tag collision, really bad things happen. Very nondeterministic heap corruption, with no indication at all of the underlying cause. We could already be in a state where we don't have enough tag bits, and we'd never know (as long as the collision rate was not too high). Just a contributor to the background "maybe that was just a cosmic ray hitting the wrong bit?" failures.

@randall77
Copy link
Contributor

Not backporting the CL is really all the justification I'd need to drop the old versions, so I wouldn't do any difficult backporting work if it were just for FreeBSD's benefit.

A backport would be available earlier than the 1.25 release. (In a few weeks, say, versus a few months for 1.25.) But point taken, thanks.

@assistcontrol
Copy link

Not backporting the CL is really all the justification I'd need to drop the old versions, so I wouldn't do any difficult backporting work if it were just for FreeBSD's benefit.

A backport would be available earlier than the 1.25 release. (In a few weeks, say, versus a few months for 1.25.) But point taken, thanks.

Ahh ok, well I was pretty optimistic there. It would of course be nice to get this into the ports tree sooner rather than later so we can stop universally limiting pointer width on the builders (with all of its downstream effects).

So, if it's possible to backport to 1.24, it'd make a difference for us. But we also understand that sometimes dicey is dicey.

gopherbot pushed a commit that referenced this issue Apr 24, 2025
Currently we assume alignment to 8 bytes, so we can steal the low 3 bits.
This CL assumes alignment to 512 bytes, so we can steal the low 9 bits.

That's 6 extra bits!

Aligning to 512 bytes wastes a bit of space but it is not egregious.
Most of the objects that we make tagged pointers to are pretty big.

Update #49405

Change-Id: I66fc7784ac1be5f12f285de1d7851d5a6871fb75
Reviewed-on: https://go-review.googlesource.com/c/go/+/665815
Reviewed-by: Keith Randall <[email protected]>
Reviewed-by: Michael Knyszek <[email protected]>
Auto-Submit: Keith Randall <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Status: Triage Backlog
Development

No branches or pull requests