gccgo: seg fault if profiling signal arrives when collecting backtrace #29448

cherrymui · 2018-12-29T00:08:37Z

What version of Go are you using (`go version`)?

gccgo (GCC) 9.0.0 20181224 (experimental)

What operating system and processor architecture are you using (`go env`)?

linux/amd64

What did you do?

package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

const N = 10000

func main() {
	f, err := os.Create("/tmp/prof")
	if err != nil {
		log.Fatal(err)
	}

	pprof.StartCPUProfile(f)

	go func(){
		select {}
	}()

	buf := make([]byte, 99999)
	for i := 0; i < N; i++ {
		runtime.Stack(buf, true)
	}

	pprof.StopCPUProfile()
	f.Close()
}

This program, compiled with gccgo, crashes with high probability. Here is the stack trace at crash:

(gdb) bt
#0  0x0000000000544562 in runtime.sigtramp (sig=27, info=0x7ffff7e88bf0, context=0x7ffff7e88ac0) at /usr/local/google/home/cherryyz/src/gccgo/libgo/runtime/go-signal.c:94
#1  <signal handler called>
#2  0x00007ffff7342b3c in setcontext () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x000000000040706b in runtime.gogo (newg=newg@entry=0xc0001a4a80) at /usr/local/google/home/cherryyz/src/gccgo/libgo/runtime/proc.c:292
#4  0x00000000004070f2 in runtime.getTraceback (me=me@entry=0xc000000a80, gp=gp@entry=0xc0001a4a80) at /usr/local/google/home/cherryyz/src/gccgo/libgo/runtime/proc.c:451
#5  0x000000000048feda in runtime.tracebackothers (me=me@entry=0xc000000a80) at /usr/local/google/home/cherryyz/src/gccgo/libgo/go/runtime/traceback_gccgo.go:229
#6  0x00000000004aee00 in runtime.Stack (buf=..., all=<optimized out>) at /usr/local/google/home/cherryyz/src/gccgo/libgo/go/runtime/mprof.go:812
#7  0x0000000000405bab in main.main () at p.go:26

(gdb) p gp
$1 = (G *) 0xc0001a4a80
(gdb) p gp->m
$2 = (struct m *) 0x0

The signal arrives half-way through a goroutine switch, where gp is not nil but gp->m is nil. Function getTraceback switches to gp without setting up an m. When setcontext finishes, it will be in mcall, which will call gtraceback, which will set up an m. But there is a gap and profiling signal could arrive in between. Maybe we could set up an m earlier, in getTraceback, before switching to gp.

Another possibility would be in sigtramp, treating "gp is not nil but gp->m is nil" the same as gp is nil.

Found this initially when investigating a similar seg fault with precise stack scan (with gollvm), where it uses a similar way to switch goroutine when scanning another goroutine's stack.

cc @ianlancetaylor @thanm

The text was updated successfully, but these errors were encountered:

gopherbot · 2018-12-31T18:30:15Z

Change https://golang.org/cl/156037 mentions this issue: runtime: prevent deadlock when profiling signal arrives during traceback

gopherbot · 2018-12-31T18:30:15Z

Change https://golang.org/cl/156038 mentions this issue: runtime: in getTraceback, set gp->m before gogo

gopherbot · 2018-12-31T18:34:01Z

Change https://golang.org/cl/156018 mentions this issue: runtime/pprof: add a test for gccgo bug #29448

Traceback routines, e.g. callers and funcentry, may call __go_get_backtrace_state. If a profiling signal arrives while we are in the critical section of __go_get_backtrace_state, it tries to do a traceback, which also calls __go_get_backtrace_state, which tries to enter the same critical section and will deadlock. Prevent this deadlock by setting up runtime_in_callers before calling __go_get_backtrace_state. Found while investigating golang/go#29448. Will add a test in the next CL. Updates golang/go#29448. Change-Id: Ifb50c50aedc6faf7259757b82e11b30b867bc70d Reviewed-on: https://go-review.googlesource.com/c/156037 Reviewed-by: Ian Lance Taylor <[email protected]>

…back Traceback routines, e.g. callers and funcentry, may call __go_get_backtrace_state. If a profiling signal arrives while we are in the critical section of __go_get_backtrace_state, it tries to do a traceback, which also calls __go_get_backtrace_state, which tries to enter the same critical section and will deadlock. Prevent this deadlock by setting up runtime_in_callers before calling __go_get_backtrace_state. Found while investigating golang/go#29448. Will add a test in the next CL. Updates golang/go#29448. Reviewed-on: https://go-review.googlesource.com/c/156037 git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@267590 138bc75d-0d04-0410-961f-82ee72b054a4

With gccgo, if a profiling signal arrives in certain time during traceback, it may crash or hang. The fix is CL 156037 and CL 156038. This CL adds a test. Updates #29448. Change-Id: Idb36af176b4865b8fb31a85cad185ed4c07ade0c Reviewed-on: https://go-review.googlesource.com/c/156018 Reviewed-by: Ian Lance Taylor <[email protected]> Reviewed-by: Brad Fitzpatrick <[email protected]> Run-TryBot: Ian Lance Taylor <[email protected]> TryBot-Result: Gobot Gobot <[email protected]>

gopherbot · 2019-01-07T21:39:21Z

Change https://golang.org/cl/156697 mentions this issue: runtime: in doscanstackswitch, set gp->m before gogo

This is following CL 156038. doscanstackswitch uses the same mechanism of switching goroutines as getTraceback, and so has the same problem as described in issue golang/go#29448. This CL applies the same fix. Change-Id: I0da9996b0f73755f0171c7c10be6c77cb298b517 Reviewed-on: https://go-review.googlesource.com/c/156697 Reviewed-by: Ian Lance Taylor <[email protected]>

Currently, when collecting a traceback for another goroutine, getTraceback calls gogo(gp) switching to gp, which will resume in mcall, which will call gtraceback, which will set up gp->m. There is a gap between setting the current running g to gp and setting gp->m. If a profiling signal arrives in between, sigtramp will see a non-nil gp with a nil m, and will seg fault. Fix this by setting up gp->m first. Fixes golang/go#29448. Reviewed-on: https://go-review.googlesource.com/c/156038 git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@267658 138bc75d-0d04-0410-961f-82ee72b054a4

This is following CL 156038. doscanstackswitch uses the same mechanism of switching goroutines as getTraceback, and so has the same problem as described in issue golang/go#29448. This CL applies the same fix. Reviewed-on: https://go-review.googlesource.com/c/156697 git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@267661 138bc75d-0d04-0410-961f-82ee72b054a4

gopherbot added this to the Gccgo milestone Dec 29, 2018

cherrymui self-assigned this Dec 31, 2018

gopherbot closed this as completed in golang/gofrontend@c257303 Jan 7, 2019

golang locked and limited conversation to collaborators Jan 7, 2020

gopherbot added the FrozenDueToAge label Jan 7, 2020

rsc unassigned cherrymui Jun 23, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gccgo: seg fault if profiling signal arrives when collecting backtrace #29448

gccgo: seg fault if profiling signal arrives when collecting backtrace #29448

cherrymui commented Dec 29, 2018

gopherbot commented Dec 31, 2018

gopherbot commented Dec 31, 2018

gopherbot commented Dec 31, 2018

gopherbot commented Jan 7, 2019

gccgo: seg fault if profiling signal arrives when collecting backtrace #29448

gccgo: seg fault if profiling signal arrives when collecting backtrace #29448

Comments

cherrymui commented Dec 29, 2018

What version of Go are you using (go version)?

What operating system and processor architecture are you using (go env)?

What did you do?

gopherbot commented Dec 31, 2018

gopherbot commented Dec 31, 2018

gopherbot commented Dec 31, 2018

gopherbot commented Jan 7, 2019

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?