
Occasional crash when using sensors on MacOS arm64 #1832


Open
1 task done
PierreF opened this issue Apr 3, 2025 · 2 comments

Comments

@PierreF
Contributor

PierreF commented Apr 3, 2025

Describe the bug

I get random crashes when calling SensorsTemperatures() on macOS (Darwin 24.3.0 on arm64).

The error looks like a race condition, since it only occurs when multiple goroutines call SensorsTemperatures concurrently.

In real usage, I think my actual issue is a concurrent call between SensorsTemperatures and another user of IOKit and/or Core Foundation. However, I was not able to write a reproducible code sample that makes only one call per subsystem (sensors, disk, cpu, mem...).

To Reproduce

package main

import (
        "log/slog"
	"sync"

	"github.com/shirou/gopsutil/v4/sensors"
)

func main() {
	var wg sync.WaitGroup

	for range 30 { // The higher this number, the more likely the issue is to occur. Empirically, 30 seems to be a good value
		wg.Add(1)

		go func() {
			defer wg.Done()
			r, err := sensors.SensorsTemperatures()

			if false {
				// The log itself isn't required to produce the bug, but without
				// assigning the SensorsTemperatures result to a variable the bug doesn't
				// seem to occur, maybe due to compiler optimization?
				slog.Info("sensors", slog.Any("r", r), slog.Any("err", err))
			}
		}()
	}

	wg.Wait()
}

Run the program, possibly multiple times, since the race condition triggers only rarely:

go build sensors_bug.go

while ./sensors_bug ; do echo "Success"; done 2>&1 | tee large_error_message.log

It results in an error like:

unexpected fault address 0x100921808
fatal error: fault
[signal SIGBUS: bus error code=0x1 addr=0x100921808 pc=0x10092181c]

goroutine 39 gp=0x14000106c40 m=28 mp=0x140000ee008 [running]:
runtime.throw({0x100923b7e?, 0x0?})
	/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/panic.go:1101 +0x38 fp=0x14000297a70 sp=0x14000297a40 pc=0x1008d0fe8
runtime.sigpanic()
	/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/signal_unix.go:922 +0x170 fp=0x14000297ad0 sp=0x14000297a70 pc=0x1008d2800
github.com/shirou/gopsutil/v4/internal/common.NewLibrary({0x0, 0x0})
	/Users/pierref/go/pkg/mod/github.com/shirou/gopsutil/v4@v4.25.3/internal/common/common_darwin.go:97 +0x9c fp=0x14000297b20 sp=0x14000297ae0 pc=0x10092181c
github.com/shirou/gopsutil/v4/sensors.TemperaturesWithContext({0x0?, 0x0?})
	/Users/pierref/go/pkg/mod/github.com/shirou/gopsutil/v4@v4.25.3/sensors/sensors_darwin_arm64.go:54 +0x6d4 fp=0x14000297fc0 sp=0x14000297b20 pc=0x100922144
created by main.main in goroutine 1
	/Users/pierref/tmp/20250403-1426/sensors_bug.go:16 +0x38

goroutine 1 gp=0x140000021c0 m=nil [sync.WaitGroup.Wait]:
runtime.gopark(0x100a29680?, 0x1008d1310?, 0x0?, 0x40?, 0x100da7f28?)
	/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/proc.go:435 +0xc8 fp=0x1400006de50 sp=0x1400006de30 pc=0x1008d10c8
runtime.goparkunlock(...)
	/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/proc.go:441
runtime.semacquire1(0x140001140b8, 0x0, 0x1, 0x0, 0x18)
	/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/sema.go:188 +0x204 fp=0x1400006dea0 sp=0x1400006de50 pc=0x1008b4604
sync.runtime_SemacquireWaitGroup(0x140000021c0?)
	/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/sema.go:110 +0x2c fp=0x1400006dee0 sp=0x1400006dea0 pc=0x1008d24ac
sync.(*WaitGroup).Wait(0x140001140b0)
[... truncated since I don't believe it matters for this bug]

Expected behavior

No crash :)

Environment (please complete the following information):

  • Mac OS: output of sw_vers and uname -a:
$ sw_vers
ProductName:            macOS
ProductVersion:         15.3.2
BuildVersion:           24D81
$ uname -a
Darwin mbp-de-pierre.bleemeo.work 24.3.0 Darwin Kernel Version 24.3.0: Thu Jan  2 20:24:16 PST 2025; root:xnu-11215.81.4~3/RELEASE_ARM64_T6000 arm64 arm Darwin

gopsutil version:

$ cat go.mod 
module test

go 1.24.2

require github.com/shirou/gopsutil/v4 v4.25.3

require (
        github.com/ebitengine/purego v0.8.2 // indirect
        github.com/go-ole/go-ole v1.2.6 // indirect
        github.com/yusufpapurcu/wmi v1.2.4 // indirect
        golang.org/x/sys v0.28.0 // indirect
)

Additional context

I think the bug is due to the IOKit and/or Core Foundation library being closed by another goroutine while it is still in use by the goroutine that crashes.

To experiment with this, I've modified TemperaturesWithContext (run go mod vendor, then edit "vendor/github.com/shirou/gopsutil/v4/sensors/sensors_darwin_arm64.go").

The idea is to make TemperaturesWithContext itself issue concurrent calls (like the minimal reproduction above), but this time the ioKit and coreFoundation libraries are shared between goroutines.

func TemperaturesWithContext(_ context.Context) ([]TemperatureStat, error) {
	var wg sync.WaitGroup

	var (
		globalResult []TemperatureStat
		globalErr    error
		l            sync.Mutex
	)

	ioKit, err := common.NewLibrary(common.IOKit)
	if err != nil {
		return nil, err
	}
	defer ioKit.Close()

	coreFoundation, err := common.NewLibrary(common.CoreFoundation)
	if err != nil {
		return nil, err
	}
	defer coreFoundation.Close()

	for range 30 { // Once more, the higher this number, the more likely the bug is to occur
		wg.Add(1)

		go func() {
			defer wg.Done()

			r, err := temperaturesWithContext(ioKit, coreFoundation)

			l.Lock()
			defer l.Unlock()
			globalResult = r
			globalErr = err
		}()
	}

	wg.Wait()

	return globalResult, globalErr
}

func temperaturesWithContext(ioKit *common.Library, coreFoundation *common.Library) ([]TemperatureStat, error) {
	ta := &temperatureArm{
		ioKit:                              ioKit,
		cf:                                 coreFoundation,
[... the remainder of the original TemperaturesWithContext, unmodified]

With this change, calling TemperaturesWithContext no longer crashes:

$ cat single_call.go 
package main

import (
        "log/slog"

        "github.com/shirou/gopsutil/v4/sensors"
)

func main() {
        r, err := sensors.SensorsTemperatures()
        slog.Info("sensors", slog.Any("r", r), slog.Any("err", err))
}

$ go build single_call.go; while ./single_call ; do echo "Success"; done 2>&1 | tee large_error_message.log

If you move the ioKit and coreFoundation setup inside the go func() { } (i.e. open and close the libraries per goroutine), it crashes again.
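The difference between the two variants (one handle shared by all goroutines versus one open/close pair per goroutine) can be illustrated without touching IOKit at all. The following is a minimal sketch, not gopsutil code: fakeLibrary, openFake, sharedLibrary and useConcurrently are hypothetical names, and the "library" is a plain struct. It only shows the pattern of opening a handle once with sync.Once and never closing it while the process runs, which is an assumption about one possible direction, not a confirmed fix.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fakeLibrary stands in for a dynamically loaded library handle.
// Hypothetical type: it is NOT gopsutil's common.Library.
type fakeLibrary struct {
	name string
}

// opens counts how many times openFake ran, so the sharing is observable.
var opens atomic.Int32

// openFake simulates opening a library and records the call.
func openFake(name string) *fakeLibrary {
	opens.Add(1)
	return &fakeLibrary{name: name}
}

var (
	sharedOnce sync.Once
	sharedLib  *fakeLibrary
)

// sharedLibrary opens the library on first use and returns the same handle
// on every later call. The handle is deliberately never closed, so no
// goroutine can close it while another one is still using it.
func sharedLibrary() *fakeLibrary {
	sharedOnce.Do(func() {
		sharedLib = openFake("IOKit")
	})
	return sharedLib
}

// useConcurrently starts n goroutines that all fetch the shared handle and
// returns the total number of times the library was opened.
func useConcurrently(n int) int {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = sharedLibrary().name
		}()
	}
	wg.Wait()
	return int(opens.Load())
}

func main() {
	fmt.Println("opens:", useConcurrently(30))
}
```

The trade-off of this pattern is that the handle stays open for the lifetime of the process, which may or may not be acceptable for a library like gopsutil.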

A final note: only sensors seems affected by this bug (maybe because sensors makes the most complex use of the IOKit/CF libraries?). The following code doesn't exhibit the crash even though it uses IOKit/CF concurrently for cpu/disk/mem: https://gist.github.com/PierreF/dd5864811ef6de22bfcb431810fe4f4f

@uubulb
Contributor

uubulb commented Apr 7, 2025

The reproduction conditions seem a bit extreme: I ran the code you provided 102 times (as suggested by the length of large_error_message.log) before it crashed. That implies an unreasonable number of IOKit / Core Foundation calls, and since the sensors package is more complex, it may reach system limits more easily.
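Counting log lines is an indirect way to get the number of runs before the first crash. A small helper along these lines could report that count directly and make numbers from different machines easier to compare. This is only a sketch: runsUntilFailure and failOn are hypothetical names, and the flaky function below simulates the crash rather than running the real binary.

```go
package main

import (
	"errors"
	"fmt"
)

// runsUntilFailure calls run up to maxRuns times. It returns the 1-based
// attempt on which run first failed together with that error, or (0, nil)
// if every attempt succeeded.
func runsUntilFailure(maxRuns int, run func() error) (int, error) {
	for attempt := 1; attempt <= maxRuns; attempt++ {
		if err := run(); err != nil {
			return attempt, err
		}
	}
	return 0, nil
}

// failOn returns a stand-in for the reproducer that fails on its n-th call.
// It exists only so the sketch runs anywhere, not just on macOS arm64.
func failOn(n int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls == n {
			return errors.New("simulated crash")
		}
		return nil
	}
}

func main() {
	attempt, err := runsUntilFailure(1000, failOn(4))
	fmt.Printf("first failure on attempt %d: %v\n", attempt, err)
}
```

To drive the real reproducer, run could wrap exec.Command("./sensors_bug").Run() from os/exec, which returns a non-nil error when the process exits with a non-zero status.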

@PierreF
Contributor Author

PierreF commented Apr 7, 2025

In my tests, I usually get it in fewer than 10 tries :/
It probably means the race condition isn't linked only to calling sensors concurrently, and might even depend on something running elsewhere (another process on the system? other goroutines or the GC?).

If I can find some time, I'll try to come up with a more realistic way to reproduce it. In real usage I don't call sensors concurrently (only concurrently with disk/cpu/mem), and I do hit the bug "fast" (within a few hundred calls to sensors, i.e. about 1 hour with one call to sensors every 10 seconds).
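Until the root cause is known, one caller-side experiment would be to route every collector call (sensors, disk, cpu, mem) through a single process-wide mutex so that none of them overlap. Whether that avoids the crash is untested; it assumes the crash needs two overlapping calls inside the same process. The sketch below uses hypothetical names (collectMu, serialized, maxOverlap) and a fake collector, and only demonstrates that the wrapper keeps calls from overlapping.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// collectMu serializes every collector call made through serialized.
var collectMu sync.Mutex

// serialized runs collect while holding collectMu, so two wrapped
// collectors never execute at the same time.
func serialized[T any](collect func() (T, error)) (T, error) {
	collectMu.Lock()
	defer collectMu.Unlock()
	return collect()
}

// maxOverlap starts n goroutines that call a fake collector through
// serialized and returns the highest number seen running at once.
func maxOverlap(n int) int {
	var (
		wg     sync.WaitGroup
		active int // only touched while collectMu is held
		peak   int // only touched while collectMu is held
	)
	fake := func() (int, error) {
		active++
		if active > peak {
			peak = active
		}
		runtime.Gosched() // give other goroutines a chance to interleave
		active--
		return 0, nil
	}
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = serialized(fake)
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println("peak concurrent collectors:", maxOverlap(30))
}
```

In a real program the fake collector would be replaced by the actual calls, for example serialized(sensors.SensorsTemperatures), at the cost of losing parallelism between collectors.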

@Lomanic changed the title from "Occational crash when using sensors on MacOS arm64" to "Occasional crash when using sensors on MacOS arm64" on Apr 16, 2025