Description
Under certain conditions, we see a reproducible crash of the Python process:
panic: pthread_mutex_lock: Invalid argument
This message is indicative of a BUG.
Report this at https://github.com/nanomsg/nng/issues
/home/vsts/work/1/lib/python3.8/site-packages/pynng/_nng.abi3.so(nni_panic+0x112) [0x7fc452152762]
/home/vsts/work/1/lib/python3.8/site-packages/pynng/_nng.abi3.so(nni_aio_fini+0x20) [0x7fc45214bfe0]
/home/vsts/work/1/lib/python3.8/site-packages/pynng/_nng.abi3.so(nni_aio_free+0x1b) [0x7fc45214c11b]
/home/vsts/work/1/lib/python3.8/site-packages/pynng/_nng.abi3.so(+0x4632f) [0x7fc4520fd32f]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x158783) [0x7fc4569ea783]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4adc) [0x7fc456a31fcc]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0xfa) [0x7fc4569c2dfa]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x7b8) [0x7fc456a2dca8]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x301) [0x7fc456a2c8d1]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x18e) [0x7fc4569c2e8e]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x1337b3) [0x7fc4569c57b3]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x27c4) [0x7fc456a2fcb4]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x13770b) [0x7fc4569c970b]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x1e8624) [0x7fc456a7a624]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x1e8d67) [0x7fc456a7ad67]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x1e867e) [0x7fc456a7a67e]
My investigation showed that the crash is caused by an access to nni_aio_lk after it was already deinitialized by nng_fini / nni_fini / nni_aio_sys_fini. nni_aio_sys_fini is triggered by _pynng_atexit, and nni_aio_free by AIOHelper.__del__.
This looks like a timing issue during Python shutdown: depending on the order in which atexit functions are called and objects are garbage collected, nng may already have been deinitialized while pynng is still calling into the nng library.
It is currently unclear how best to address this issue. Any ideas?
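One direction that might be worth discussing is to have the finalizer check whether the library is still initialized before calling into it. The sketch below only illustrates that idea with stand-in classes: FakeLib and GuardedHelper are hypothetical names, not pynng or nng APIs, and the `initialized` flag is an assumption about state that pynng would have to track itself.

```python
class FakeLib:
    """Hypothetical stand-in for a native library with global state."""

    def __init__(self):
        self.initialized = True
        self.calls = []  # records the order of calls for inspection

    def fini(self):
        # Plays the role of the atexit hook that tears down global state.
        self.initialized = False
        self.calls.append("fini")

    def free(self, handle):
        # A real library might abort here; the stand-in just raises.
        if not self.initialized:
            raise RuntimeError("free() called after fini()")
        self.calls.append(f"free:{handle}")


class GuardedHelper:
    """Hypothetical wrapper whose finalizer skips cleanup after fini()."""

    def __init__(self, lib, handle):
        self._lib = lib
        self._handle = handle

    def __del__(self):
        # Only call into the library while its global state is still alive.
        if self._lib.initialized:
            self._lib.free(self._handle)


lib = FakeLib()
early = GuardedHelper(lib, 1)
late = GuardedHelper(lib, 2)

del early      # finalized before teardown: free() is called
lib.fini()     # simulated atexit hook
del late       # finalized after teardown: free() is skipped

print(lib.calls)
```

Under CPython's reference counting, `del` runs the finalizer immediately, so the recorded calls show `free` only for the helper destroyed before `fini`. Whether skipping the free at shutdown is acceptable for nng (the process is exiting anyway) is an open question, not something this sketch settles.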
(For background: this occurs reproducibly in our CI when using pynng 0.7.1, asyncio and pytest, but we have also seen it occasionally on developer machines when stopping the application.)