NNG Panic on Python shutdown #108

Open
@broglep-work

Under certain conditions, we experience a reproducible crash of the Python process:

panic: pthread_mutex_lock: Invalid argument
This message is indicative of a BUG.
Report this at https://github.com/nanomsg/nng/issues
/home/vsts/work/1/lib/python3.8/site-packages/pynng/_nng.abi3.so(nni_panic+0x112) [0x7fc452152762]
/home/vsts/work/1/lib/python3.8/site-packages/pynng/_nng.abi3.so(nni_aio_fini+0x20) [0x7fc45214bfe0]
/home/vsts/work/1/lib/python3.8/site-packages/pynng/_nng.abi3.so(nni_aio_free+0x1b) [0x7fc45214c11b]
/home/vsts/work/1/lib/python3.8/site-packages/pynng/_nng.abi3.so(+0x4632f) [0x7fc4520fd32f]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x158783) [0x7fc4569ea783]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4adc) [0x7fc456a31fcc]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0xfa) [0x7fc4569c2dfa]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x7b8) [0x7fc456a2dca8]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x301) [0x7fc456a2c8d1]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x18e) [0x7fc4569c2e8e]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x1337b3) [0x7fc4569c57b3]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x27c4) [0x7fc456a2fcb4]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x13770b) [0x7fc4569c970b]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x1e8624) [0x7fc456a7a624]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x1e8d67) [0x7fc456a7ad67]
/opt/hostedtoolcache/Python/3.8.13/x64/lib/libpython3.8.so.1.0(+0x1e867e) [0x7fc456a7a67e]

My investigation showed that the panic is caused by an access to nni_aio_lk after it has been deinitialized by nng_fini / nni_fini / nni_aio_sys_fini. nni_aio_sys_fini is triggered by _pynng_atexit, and nni_aio_free by AIOHelper.__del__.

This looks like a timing issue on Python shutdown: depending on the order in which atexit functions are called and objects are garbage collected, nng may already have been deinitialized while pynng is still calling into the nng library.
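To illustrate the suspected ordering problem, here is a minimal sketch in pure Python. It does not use pynng or nng at all: FakeLib, GuardedHelper, and demo are hypothetical names invented for this example, and the "initialized" flag is only one possible guard, not something pynng is known to implement. It assumes CPython, where `del` on the last reference runs `__del__` immediately.

```python
class FakeLib:
    """Hypothetical stand-in for the native library (not pynng's real API)."""

    def __init__(self):
        self.initialized = True
        self.freed = []    # handles released while the library was alive
        self.skipped = []  # handles whose release was skipped after fini

    def fini(self):
        # Plays the role of nng_fini: global state (locks etc.) is gone afterwards.
        self.initialized = False

    def aio_free(self, handle):
        # Plays the role of nni_aio_free: touching global state after fini is a bug.
        if not self.initialized:
            raise RuntimeError("use after fini")
        self.freed.append(handle)


class GuardedHelper:
    """Hypothetical analogue of AIOHelper that checks the library state first."""

    def __init__(self, lib, handle):
        self._lib = lib
        self._handle = handle

    def __del__(self):
        if self._lib.initialized:
            self._lib.aio_free(self._handle)
        else:
            # The library is already torn down, so there is nothing safe to call.
            self._lib.skipped.append(self._handle)


def demo():
    lib = FakeLib()
    early = GuardedHelper(lib, "aio-1")
    late = GuardedHelper(lib, "aio-2")
    del early   # finalized before teardown: freed normally
    lib.fini()  # simulates the atexit hook running first
    del late    # finalized after teardown: the guard skips the free
    return lib.freed, lib.skipped
```

In this sketch, the second helper is finalized after the simulated teardown, which is the ordering that appears to trigger the panic. Whether a similar guard is feasible in pynng, or whether the atexit hook should be changed instead, is an open question.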

It is currently unclear how best to address this issue. Any ideas?

(For background: this occurs reproducibly in our CI when using pynng 0.7.1 with asyncio and pytest, but we have also seen it from time to time on developer machines when stopping our application.)
