Skip to content

refactor: EventLoop locking cleanups + client disconnect exception #160

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jun 19, 2025

Conversation

ryanofsky
Copy link
Collaborator

@ryanofsky ryanofsky commented Feb 10, 2025

This PR was originally written as a code cleanup to follow up on review comments in earlier PRs. Several other commits were added afterward building on the earlier changes, and necessary for the bugfixes in bitcoin/bitcoin#32345, which resolves issues #123, #176 and #182.

Summary of changes:

  • The commit "Improve IPC client disconnected exceptions" is the only external change, and lets clients detect when they are trying to call a remote method after a connection has been closed. This change is used by
    bitcoin/bitcoin#32345 commit "ipc: Handle bitcoin-wallet disconnection" to fix #123.
  • The commits "Prevent EventLoop async cleanup thread early exit during shutdown" and "Prevent IPC server crash if disconnected during IPC call" fix issues reported in #182.
  • New tests are added in other commits to verify these fixes, covering different kinds of disconnections.
  • The change to detect disconnects also relies on an earlier "Add ProxyContext EventLoop* member" commit.
  • An "Add EventLoopRef RAII class" commit simplifies a recent bugfix in #159 and also relies on the earlier "EventLoop* member" commit.
  • The "Remove DestructorCatcher and AsyncCallable" commit follows up on #144 (comment).
  • The "Add clang thread safety annotations" commit follows up on #129 (comment).

@ryanofsky
Copy link
Collaborator Author

Because this change removes EventLoop addClient and removeClient methods it will require an update to Bitcoin core if it is merged.

--- a/src/ipc/capnp/protocol.cpp
+++ b/src/ipc/capnp/protocol.cpp
@@ -41,10 +41,7 @@ class CapnpProtocol : public Protocol
 public:
     ~CapnpProtocol() noexcept(true)
     {
-        if (m_loop) {
-            std::unique_lock<std::mutex> lock(m_loop->m_mutex);
-            m_loop->removeClient(lock);
-        }
+        m_loop_ref.reset();
         if (m_loop_thread.joinable()) m_loop_thread.join();
         assert(!m_loop);
     };
@@ -83,10 +80,7 @@ public:
         m_loop_thread = std::thread([&] {
             util::ThreadRename("capnp-loop");
             m_loop.emplace(exe_name, &IpcLogFn, &m_context);
-            {
-                std::unique_lock<std::mutex> lock(m_loop->m_mutex);
-                m_loop->addClient(lock);
-            }
+            m_loop_ref.emplace(*m_loop);
             promise.set_value();
             m_loop->loop();
             m_loop.reset();
@@ -96,6 +90,7 @@ public:
     Context m_context;
     std::thread m_loop_thread;
     std::optional<mp::EventLoop> m_loop;
+    std::optional<mp::EventLoopRef> m_loop_ref;
 };
 } // namespace

@ryanofsky ryanofsky changed the title refactor: EventLoop locking cleanups refactor: EventLoop locking cleanups and client Disconnect errors Apr 24, 2025
@ryanofsky ryanofsky changed the title refactor: EventLoop locking cleanups and client Disconnect errors refactor: EventLoop locking cleanups + client disconnect exceptions Apr 24, 2025
@ryanofsky ryanofsky changed the title refactor: EventLoop locking cleanups + client disconnect exceptions refactor: EventLoop locking cleanups + client disconnect exception Apr 24, 2025
@ryanofsky
Copy link
Collaborator Author

ryanofsky commented Apr 24, 2025

Updated ce4814f -> b47ea9f (pr/eventlock.1 -> pr/eventlock.2, compare), making test and comment fixes, splitting commits, and adding connection disconnects tests and exceptions to help resolve #123

ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Apr 24, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Apr 24, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client flush()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the flush() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
@ryanofsky
Copy link
Collaborator Author

Rebased b47ea9f -> f15ef6c (pr/eventlock.2 -> pr/eventlock.3, compare) so this can be merged cleanly in bitcoin PR bitcoin/bitcoin#32345. Also dropped DisconnectError exception type so original ipc::Exception can be used instead.

ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Apr 25, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Apr 25, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client flush()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the flush() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Apr 28, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client flush()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the flush() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Apr 28, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client flush()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the flush() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
@ryanofsky
Copy link
Collaborator Author

Rebased f15ef6c -> 2214358 (pr/eventlock.3 -> pr/eventlock.4, compare) to fix conflict with #165

@ryanofsky
Copy link
Collaborator Author

Rebased 2214358 -> 6715a96 (pr/eventlock.4 -> pr/eventlock.5, compare) to fix conflicts with #172

ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jun 5, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jun 5, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client flush()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the flush() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jun 5, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jun 5, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client flush()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the flush() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jun 5, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client flush()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the flush() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jun 6, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client flush()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the flush() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
@ryanofsky
Copy link
Collaborator Author

Rewrote the PR description since this is now a dependency of bitcoin/bitcoin#32345 and needed to fix bugs #123 and #176

ryanofsky added a commit to ryanofsky/libmultiprocess that referenced this pull request Jul 1, 2025
This reverts commit 196e6fc from
bitcoin-core#118 which added a
workaround which is no longer needed after
315ff53 from
bitcoin-core#160

When the workaround was introduced it prevented segfaults that happened when an
IPC client disconnected during a long-running asynchronous call by leaking
server objects instead of crashing. These leaks could sometimes cause the new
"disconnecting and blocking" unit test introduced in bitcoin-core#160 to hang, and since
the workaround is no longer necessary, the fix is to revert it.

The problem with the hanged test was reported:

- Sjors/bitcoin#90 (comment)
- https://github.com/Sjors/bitcoin/actions/runs/15966265407/job/45027248310?pr=90
- Sjors/bitcoin#90 (comment)
- https://cirrus-ci.com/task/4999408900636672

And the fix was posted:

- Sjors/bitcoin#90 (comment)
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client stop()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the stop() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
…adc9ae

725beeadc9ae doc: fix DrahtBot LLM Linter error
7455fc0e3781 type-context: revert client disconnect workaround
47f052c3ed94 proxy-types: fix UndefinedBehaviorSanitizer: null-pointer-use
f6bd4385c057 mptest: fix MemorySanitizer: use-of-uninitialized-value
8218a1d8760d proxy-io: fix race conditions in disconnect callback code
88695fc05303 proxy-io: fix race conditions in ProxyClientBase cleanup handler
63bd2a201dc2 doc: Add note about Waiter::m_mutex and interaction with the EventLoop::m_mutex
81d58f5580e8 refactor: Rename ProxyClient cleanup_it variable
07230f259f55 refactor: rename ProxyClient<Thread>::m_cleanup_it
0d986ff144cd mptest: fix race condition in TestSetup constructor
d2f6aa2e84ef ci: add thread sanitizer job
c0efaa5e8cb1 Merge bitcoin-core/libmultiprocess#187: ci: have bash scripts explicitly opt out of locale dependence.
3a6db38e561f ci: rename configs to .bash
401e0ce1d9c3 ci: add copyright to bash scripts
e956467ae464 ci: export LC_ALL
8954cc0377d8 Merge bitcoin-core/libmultiprocess#184: Add CI jobs and fix clang-tidy and iwyu errors
757e13a75546 ci: add gnu32 cross-compiled 32-bit build
15bf349000eb doc: fix typo found by DrahtBot
1a598d5905f7 clang-tidy: drop 'bitcoin-*' check
cbb1e43fdc6e ci: test libc++ instead of libstdc++ in one job
76313450c2c4 type-context: disable clang-tidy UndefinedBinaryOperatorResult error
4896e7fe51ba proxy-types: fix clang-tidy EnumCastOutOfRange error
060a73926956 proxy-types: fix clang-tidy StackAddressEscape error
977d721020f6 ci: add github actions jobs testing gcc, clang-20, clang-tidy, and iwyu
0d5f1faae5da iwyu: fix add/remove include errors
753d2b10cc27 util: fix clang-tidy modernize-use-equals-default error
ae4f1dc2bb1a type-number: fix clang-tidy modernize-use-nullptr error
07a741bf6946 proxy-types: fix clang-tidy bugprone-use-after-move error
3673114bc9d9 proxy-types: fix clang-tidy bugprone-use-after-move error
422923f38485 proxy-types: fix clang-tidy bugprone-use-after-move error
c6784c6adefa mpgen: disable clang-tidy misc-no-recursion error
c5498aa11ba6 tidy: copy clang-tidy file from bitcoin core
258a617c1eec Merge bitcoin-core/libmultiprocess#160: refactor: EventLoop locking cleanups + client disconnect exception
84cf56a0b5f4 test: Test disconnects during IPC calls
949573da8411 Prevent IPC server crash if disconnected during IPC call
019839758085 Merge bitcoin-core/libmultiprocess#179: scripted-diff: Remove copyright year (ranges)
ea38392960e1 Prevent EventLoop async cleanup thread early exit during shutdown
616d9a75d20a doc: Document ProxyClientBase destroy_connection option
56fff76f940b Improve IPC client disconnected exceptions
9b8ed3dc5f87 refactor: Add clang thread safety annotations to EventLoop
52256e730f51 refactor: Remove DestructorCatcher and AsyncCallable
f24894794adf refactor: Drop addClient/removeClient methods
2b830e558e61 refactor: Use EventLoopRef instead of addClient/removeClient
315ff537fb65 refactor: Add ProxyContext EventLoop* member
9aaeec3678d3 proxy-io.h: Add EventLoopRef RAII class handle addClient/removeClient refcounting
f58c8d8ba2f0 proxy-io.h: Add more detailed EventLoop comment
5108445e5d16 test: Add test coverage for client & server disconnections
59030c68cb5f Merge bitcoin-core/libmultiprocess#181: type-function.h: Fix CustomBuildField overload
688140b1dffc test: Add coverage for type-function.h
8b96229da58e type-function.h: Fix CustomBuildField overload
fa2ff9a66842 scripted-diff: Remove copyright year (ranges)

git-subtree-dir: src/ipc/libmultiprocess
git-subtree-split: 725beeadc9ae30664aa3ed12a9ac2b77a9d8d4e0
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client stop()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the stop() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/libmultiprocess that referenced this pull request Jul 1, 2025
This reverts commit 196e6fc from
bitcoin-core#118 which added a
workaround which is no longer needed after
315ff53 from
bitcoin-core#160

When the workaround was introduced it prevented segfaults that happened when an
IPC client disconnected during a long-running asynchronous call. But it
prevented them by leaking server objects, and these leaks now sometimes cause
the new "disconnecting and blocking" unit test introduced in bitcoin-core#160 to hang.
Since the workaround is no longer necessary, revert it now to fix the test
hangs.

The problem with test hangs was reported:

- Sjors/bitcoin#90 (comment)
- https://github.com/Sjors/bitcoin/actions/runs/15966265407/job/45027248310?pr=90
- Sjors/bitcoin#90 (comment)
- https://cirrus-ci.com/task/4999408900636672

And the fix was posted:

- Sjors/bitcoin#90 (comment)
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
…583f2b

6f340a583f2b doc: fix DrahtBot LLM Linter error
c6f7fdf17350 type-context: revert client disconnect workaround
e09143d2ea2f proxy-types: fix UndefinedBehaviorSanitizer: null-pointer-use
84b292fcc4db mptest: fix MemorySanitizer: use-of-uninitialized-value
fe4a188803c6 proxy-io: fix race conditions in disconnect callback code
d8011c83608e proxy-io: fix race conditions in ProxyClientBase cleanup handler
97e82ce19c47 doc: Add note about Waiter::m_mutex and interaction with the EventLoop::m_mutex
81d58f5580e8 refactor: Rename ProxyClient cleanup_it variable
07230f259f55 refactor: rename ProxyClient<Thread>::m_cleanup_it
0d986ff144cd mptest: fix race condition in TestSetup constructor
d2f6aa2e84ef ci: add thread sanitizer job
c0efaa5e8cb1 Merge bitcoin-core/libmultiprocess#187: ci: have bash scripts explicitly opt out of locale dependence.
3a6db38e561f ci: rename configs to .bash
401e0ce1d9c3 ci: add copyright to bash scripts
e956467ae464 ci: export LC_ALL
8954cc0377d8 Merge bitcoin-core/libmultiprocess#184: Add CI jobs and fix clang-tidy and iwyu errors
757e13a75546 ci: add gnu32 cross-compiled 32-bit build
15bf349000eb doc: fix typo found by DrahtBot
1a598d5905f7 clang-tidy: drop 'bitcoin-*' check
cbb1e43fdc6e ci: test libc++ instead of libstdc++ in one job
76313450c2c4 type-context: disable clang-tidy UndefinedBinaryOperatorResult error
4896e7fe51ba proxy-types: fix clang-tidy EnumCastOutOfRange error
060a73926956 proxy-types: fix clang-tidy StackAddressEscape error
977d721020f6 ci: add github actions jobs testing gcc, clang-20, clang-tidy, and iwyu
0d5f1faae5da iwyu: fix add/remove include errors
753d2b10cc27 util: fix clang-tidy modernize-use-equals-default error
ae4f1dc2bb1a type-number: fix clang-tidy modernize-use-nullptr error
07a741bf6946 proxy-types: fix clang-tidy bugprone-use-after-move error
3673114bc9d9 proxy-types: fix clang-tidy bugprone-use-after-move error
422923f38485 proxy-types: fix clang-tidy bugprone-use-after-move error
c6784c6adefa mpgen: disable clang-tidy misc-no-recursion error
c5498aa11ba6 tidy: copy clang-tidy file from bitcoin core
258a617c1eec Merge bitcoin-core/libmultiprocess#160: refactor: EventLoop locking cleanups + client disconnect exception
84cf56a0b5f4 test: Test disconnects during IPC calls
949573da8411 Prevent IPC server crash if disconnected during IPC call
019839758085 Merge bitcoin-core/libmultiprocess#179: scripted-diff: Remove copyright year (ranges)
ea38392960e1 Prevent EventLoop async cleanup thread early exit during shutdown
616d9a75d20a doc: Document ProxyClientBase destroy_connection option
56fff76f940b Improve IPC client disconnected exceptions
9b8ed3dc5f87 refactor: Add clang thread safety annotations to EventLoop
52256e730f51 refactor: Remove DestructorCatcher and AsyncCallable
f24894794adf refactor: Drop addClient/removeClient methods
2b830e558e61 refactor: Use EventLoopRef instead of addClient/removeClient
315ff537fb65 refactor: Add ProxyContext EventLoop* member
9aaeec3678d3 proxy-io.h: Add EventLoopRef RAII class handle addClient/removeClient refcounting
f58c8d8ba2f0 proxy-io.h: Add more detailed EventLoop comment
5108445e5d16 test: Add test coverage for client & server disconnections
59030c68cb5f Merge bitcoin-core/libmultiprocess#181: type-function.h: Fix CustomBuildField overload
688140b1dffc test: Add coverage for type-function.h
8b96229da58e type-function.h: Fix CustomBuildField overload
fa2ff9a66842 scripted-diff: Remove copyright year (ranges)

git-subtree-dir: src/ipc/libmultiprocess
git-subtree-split: 6f340a583f2b29b32866c289d5f4d7662b4fe90e
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client stop()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the stop() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client stop()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the stop() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client stop()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the stop() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client stop()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the stop() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit that referenced this pull request Jul 1, 2025
6f340a5 doc: fix DrahtBot LLM Linter error (Ryan Ofsky)
c6f7fdf type-context: revert client disconnect workaround (Ryan Ofsky)
e09143d proxy-types: fix UndefinedBehaviorSanitizer: null-pointer-use (Ryan Ofsky)
84b292f mptest: fix MemorySanitizer: use-of-uninitialized-value (Ryan Ofsky)
fe4a188 proxy-io: fix race conditions in disconnect callback code (Ryan Ofsky)
d8011c8 proxy-io: fix race conditions in ProxyClientBase cleanup handler (Ryan Ofsky)
97e82ce doc: Add note about Waiter::m_mutex and interaction with the EventLoop::m_mutex (Ryan Ofsky)
81d58f5 refactor: Rename ProxyClient cleanup_it variable (Ryan Ofsky)
07230f2 refactor: rename ProxyClient<Thread>::m_cleanup_it (Ryan Ofsky)
0d986ff mptest: fix race condition in TestSetup constructor (Ryan Ofsky)
d2f6aa2 ci: add thread sanitizer job (Ryan Ofsky)

Pull request description:

  Recently merged PR #160 expanded unit tests to cover various unclean disconnection scenarios, but the new unit tests cause failures in bitcoin CI, despite passing in local CI (which doesn't test as many sanitizers and platforms). Some of the errors are just test bugs, but others are real library bugs and race conditions.

  The bugs were reported in two threads starting Sjors/bitcoin#90 (comment) and bitcoin/bitcoin#32345 (comment), and they are described in detail in individual commit messages in this PR. The changes here fix all the known bugs and add new CI jobs and tests to detect them and catch regressions.

ACKs for top commit:
  Sjors:
    re-ACK 6f340a5

Tree-SHA512: 20aa1992080a0329739d663edb636f218e88d521b17cd66c328051629c8efea802c0ac52a44d51cd58cfe60cc6beb6cdd4a2afa00a0ce36801724540f9e43d42
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client stop()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the stop() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
…05c238

a11e6905c238 Merge bitcoin-core/libmultiprocess#186: Fix mptest failures in bitcoin CI
6f340a583f2b doc: fix DrahtBot LLM Linter error
c6f7fdf17350 type-context: revert client disconnect workaround
e09143d2ea2f proxy-types: fix UndefinedBehaviorSanitizer: null-pointer-use
84b292fcc4db mptest: fix MemorySanitizer: use-of-uninitialized-value
fe4a188803c6 proxy-io: fix race conditions in disconnect callback code
d8011c83608e proxy-io: fix race conditions in ProxyClientBase cleanup handler
97e82ce19c47 doc: Add note about Waiter::m_mutex and interaction with the EventLoop::m_mutex
81d58f5580e8 refactor: Rename ProxyClient cleanup_it variable
07230f259f55 refactor: rename ProxyClient<Thread>::m_cleanup_it
c0efaa5e8cb1 Merge bitcoin-core/libmultiprocess#187: ci: have bash scripts explicitly opt out of locale dependence.
0d986ff144cd mptest: fix race condition in TestSetup constructor
d2f6aa2e84ef ci: add thread sanitizer job
3a6db38e561f ci: rename configs to .bash
401e0ce1d9c3 ci: add copyright to bash scripts
e956467ae464 ci: export LC_ALL
8954cc0377d8 Merge bitcoin-core/libmultiprocess#184: Add CI jobs and fix clang-tidy and iwyu errors
757e13a75546 ci: add gnu32 cross-compiled 32-bit build
15bf349000eb doc: fix typo found by DrahtBot
1a598d5905f7 clang-tidy: drop 'bitcoin-*' check
cbb1e43fdc6e ci: test libc++ instead of libstdc++ in one job
76313450c2c4 type-context: disable clang-tidy UndefinedBinaryOperatorResult error
4896e7fe51ba proxy-types: fix clang-tidy EnumCastOutOfRange error
060a73926956 proxy-types: fix clang-tidy StackAddressEscape error
977d721020f6 ci: add github actions jobs testing gcc, clang-20, clang-tidy, and iwyu
0d5f1faae5da iwyu: fix add/remove include errors
753d2b10cc27 util: fix clang-tidy modernize-use-equals-default error
ae4f1dc2bb1a type-number: fix clang-tidy modernize-use-nullptr error
07a741bf6946 proxy-types: fix clang-tidy bugprone-use-after-move error
3673114bc9d9 proxy-types: fix clang-tidy bugprone-use-after-move error
422923f38485 proxy-types: fix clang-tidy bugprone-use-after-move error
c6784c6adefa mpgen: disable clang-tidy misc-no-recursion error
c5498aa11ba6 tidy: copy clang-tidy file from bitcoin core
258a617c1eec Merge bitcoin-core/libmultiprocess#160: refactor: EventLoop locking cleanups + client disconnect exception
84cf56a0b5f4 test: Test disconnects during IPC calls
949573da8411 Prevent IPC server crash if disconnected during IPC call
019839758085 Merge bitcoin-core/libmultiprocess#179: scripted-diff: Remove copyright year (ranges)
ea38392960e1 Prevent EventLoop async cleanup thread early exit during shutdown
616d9a75d20a doc: Document ProxyClientBase destroy_connection option
56fff76f940b Improve IPC client disconnected exceptions
9b8ed3dc5f87 refactor: Add clang thread safety annotations to EventLoop
52256e730f51 refactor: Remove DestructorCatcher and AsyncCallable
f24894794adf refactor: Drop addClient/removeClient methods
2b830e558e61 refactor: Use EventLoopRef instead of addClient/removeClient
315ff537fb65 refactor: Add ProxyContext EventLoop* member
9aaeec3678d3 proxy-io.h: Add EventLoopRef RAII class handle addClient/removeClient refcounting
f58c8d8ba2f0 proxy-io.h: Add more detailed EventLoop comment
5108445e5d16 test: Add test coverage for client & server disconnections
59030c68cb5f Merge bitcoin-core/libmultiprocess#181: type-function.h: Fix CustomBuildField overload
688140b1dffc test: Add coverage for type-function.h
8b96229da58e type-function.h: Fix CustomBuildField overload
fa2ff9a66842 scripted-diff: Remove copyright year (ranges)

git-subtree-dir: src/ipc/libmultiprocess
git-subtree-split: a11e6905c238dc35a8bbef995190296bc6329d49
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 1, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client stop()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the stop() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Aug 8, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Aug 8, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client stop()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the stop() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Aug 8, 2025
…34bad2

b4120d34bad2 Merge bitcoin-core/libmultiprocess#192: doc: fix typos
6ecbdcd35a93 doc: fix typos
a11e6905c238 Merge bitcoin-core/libmultiprocess#186: Fix mptest failures in bitcoin CI
6f340a583f2b doc: fix DrahtBot LLM Linter error
c6f7fdf17350 type-context: revert client disconnect workaround
e09143d2ea2f proxy-types: fix UndefinedBehaviorSanitizer: null-pointer-use
84b292fcc4db mptest: fix MemorySanitizer: use-of-uninitialized-value
fe4a188803c6 proxy-io: fix race conditions in disconnect callback code
d8011c83608e proxy-io: fix race conditions in ProxyClientBase cleanup handler
97e82ce19c47 doc: Add note about Waiter::m_mutex and interaction with the EventLoop::m_mutex
81d58f5580e8 refactor: Rename ProxyClient cleanup_it variable
07230f259f55 refactor: rename ProxyClient<Thread>::m_cleanup_it
c0efaa5e8cb1 Merge bitcoin-core/libmultiprocess#187: ci: have bash scripts explicitly opt out of locale dependence.
0d986ff144cd mptest: fix race condition in TestSetup constructor
d2f6aa2e84ef ci: add thread sanitizer job
3a6db38e561f ci: rename configs to .bash
401e0ce1d9c3 ci: add copyright to bash scripts
e956467ae464 ci: export LC_ALL
8954cc0377d8 Merge bitcoin-core/libmultiprocess#184: Add CI jobs and fix clang-tidy and iwyu errors
757e13a75546 ci: add gnu32 cross-compiled 32-bit build
15bf349000eb doc: fix typo found by DrahtBot
1a598d5905f7 clang-tidy: drop 'bitcoin-*' check
cbb1e43fdc6e ci: test libc++ instead of libstdc++ in one job
76313450c2c4 type-context: disable clang-tidy UndefinedBinaryOperatorResult error
4896e7fe51ba proxy-types: fix clang-tidy EnumCastOutOfRange error
060a73926956 proxy-types: fix clang-tidy StackAddressEscape error
977d721020f6 ci: add github actions jobs testing gcc, clang-20, clang-tidy, and iwyu
0d5f1faae5da iwyu: fix add/remove include errors
753d2b10cc27 util: fix clang-tidy modernize-use-equals-default error
ae4f1dc2bb1a type-number: fix clang-tidy modernize-use-nullptr error
07a741bf6946 proxy-types: fix clang-tidy bugprone-use-after-move error
3673114bc9d9 proxy-types: fix clang-tidy bugprone-use-after-move error
422923f38485 proxy-types: fix clang-tidy bugprone-use-after-move error
c6784c6adefa mpgen: disable clang-tidy misc-no-recursion error
c5498aa11ba6 tidy: copy clang-tidy file from bitcoin core
258a617c1eec Merge bitcoin-core/libmultiprocess#160: refactor: EventLoop locking cleanups + client disconnect exception
84cf56a0b5f4 test: Test disconnects during IPC calls
949573da8411 Prevent IPC server crash if disconnected during IPC call
019839758085 Merge bitcoin-core/libmultiprocess#179: scripted-diff: Remove copyright year (ranges)
ea38392960e1 Prevent EventLoop async cleanup thread early exit during shutdown
616d9a75d20a doc: Document ProxyClientBase destroy_connection option
56fff76f940b Improve IPC client disconnected exceptions
9b8ed3dc5f87 refactor: Add clang thread safety annotations to EventLoop
52256e730f51 refactor: Remove DestructorCatcher and AsyncCallable
f24894794adf refactor: Drop addClient/removeClient methods
2b830e558e61 refactor: Use EventLoopRef instead of addClient/removeClient
315ff537fb65 refactor: Add ProxyContext EventLoop* member
9aaeec3678d3 proxy-io.h: Add EventLoopRef RAII class handle addClient/removeClient refcounting
f58c8d8ba2f0 proxy-io.h: Add more detailed EventLoop comment
5108445e5d16 test: Add test coverage for client & server disconnections
59030c68cb5f Merge bitcoin-core/libmultiprocess#181: type-function.h: Fix CustomBuildField overload
688140b1dffc test: Add coverage for type-function.h
8b96229da58e type-function.h: Fix CustomBuildField overload
fa2ff9a66842 scripted-diff: Remove copyright year (ranges)

git-subtree-dir: src/ipc/libmultiprocess
git-subtree-split: b4120d34bad2de28141c5770f6e8df8e54898987
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Aug 8, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Aug 8, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shutdown cleanly after that because chain client stop()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the stop() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bitcoin-node segfaults when interrupted
4 participants