
Commit a5a7f5e

pythongh-124694: Add concurrent.futures.InterpreterPoolExecutor (pythongh-124548)
This is an implementation of InterpreterPoolExecutor that builds on ThreadPoolExecutor. (Note that this is not tied to PEP 734, which is strictly about adding a new stdlib module.)

Possible future improvements:

* support passing a script for the initializer or to submit()
* support passing (most) arbitrary functions without pickling
* support passing closures
* optionally exec functions against __main__ instead of their original module
1 parent a38fef4 commit a5a7f5e


12 files changed, +826 -38 lines


Doc/library/asyncio-dev.rst (+4 -2)

@@ -103,7 +103,8 @@ To handle signals the event loop must be
 run in the main thread.
 
 The :meth:`loop.run_in_executor` method can be used with a
-:class:`concurrent.futures.ThreadPoolExecutor` to execute
+:class:`concurrent.futures.ThreadPoolExecutor` or
+:class:`~concurrent.futures.InterpreterPoolExecutor` to execute
 blocking code in a different OS thread without blocking the OS thread
 that the event loop runs in.
 

@@ -128,7 +129,8 @@ if a function performs a CPU-intensive calculation for 1 second,
 all concurrent asyncio Tasks and IO operations would be delayed
 by 1 second.
 
-An executor can be used to run a task in a different thread or even in
+An executor can be used to run a task in a different thread,
+including in a different interpreter, or even in
 a different process to avoid blocking the OS thread with the
 event loop. See the :meth:`loop.run_in_executor` method for more
 details.
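For reference, a minimal self-contained sketch of the pattern this hunk documents (illustrative, not part of the patch; assumes a build where InterpreterPoolExecutor is available). math.factorial stands in for an arbitrary CPU-bound callable because, per the docs added below, callables defined in __main__ cannot be pickled into a worker interpreter.

import asyncio
import concurrent.futures
import math

async def main():
    loop = asyncio.get_running_loop()
    # Hand a CPU-bound call to a worker interpreter; each worker has its
    # own GIL, so the call does not stall the event loop's OS thread.
    with concurrent.futures.InterpreterPoolExecutor() as pool:
        result = await loop.run_in_executor(pool, math.factorial, 100)
        print('factorial(100) has', len(str(result)), 'digits')

asyncio.run(main())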

Doc/library/asyncio-eventloop.rst (+8 -1)

@@ -1305,6 +1305,12 @@ Executing code in thread or process pools
             pool, cpu_bound)
         print('custom process pool', result)
 
+    # 4. Run in a custom interpreter pool:
+    with concurrent.futures.InterpreterPoolExecutor() as pool:
+        result = await loop.run_in_executor(
+            pool, cpu_bound)
+        print('custom interpreter pool', result)
+
 if __name__ == '__main__':
     asyncio.run(main())
 

@@ -1329,7 +1335,8 @@ Executing code in thread or process pools
 
    Set *executor* as the default executor used by :meth:`run_in_executor`.
    *executor* must be an instance of
-   :class:`~concurrent.futures.ThreadPoolExecutor`.
+   :class:`~concurrent.futures.ThreadPoolExecutor`, which includes
+   :class:`~concurrent.futures.InterpreterPoolExecutor`.
 
    .. versionchanged:: 3.11
       *executor* must be an instance of
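Because :class:`InterpreterPoolExecutor` subclasses :class:`ThreadPoolExecutor`, it also satisfies the set_default_executor requirement documented in the second hunk. A small illustrative sketch (not from the patch):

import asyncio
import concurrent.futures
import math

async def main():
    loop = asyncio.get_running_loop()
    # An InterpreterPoolExecutor passes the ThreadPoolExecutor instance check,
    # so run_in_executor(None, ...) then dispatches to worker interpreters.
    loop.set_default_executor(concurrent.futures.InterpreterPoolExecutor())
    result = await loop.run_in_executor(None, math.factorial, 10)
    print(result)  # 3628800

asyncio.run(main())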

Doc/library/asyncio-llapi-index.rst (+1 -1)

@@ -96,7 +96,7 @@ See also the main documentation section about the
      - Invoke a callback *at* the given time.
 
 
-.. rubric:: Thread/Process Pool
+.. rubric:: Thread/Interpreter/Process Pool
 .. list-table::
    :widths: 50 50
    :class: full-width-table

Doc/library/concurrent.futures.rst (+131 -4)

@@ -15,9 +15,10 @@ The :mod:`concurrent.futures` module provides a high-level interface for
 asynchronously executing callables.
 
 The asynchronous execution can be performed with threads, using
-:class:`ThreadPoolExecutor`, or separate processes, using
-:class:`ProcessPoolExecutor`. Both implement the same interface, which is
-defined by the abstract :class:`Executor` class.
+:class:`ThreadPoolExecutor` or :class:`InterpreterPoolExecutor`,
+or separate processes, using :class:`ProcessPoolExecutor`.
+Each implements the same interface, which is defined
+by the abstract :class:`Executor` class.
 
 .. include:: ../includes/wasm-notavail.rst
 

@@ -63,7 +64,8 @@ Executor Objects
       setting *chunksize* to a positive integer. For very long iterables,
       using a large value for *chunksize* can significantly improve
       performance compared to the default size of 1. With
-      :class:`ThreadPoolExecutor`, *chunksize* has no effect.
+      :class:`ThreadPoolExecutor` and :class:`InterpreterPoolExecutor`,
+      *chunksize* has no effect.
 
       .. versionchanged:: 3.5
          Added the *chunksize* argument.

@@ -227,6 +229,111 @@ ThreadPoolExecutor Example
                print('%r page is %d bytes' % (url, len(data)))
 
 
+InterpreterPoolExecutor
+-----------------------
+
+The :class:`InterpreterPoolExecutor` class uses a pool of interpreters
+to execute calls asynchronously. It is a :class:`ThreadPoolExecutor`
+subclass, which means each worker runs in its own thread.
+The difference here is that each worker has its own interpreter,
+and runs each task using that interpreter.
+
+The biggest benefit to using interpreters instead of only threads
+is true multi-core parallelism. Each interpreter has its own
+:term:`Global Interpreter Lock <global interpreter lock>`, so code
+running in one interpreter can run on one CPU core, while code in
+another interpreter runs unblocked on a different core.
+
+The tradeoff is that writing concurrent code for use with multiple
+interpreters can take extra effort. However, this is because it
+forces you to be deliberate about how and when interpreters interact,
+and to be explicit about what data is shared between interpreters.
+This results in several benefits that help balance the extra effort,
+in addition to the true multi-core parallelism already noted. For
+example, code written this way can make it easier to reason about
+concurrency. Another major benefit is that you don't have to deal
+with several of the big pain points of using threads, like race
+conditions.
+
+Each worker's interpreter is isolated from all the other interpreters.
+"Isolated" means each interpreter has its own runtime state and
+operates completely independently. For example, if you redirect
+:data:`sys.stdout` in one interpreter, it will not be automatically
+redirected in any other interpreter. If you import a module in one
+interpreter, it is not automatically imported in any other. You
+would need to import the module separately in each interpreter where
+you need it. In fact, each module imported in an interpreter is
+a completely separate object from the same module in a different
+interpreter, including :mod:`sys`, :mod:`builtins`,
+and even ``__main__``.
+
+Isolation means a mutable object, or other data, cannot be used
+by more than one interpreter at the same time. That effectively means
+interpreters cannot actually share such objects or data. Instead,
+each interpreter must have its own copy, and you will have to
+synchronize any changes between the copies manually. Immutable
+objects and data, like the builtin singletons, strings, and tuples
+of immutable objects, don't have these limitations.
+
+Communicating and synchronizing between interpreters is most effectively
+done using dedicated tools, like those proposed in :pep:`734`. One less
+efficient alternative is to serialize with :mod:`pickle` and then send
+the bytes over a shared :mod:`socket <socket>` or
+:func:`pipe <os.pipe>`.
+
+.. class:: InterpreterPoolExecutor(max_workers=None, thread_name_prefix='', initializer=None, initargs=(), shared=None)
+
+   A :class:`ThreadPoolExecutor` subclass that executes calls asynchronously
+   using a pool of at most *max_workers* threads. Each thread runs
+   tasks in its own interpreter. The worker interpreters are isolated
+   from each other, which means each has its own runtime state and that
+   they can't share any mutable objects or other data. Each interpreter
+   has its own :term:`Global Interpreter Lock <global interpreter lock>`,
+   which means code run with this executor has true multi-core parallelism.
+
+   The optional *initializer* and *initargs* arguments have the same
+   meaning as for :class:`!ThreadPoolExecutor`: the initializer is run
+   when each worker is created, though in this case it is run in
+   the worker's interpreter. The executor serializes the *initializer*
+   and *initargs* using :mod:`pickle` when sending them to the worker's
+   interpreter.
+
+   .. note::
+      Functions defined in the ``__main__`` module cannot be pickled
+      and thus cannot be used.
+
+   .. note::
+      The executor may replace uncaught exceptions from *initializer*
+      with :class:`~concurrent.futures.interpreter.ExecutionFailed`.
+
+   The optional *shared* argument is a :class:`dict` of objects that all
+   interpreters in the pool share. The *shared* items are added to each
+   interpreter's ``__main__`` module. Not all objects are shareable.
+   Shareable objects include the builtin singletons, :class:`str`
+   and :class:`bytes`, and :class:`memoryview`. See :pep:`734`
+   for more info.
+
+   Other caveats from parent :class:`ThreadPoolExecutor` apply here.
+
+   :meth:`~Executor.submit` and :meth:`~Executor.map` work like normal,
+   except the worker serializes the callable and arguments using
+   :mod:`pickle` when sending them to its interpreter. The worker
+   likewise serializes the return value when sending it back.
+
+   .. note::
+      Functions defined in the ``__main__`` module cannot be pickled
+      and thus cannot be used.
+
+   When a worker's current task raises an uncaught exception, the worker
+   always tries to preserve the exception as-is. If that is successful
+   then it also sets the ``__cause__`` to a corresponding
+   :class:`~concurrent.futures.interpreter.ExecutionFailed`
+   instance, which contains a summary of the original exception.
+   In the uncommon case that the worker is not able to preserve the
+   original as-is then it directly preserves the corresponding
+   :class:`~concurrent.futures.interpreter.ExecutionFailed`
+   instance instead.
+
+
 ProcessPoolExecutor
 -------------------
 
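A brief usage sketch for the class documented in the hunk above (illustrative, not from the patch; assumes a build with this commit applied). math.factorial is chosen only because the callable must be picklable and not defined in __main__.

import math
from concurrent.futures import InterpreterPoolExecutor

# Each worker thread runs tasks in its own interpreter, each with its own
# GIL, so CPU-bound tasks can run on separate cores within one process.
with InterpreterPoolExecutor(max_workers=4) as executor:
    # submit() pickles the callable and its arguments into the worker's
    # interpreter; the return value is pickled on the way back.
    future = executor.submit(math.factorial, 10)
    print(future.result())  # 3628800

    # map() works as it does for ThreadPoolExecutor (chunksize has no effect).
    print(list(executor.map(math.factorial, range(6))))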

@@ -574,6 +681,26 @@ Exception classes
 
    .. versionadded:: 3.7
 
+.. currentmodule:: concurrent.futures.interpreter
+
+.. exception:: BrokenInterpreterPool
+
+   Derived from :exc:`~concurrent.futures.thread.BrokenThreadPool`,
+   this exception class is raised when one of the workers
+   of a :class:`~concurrent.futures.InterpreterPoolExecutor`
+   has failed initializing.
+
+   .. versionadded:: next
+
+.. exception:: ExecutionFailed
+
+   Raised from :class:`~concurrent.futures.InterpreterPoolExecutor` when
+   the given initializer fails or from
+   :meth:`~concurrent.futures.Executor.submit` when there's an uncaught
+   exception from the submitted task.
+
+   .. versionadded:: next
+
 .. currentmodule:: concurrent.futures.process
 
 .. exception:: BrokenProcessPool

Doc/whatsnew/3.14.rst (+8)

@@ -225,6 +225,14 @@ ast
 * The ``repr()`` output for AST nodes now includes more information.
   (Contributed by Tomas R in :gh:`116022`.)
 
+concurrent.futures
+------------------
+
+* Add :class:`~concurrent.futures.InterpreterPoolExecutor`,
+  which exposes "subinterpreters" (multiple Python interpreters in the
+  same process) to Python code. This is separate from the proposed API
+  in :pep:`734`.
+  (Contributed by Eric Snow in :gh:`124548`.)
 
 ctypes
 ------

Lib/concurrent/futures/__init__.py (+11 -1)

@@ -29,6 +29,7 @@
     'Executor',
     'wait',
     'as_completed',
+    'InterpreterPoolExecutor',
     'ProcessPoolExecutor',
     'ThreadPoolExecutor',
 )

@@ -39,7 +40,7 @@ def __dir__():
 
 
 def __getattr__(name):
-    global ProcessPoolExecutor, ThreadPoolExecutor
+    global ProcessPoolExecutor, ThreadPoolExecutor, InterpreterPoolExecutor
 
     if name == 'ProcessPoolExecutor':
         from .process import ProcessPoolExecutor as pe

@@ -51,4 +52,13 @@ def __getattr__(name):
         ThreadPoolExecutor = te
         return te
 
+    if name == 'InterpreterPoolExecutor':
+        try:
+            from .interpreter import InterpreterPoolExecutor as ie
+        except ModuleNotFoundError:
+            ie = InterpreterPoolExecutor = None
+        else:
+            InterpreterPoolExecutor = ie
+        return ie
+
     raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
