python · markshannon · Aug 20, 2021 · Aug 20, 2021 · Sep 13, 2021 · Sep 13, 2021
@@ -0,0 +1,318 @@
+PEP: 669
+Title: Low Impact Instrumentation and Monitoring for CPython
+Author: Mark Shannon <[email protected]>
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 18-Aug-2021
+Post-History: 13-Sep-2021
+
+
+Abstract
+========
+
+Using a profiler or debugger in CPython can have a severe impact on
+performance. Slowdowns by an order of magnitude are not uncommon.
+It does not have to be this bad.
+
+This PEP proposes an API for instrumentation and monitoring of Python
+programs running on CPython that will enable the insertion of instrumentation
+and monitoring at low cost.
+
+Using the new API, code run under a debugger on 3.11 should easily outperform
+code run without a debugger on 3.10.
+
+Profiling will still slow down execution, but by much less than in 3.10.
+
+Motivation
+==========
+
+Developers should not have to pay an unreasonable cost to use debuggers,
+profilers and other similar tools.
+
+C++ and Java developers expect to be able to run a program at full speed
+(or very close to it) under a debugger.
+Python developers should expect that too.
+
+Rationale
+=========
+
+The quickening mechanism introduced by PEP 659 provides a way to dynamically
+modify executing Python bytecode. These modifications impose no cost on
+code that is not modified and only a relatively low cost on the parts
+that are modified. We can leverage this to provide an efficient
+mechanism for instrumentation and monitoring that was not possible in 3.10
+or earlier.
+
+Specification
+=============
+
+There are two parts to this specification, instrumentation and monitoring.
+
+Instrumentation occurs early in a program's life cycle and persists through
+the lifetime of the program. It is expected to be pervasive, but fixed.
+Instrumentation is designed to support profiling and coverage tools that 
+expect to be active for the entire lifetime of the program.
+
+Monitoring can occur at any point in the program's life and be applied
+anywhere in the program. Monitoring points are expected to be few.
+The capabilities of monitoring are a superset of those of profiling,
+but bulk insertion of monitoring points will be *much* more
+expensive than insertion of instrumentation.
+
+Both instrumentation and monitoring are performed by inserting
+checkpoints into a code object.
+
+Checkpoints
+-----------
+
+A checkpoint is simply a point in code defined by a 
+``(codeobject, offset)`` pair.
+Every time a checkpoint is reached, the registered callable is called.
+
+Instrumentation
+---------------
+
+Instrumentation supports the bulk insertion of checkpoints, but does not
+allow insertion or removal of checkpoints after code has started to execute.
+
+The events are:
+
+* ``BRANCH``: A conditional branch is reached.
+* ``JUMPBACK``: A backwards, unconditional branch is reached.
+* ``ENTER``: A Python function is entered.
+* ``EXIT``: A Python function exits normally (without an exception).
+* ``UNWIND``: A Python function exits with an unhandled exception.
+* ``C_CALL``: A call to any object that is not a Python function.
+* ``C_RETURN``: A return from any object that is not a Python function.
+* ``RAISE``: An exception is raised.
+* ``EXCEPT``: An exception is handled.
+
+For each ``ENTER`` event there will be a corresponding
+``EXIT`` or ``UNWIND`` event.
+For each ``C_CALL`` event there will be a corresponding
+``C_RETURN`` or ``RAISE`` event.
+
+All events are integer powers of two and can be bitwise or-ed together to
+instrument multiple events.
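A minimal sketch of how power-of-two event constants combine, assuming they are modelled with ``enum.IntFlag``. The class name ``Event`` and the specific bit values are hypothetical; the PEP only states that events are integer powers of two.

```python
from enum import IntFlag


class Event(IntFlag):
    """Hypothetical event constants; each value is a distinct power of two."""
    BRANCH = 1 << 0
    JUMPBACK = 1 << 1
    ENTER = 1 << 2
    EXIT = 1 << 3
    UNWIND = 1 << 4
    C_CALL = 1 << 5
    C_RETURN = 1 << 6
    RAISE = 1 << 7
    EXCEPT = 1 << 8


# Select several events at once by OR-ing them together.
call_events = Event.ENTER | Event.EXIT | Event.UNWIND

# Membership in the combined mask is tested with bitwise AND.
wants_enter = bool(call_events & Event.ENTER)
wants_branch = bool(call_events & Event.BRANCH)
```

A mask built this way is the kind of value that would be passed as the ``events`` argument when instrumenting a code object.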
+
+Instrumenting code objects
+''''''''''''''''''''''''''
+
+Code objects can be instrumented by calling::
+
+  instrumentation.instrument(codeobject, events)
+
+Code objects must be instrumented before they are executed.
+An exception will be raised if the code object has been executed before it
+is instrumented.
+
+Registering callback functions for instrumentation
+''''''''''''''''''''''''''''''''''''''''''''''''''
+
+To register a callable for an event, call::
+
+  instrumentation.register(event, func)
+
+Functions can be unregistered by calling
+``instrumentation.register(event, None)``.
+
+Callback functions can be registered at any time.
+
+Callback function arguments
+'''''''''''''''''''''''''''
+
+When an event occurs the registered function will be called.
+The arguments provided are as follows:
+
+* BRANCH: ``func(code: CodeType, offset: int, taken: bool)``
+* JUMPBACK: ``func(code: CodeType, offset: int)``
+* ENTER: ``func(code: CodeType, offset: int)``
+* EXIT: ``func(code: CodeType, offset: int)``
+* C_CALL: ``func(code: CodeType, offset: int, value: object)``
+* C_RETURN: ``func(code: CodeType, offset: int, value: object)``
+* C_EXCEPT: ``func(code: CodeType, offset: int, exception: BaseException)``
+* RAISE: ``func(code: CodeType, offset: int, exception: BaseException)``
+* EXCEPT: ``func(code: CodeType, offset: int)``
+* UNWIND: ``func(code: CodeType)``
+
+Monitoring
+----------
+
+Monitoring allows checkpoints to be inserted or removed at any
+point in the program's execution.
+
+The following functions are provided to insert, remove, disable and
+enable monitoring points::
+
+  instrumentation.insert_monitors(codeobject, *offsets)
+  instrumentation.remove_monitors(codeobject, *offsets)
+  instrumentation.monitor_off(codeobject, offset)
+  instrumentation.monitor_on(codeobject, offset)
+
+All functions return ``True`` if a monitor checkpoint was present,
+or ``False`` if a monitor checkpoint was not present.
+Turning on, or off, a non-existent checkpoint is a no-op;
+no exception is raised.
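The return-value rules above can be illustrated with a toy in-memory model. This is only a sketch of the described semantics, not the proposed implementation: ``MonitorTable`` and ``is_enabled`` are hypothetical names, and the choice to report ``True`` when *any* of several offsets already had a checkpoint is an assumption, since the text does not define the multi-offset case.

```python
class MonitorTable:
    """Toy model of monitor checkpoints; not the proposed implementation."""

    def __init__(self):
        # Maps (code, offset) to True (enabled) or False (disabled).
        self._state = {}

    def insert_monitors(self, code, *offsets):
        # Assumption: report whether any requested checkpoint already existed.
        present = any((code, off) in self._state for off in offsets)
        for off in offsets:
            self._state[(code, off)] = True
        return present

    def remove_monitors(self, code, *offsets):
        present = any((code, off) in self._state for off in offsets)
        for off in offsets:
            self._state.pop((code, off), None)
        return present

    def _set_enabled(self, code, offset, enabled):
        key = (code, offset)
        if key not in self._state:
            return False  # No checkpoint here: a no-op, no exception.
        self._state[key] = enabled
        return True

    def monitor_on(self, code, offset):
        return self._set_enabled(code, offset, True)

    def monitor_off(self, code, offset):
        return self._set_enabled(code, offset, False)

    def is_enabled(self, code, offset):
        return self._state.get((code, offset), False)


def sample():
    return 1


table = MonitorTable()
code = sample.__code__
first_insert = table.insert_monitors(code, 0, 2)  # nothing was present before
toggled = table.monitor_off(code, 0)              # a checkpoint exists at 0
missing = table.monitor_off(code, 40)             # no checkpoint at offset 40
```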
+
+To register a callable for monitoring events, call::
+
+  instrumentation.monitor_register(func)
+
+The callback function will be called with the code object and offset as arguments::
+
+  func(code: CodeType, offset: int)
+
+For optimizing virtual machines, such as future versions of CPython
+(and ``PyPy`` should they choose to support this API), a call to
+``insert_monitors`` or ``remove_monitors`` in a long-running program
+could be quite expensive, possibly taking hundreds of milliseconds, as it
+triggers de-optimizations. Repeated calls to ``insert_monitors``
+and ``remove_monitors``, as may be required in an interactive debugger,
+should be relatively inexpensive.
+
+Combining Checkpoints
+---------------------
+
+Only one instrumentation checkpoint and one monitoring checkpoint are allowed
+per bytecode instruction. It is possible to have both a monitoring and
+instrumentation checkpoint on the same instruction; they are independent.
+Monitors will be called before instrumentation if both are present.
+
+Backwards Compatibility
+=======================
+
+This PEP is fully backwards compatible.
+
+We may seek to remove ``sys.settrace`` in the future once the APIs provided
+by this PEP have been widely adopted, but that is for another PEP.
+
+
+Security Implications
+=====================
+
+Allowing modification of running code has some security implications,
+but no more than the ability to generate and call new code.
+
+All the functions listed above will trigger audit hooks.
+
+
+Implementation
+==============
+
+The implementation of this PEP will be built on top of PEP 659 (Specializing Adaptive Interpreter).
+Instrumentation or monitoring of a code object will cause it to be quickened.
+Checkpoints will then be implemented by inserting one of several special
+``CHECKPOINT`` instructions into the quickened code. These instructions
+will call the registered callable before executing the original instruction.
+
+Note that this can interfere with specialization, which will result in
+performance degradation in addition to the overhead of calling the
+registered callable.
+
+Implementing tools
+==================
+
+It is the philosophy of this PEP that third-party tools should be able to
+achieve high performance, not that it should be easy for them to do so.
+This PEP provides the necessary API for tools, but does nothing to help 
+them determine when and where to insert instrumentation or monitors.
+
+Debuggers
+---------
+
+Inserting breakpoints
+'''''''''''''''''''''
+
+Breakpoints should be implemented as monitors.
+To insert a breakpoint at a given line, the matching instruction offsets
+should be found from ``codeobject.co_lines()``.
+Then a monitor should be added for each of those offsets.
+To avoid excessive overhead, a single call should be made to
+``instrumentation.insert_monitors`` passing all the offsets at once.
+
+Breakpoints can be suspended with ``instrumentation.monitor_off``.
+
+Debuggers can break on exceptions being raised by registering a callable
+for ``RAISE``::
+
+  instrumentation.register(RAISE, break_on_raise_handler)
+
+Stepping
+''''''''
+
+Debuggers usually offer the ability to step execution by a
+single instruction or line.
+
+This can be implemented by inserting a new monitor at the required
+offset(s) of the code to be stepped to,
+and by removing or disabling the current monitor.
+
+It is the job of the debugger to compute the relevant offset(s).
+
+Coverage Tools
+--------------
+
+Coverage tools need to track which parts of the control flow graph have been
+executed. To do this, they need to track most events and map those events
+onto the control flow graph of the code object.
+``BRANCH``, ``JUMPBACK``, ``START`` and ``RESUME`` events will inform which
+basic blocks have started to execute.
+The ``RAISE`` event will mark any blocks that did not complete.
+
+This can then be converted back into a line-based report after execution
+has completed.
+
+Profilers
+---------
+
+Simple profilers need to gather information about calls.
+To do this, profilers should register for the following events:
+
+* ``ENTER``
+* ``EXIT``
+* ``UNWIND``
+* ``C_CALL``
+* ``C_RETURN``
+* ``RAISE``
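A rough sketch of a call profiler built from callbacks shaped like the ``ENTER``, ``EXIT`` and ``UNWIND`` signatures listed in the specification. The class ``CallProfiler`` and its method names are hypothetical, and the events are delivered by hand here because the proposed ``instrumentation.register`` function is not available.

```python
import time
from collections import defaultdict


class CallProfiler:
    """Toy call profiler driven by ENTER/EXIT/UNWIND-style callbacks."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.total_time = defaultdict(float)
        self._stack = []  # (code, start_time) for each active call

    def on_enter(self, code, offset):
        self.calls[code.co_name] += 1
        self._stack.append((code, time.perf_counter()))

    def on_exit(self, code, offset):
        self._finish(code)

    def on_unwind(self, code):
        self._finish(code)

    def _finish(self, code):
        entered, start = self._stack.pop()
        if entered is not code:
            raise RuntimeError("ENTER and EXIT/UNWIND events must pair up")
        self.total_time[code.co_name] += time.perf_counter() - start


def work():
    pass


profiler = CallProfiler()
code = work.__code__

# Simulate the events for two calls: one returns normally, one unwinds.
profiler.on_enter(code, 0)
profiler.on_exit(code, 0)
profiler.on_enter(code, 0)
profiler.on_unwind(code)
```

Handling ``UNWIND`` alongside ``EXIT`` keeps the call stack balanced when a function exits with an exception, which relies on the pairing guarantee stated for ``ENTER`` events.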
+
+Line based profilers
+''''''''''''''''''''
+
+Line based profilers will also need to handle ``BRANCH`` and ``JUMPBACK``
+events.
+Beware that handling these extra events will have a large performance impact.
+
+.. note::
+
+  Instrumenting profilers have a significant overhead and will distort the
+  results of profiling. Unless you need exact call counts,
+  consider using a statistical profiler.
+
+Open Issues
+===========
+
+[Any points that are still being decided/discussed.]
+
+
+References
+==========
+
+[A collection of URLs used as references through the PEP.]
+
+
+Copyright
+=========
+
+This document is placed in the public domain or under the
+CC0-1.0-Universal license, whichever is more permissive.
+
+
+
+..
+    Local Variables:
+    mode: indented-text
+    indent-tabs-mode: nil
+    sentence-end-double-space: t
+    fill-column: 70
+    coding: utf-8
+    End: