PEP 669: Low Impact Instrumentation and Monitoring for CPython. #2070
Changes from all commits
2b3da81
b9c6233
613d1ad
1dc252c
ceb9540
c6dbfdd
@@ -0,0 +1,318 @@
PEP: 669
Title: Low Impact Instrumentation and Monitoring for CPython
Author: Mark Shannon <[email protected]>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 18-Aug-2021
Post-History: 13-Sep-2021


Abstract
========

Using a profiler or debugger in CPython can have a severe impact on
performance. Slowdowns by an order of magnitude are not uncommon.
It does not have to be this bad.

This PEP proposes an API for instrumentation and monitoring of Python
programs running on CPython that will enable the insertion of instrumentation
and monitoring at low cost.

Using the new API, code run under a debugger on 3.11 should easily outperform
code run without a debugger on 3.10.

Comment on lines +22 to +23:
Presumably only when the debugger is not actually invoked? I can believe it will be faster for code in a frame containing a breakpoint that is never hit. But I cannot believe it would be faster if some Python code would be run during e.g. line tracing or when evaluating a condition each time a breakpoint is hit.

Profiling will still slow down execution, but by much less than in 3.10.

Motivation
==========

Developers should not have to pay an unreasonable cost to use debuggers,
profilers and other similar tools.

C++ and Java developers expect to be able to run a program at full speed
(or very close to it) under a debugger.
Python developers should expect that too.

Comment on lines +27 to +35:
This section is just rhetoric. There's already a bunch of that in the Abstract ("It does not have this bad", "at low cost", "easily outperform") -- I'd like to hear more about what approach this is actually taking sooner.

Reply:
It is the "motivation" section; it should be motivating 🙂

Reply:
For those who think that I'm obviously in agreement that a better approach would be great, just concerned that there may be push back from people who don't see the value when a few lines here about the current state of things (maybe the frequent checks for tracing?) would probably motivate things.

Rationale
=========

The quickening mechanism provided by PEP 659 provides a way to dynamically
modify executing Python bytecode. These modifications have no cost beyond
the parts of the code that are modified and a relatively low cost to those
parts that are modified. We can leverage this to provide an efficient
mechanism for instrumentation and monitoring that was not possible in 3.10
or earlier.

Comment on lines +40 to +45:
This is something that you could lead with.

Specification
=============

There are two parts to this specification, instrumentation and monitoring.

Instrumentation occurs early in a program's life cycle and persists through
the lifetime of the program. It is expected to be pervasive, but fixed.
Instrumentation is designed to support profiling and coverage tools that
expect to be active for the entire lifetime of the program.

Comment:
What if I want to profile only a specific function call? Is that not covered? I suppose it is, but your simplified description of instrumentation leaves little room for it.

Reply:
To profile calls to a specific function

    instrumentation.instrument(foo.__code__, instrumentation.ENTER)
    instrumentation.register(instrumentation.ENTER, profiler_func)

Monitoring can occur at any point in the program's life and be applied
anywhere in the program. Monitoring points are expected to be few.

The capabilities of monitoring are a superset of those of profiling,
but bulk insertion of monitoring points will be *much* more
expensive than insertion of instrumentation.

Comment:
Now I'm confused. The use cases that I'm familiar with are coverage, profiling and debugging (both stepping through code and breakpoints). How do these use cases map to the two parts, instrumentation and monitoring? You imply there's a mapping, but I'm confused by what the mapping is supposed to be, since debugging seems to fall in neither category.

Reply:
Debugging is a form of monitoring. The debugger is monitoring the program being run.

Reply:
I think the terms are fine, all I'm asking for is that you clarify the mapping between the two concepts and the three or more use cases. "Monitoring is a superset of profiling" doesn't do that for me, but "A profiler would use the monitoring API" would. Except your example of profiling a function (in the comment above) uses instrumentation, not profiling. So I'm still confused about the relationship between instrumentation and monitoring, and between those and the use cases.

Reply:
I don't think the draft terms are fine, as I initially thought the "instrumentation API" was going to be "How clients specify what to monitor" and the monitoring API was going to be "How clients are notified of monitored events". Even after having the distinction explained, I don't think I'd be able to remember "instrumentation is low overhead, monitoring is high overhead".
If anything, I would expect them to be the other way around due to the way somewhat similar terminology gets used in hardware testing (non-intrusively monitoring ordinary device outputs vs instrumenting the test rig with additional data collection points). For the actual distinction being made, my suggestions would be:

Comment on lines +60 to +61:
This feels like a non-sequitur. First you compare monitoring to profiling, and then you compare it to instrumentation, separated by "but". I miss the intention of that "but".

Both instrumentation and monitoring are performed by insertion of
checkpoints in a code object.

Comment:
"are"

Checkpoints
-----------

A checkpoint is simply a point in code defined by a
``(codeobject, offset)`` pair.

Comment:
Are offsets measured in bytes or instructions?

Every time a checkpoint is reached, the registered callable is called.
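
As an illustrative sketch only (it is not part of the proposed API), the standard ``dis`` module can list the instruction offsets of a code object, which is one way a tool might enumerate candidate ``(codeobject, offset)`` pairs. The names ``sample`` and ``candidate_checkpoints`` are hypothetical. ``dis`` reports byte offsets; whether the PEP's offsets use the same unit is not stated in the text and is assumed here.

```python
import dis


def sample(x):
    # Hypothetical function; it exists only to provide a code object.
    if x > 0:
        return x * 2
    return -x


def candidate_checkpoints(code):
    """Return one (code, offset) pair per bytecode instruction.

    Offsets come from dis and are byte offsets into the bytecode.
    Treating them as the PEP's checkpoint offsets is an assumption.
    """
    return [(code, ins.offset) for ins in dis.get_instructions(code)]


pairs = candidate_checkpoints(sample.__code__)
for code, offset in pairs:
    print(code.co_name, offset)
```

The exact offsets printed depend on the CPython version, since the bytecode layout changes between releases.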

Instrumentation
---------------

Instrumentation supports the bulk insertion of checkpoints, but does not
allow insertion or removal of checkpoints after code has started to execute.

The events are::

Comment:
Define "event" first.

* BRANCH: A conditional branch is reached.
* JUMPBACK: A backwards, unconditional branch is reached.
* ENTER: A Python function is entered.
* EXIT: A Python function exits normally (without an exception).
* UNWIND: A Python function exits with an unhandled exception.
* C_CALL: A call to any object that is not a Python function.
* C_RETURN: A return from any object that is not a Python function.
* RAISE: An exception is raised.
* EXCEPT: An exception is handled.

Comment:
At what point is an exception considered being handled? E.g. in

    try:
        1/0
    except RuntimeError:
        print(1)
    except ZeroDivisionError:
        print(2)

Does the checking whether the raised exception matches

Reply:
Yes. These are VM level events. I've changed the wording to :

For each ``ENTER`` event there will be a corresponding
``EXIT`` or ``UNWIND`` event.
For each ``C_CALL`` event there will be a corresponding
``C_RETURN`` or ``RAISE`` event.

All events are integer powers of two and can be bitwise or-ed together to
instrument multiple events.
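
A minimal sketch of how power-of-two event constants combine, using ``enum.IntFlag`` from the standard library. The class name ``Event`` and the specific numeric values are assumptions made for illustration; the text only says that each event is an integer power of two.

```python
import enum


class Event(enum.IntFlag):
    # Hypothetical values: the draft does not assign numbers to events.
    BRANCH = 1 << 0
    JUMPBACK = 1 << 1
    ENTER = 1 << 2
    EXIT = 1 << 3
    UNWIND = 1 << 4
    C_CALL = 1 << 5
    C_RETURN = 1 << 6
    RAISE = 1 << 7
    EXCEPT = 1 << 8


# Combine several events into one mask with bitwise or.
call_events = Event.ENTER | Event.EXIT | Event.UNWIND

# Membership tests show which events the mask selects.
print(Event.ENTER in call_events)
print(Event.BRANCH in call_events)
print(int(call_events))
```

A mask built this way could then be passed wherever the API takes an ``events`` argument, assuming the real constants behave like plain integers.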

Instrumenting code objects
''''''''''''''''''''''''''

Code objects can be instrumented by calling::

    instrumentation.instrument(codeobject, events)

Comment:
For code objects containing other code objects (e.g. nested functions/classes, lambdas, comprehensions), does this affect the sub-objects? Also, at this point I'm really holding my breath until I see how the event is delivered to some kind of event handler.

Reply:
No. Just the code object specified.

Reply:
That's worth calling out in the text as a clarification then.

Reply:
Is this a new module? Why not put it in

Comment:
Could this function just be called

Code objects must be instrumented before they are executed.
An exception will be raised if the code object has been executed before it
is instrumented.

Comment:
It said earlier that quickening lets us modify "executing" bytecode. Is that not actually the case? And if so, when do we have a chance to instrument new code objects? Do they all have to be done on creation? Or can we do it on first call?

Reply:
One place that should have the chance to instrument called code objects is the event handling in the parent function. I think there's a missing event type for that purpose though: there's currently no event that triggers just before a code object is invoked (receiving the code objects for both the caller and callee, and the offset in the caller).

Comment:
I think I had misread this PEP... I thought that only by using I think that the PEP could make that clearer. Now, if it's really the case that:
I'd say I'm -1 on the PEP as it'd just not be usable for many use cases (the happy path where this API would work would be too narrow to be usable)... For instance, it's very common for users to attach to a running program and this PEP wouldn't support that at all and it'd be next to impossible to get a hold of all the code objects that'd need to be instrumented before they're actually run even on the case where the program is started with the debugger in place (an import hook to instrument code objects isn't enough as there are many corner cases where it wouldn't be able to get it). Now, if I misunderstood it, then I think the PEP should make it clearer that those are in fact supported and which API would be used for that...

Reply:
This is the distinction between the low overhead passive monitoring API ('instrumentation' in the current text) and the high overhead dynamic monitoring API ('monitoring' in the current text). Turning on code coverage or profiling would have to be done early, so could be done dynamically only in the presence of dynamic compilation. But an interactive debugger would use the kinds of hooks that can be added to an existing code object, it wouldn't use the ones that are intended for code coverage and profiling.

Register callback functions for instrumentation
'''''''''''''''''''''''''''''''''''''''''''''''

Comment:
Maybe this section needs to come first?

To register a callable for events call::

    instrumentation.register(event, func)

Functions can be unregistered by calling
``instrumentation.register(event, None)``.

Comment:
Could something more descriptive than "register" be used as the name here? If I've understood the purpose of the function correctly, then

Callback functions can be registered at any time.

Comment:
These callbacks are global, right? (Per-interpreter, I presume.) Which makes me wonder about threads and reentrancy. I suppose to some extent callbacks are protected by the GIL. But callbacks can be Python code (presumably). I suppose events are disabled while a callback is running?

Reply:
Events are always active unless explicitly disabled. Callbacks can be any callable, including Python code. You can add monitor checkpoints to the callback function for instrumentation. It will be possible to debug a profiler.

Reply:
Ah, this makes sense now I understand that each code object needs to be instrumented separately. Nevertheless your clarifications in the comments would be helpful if added to the text.

Reply:
Seconding the question about whether they're global/per-interpreter/per-thread/per-...whatever. Right now, people hit issues, confusion and sometimes ideas due to settrace's single-threadedness (particularly when you start tracing an already-running application and don't have a chance to force your trace into every thread, which is something we also saw regularly building the debugger for VS). The main problem being that tracing on one thread doesn't give you any way to trace - or even pause execution of - the others. e.g. once you hit a breakpoint in one thread, the first thing you do that releases the GIL is going to let other threads start executing.
Having some way to also trigger an event in the context of every thread would let a thread-aware debugger control which of them can keep executing. (Or maybe some other kind of approach makes more sense, that's just an idea that jumps out at me.) It doesn't seem to be essential to this PEP for this to be supported, but it's a good opportunity to add it while we're defining new events. At the very least, having clear statements about the threading behaviour for what this PEP does add are essential.

Callback function arguments
'''''''''''''''''''''''''''

When an event occurs the registered function will be called.

Comment:
And events only occur if they were instrumented?

The arguments provided are as follows:

* BRANCH: ``func(code: CodeType, offset: int, taken: bool)``

Comment:
Is the code object an important parameter? The frame is generally more useful (and implicitly includes the code object).

Reply:
Then again, I guess for a profiler the code object is enough as it's likely maintaining its own call stacks. Debuggers will need to walk the frame even if callers weren't instrumented, and so having them do an extra call to start walking the stack is better than preemptively passing it in. If that's the case, a sentence explaining it would be nice, so the next reader doesn't have to guess.

Reply:
+1 to the case where the frame is needed for debuggers (i.e.: this PEP should have some note stating the recommended way to get the frame related to the instrumentation notification). -- as a note, if performance-wise it'd be the same, I'd say that receiving the frame which contains the code object would be better. If there's some performance penalty, then another way to get it would also be reasonable since it's not always needed (but the recommended way to get it should be explicit in the PEP).

Reply:
It wouldn't be the same performance wise. The new eval loop mostly avoids creating full python frame objects, so this new API provides an opportunity to retain that performance improvement, whereas the old one loses it (the frame objects have to be created anyway in order to pass them to the installed trace hook).

Reply:
I do wonder if it might make sense to pass in a callable that requests the full frame object though. Otherwise hooks would have to use something similar to sys._getframes(), which feels awkward. That said, the callable wouldn't be cheap to create either, so @zooba's suggestion is probably the way to go (i.e. add text to the PEP explaining why the event callback API is the way it is)

* JUMPBACK: ``func(code: CodeType, offset: int)``
* ENTER: ``func(code: CodeType, offset: int)``
* EXIT: ``func(code: CodeType, offset: int)``

Comment:
Are these with statement entry and exit or function entry and exit? Either way, it's likely worth making the event name longer to be clear about it.

* C_CALL: ``func(code: CodeType, offset: int, value: object)``
* C_RETURN: ``func(code: CodeType, offset: int, value: object)``
* C_EXCEPT: ``func(code: CodeType, offset: int, exception: BaseException)``

Comment:
I believe that both the exception as well as the traceback would be important in the As a note, if the idea is using

* RAISE: ``func(code: CodeType, offset: int, exception: BaseException)``
* EXCEPT: ``func(code: CodeType, offset: int)``
* UNWIND: ``func(code: CodeType)``
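
To make the callback shapes concrete, the sketch below defines callbacks with the ``ENTER`` and ``EXIT`` signatures listed above and drives them from today's ``sys.setprofile``, because the proposed module is not available. The ``bridge`` function, the ``counts`` table, and the ``fib`` workload are hypothetical, and mapping the existing ``"call"`` and ``"return"`` profile events onto ``ENTER`` and ``EXIT`` is an approximation rather than the PEP's mechanism.

```python
import sys
from types import CodeType

counts = {"enter": 0, "exit": 0}


def on_enter(code: CodeType, offset: int) -> None:
    # Same shape as the ENTER signature in the list above.
    counts["enter"] += 1


def on_exit(code: CodeType, offset: int) -> None:
    # Same shape as the EXIT signature in the list above.
    counts["exit"] += 1


def bridge(frame, event, arg):
    # Adapter from the existing profile hook to the sketched callbacks.
    # C-level events ("c_call", "c_return", ...) are ignored here.
    if event == "call":
        on_enter(frame.f_code, frame.f_lasti)
    elif event == "return":
        on_exit(frame.f_code, frame.f_lasti)


def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


sys.setprofile(bridge)
try:
    fib(5)
finally:
    sys.setprofile(None)

print(counts)
```

Note that the bridge has to take a frame and extract the code object from it, whereas the callbacks in the list receive the code object directly.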

Comment on lines +128 to +137:
The order here is different than in the first list of events. Maybe order them the same each time?

Monitoring
----------

Monitoring allows checkpoints to be inserted or removed at any
point in the program's execution.

The following functions are provided to insert monitoring points::

    instrumentation.insert_monitors(codeobject, *offsets)
    instrumentation.remove_monitors(codeobject, *offsets)
    instrumentation.monitor_off(codeobject, offset)
    instrumentation.monitor_on(codeobject, offset)

Comment:
With the adjusted naming I proposed above, the associated names for these APIs would be:
The callback registration function

Comment on lines +147 to +150:
Is

Reply:
No. Turning a monitor on or off is a very cheap (<1us) operation. Inserting or removing a monitor is a very expensive operation maybe taking 100s of milliseconds as it may cause cascading de-optimizations. So, we want to allow multiple monitoring checkpoints to be inserted at once because:
If a user can't listify the arguments, I don't have much hope of them implementing a debugger 🙂

Reply:
Oh. This important distinction might be easier to follow if you allowed yourself some mention of the (roughly) intended implementation. In my imagination, inserting monitors is roughly equivalent to
But then monitor_off() would seem to be equivalent to replacing the MONITOR opcode with the original opcode (or perhaps the specialized variant, if we care). What kind of cascading de-optimizations am I missing? Anything that's currently implemented, or is this referring to e.g. generating machine code?

All functions return ``True`` if a monitor checkpoint was present,
or ``False`` if a monitor checkpoint was not present.

Comment on lines +152 to +153:
This is a little vague. If I insert checkpoints at offsets 10 and 20, and one was present at 10 but not at 20, should it return True or False? Or should it return a list[bool]?

Reply:
Yes that is ambiguous.

Turning on, or off, a non-existent checkpoint is a no-op;
no exception is raised.
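
The return-value and no-op rules above can be illustrated with a toy in-memory model. The class ``MonitorTable`` is hypothetical and does not touch real bytecode. Returning one boolean per offset from the bulk functions is an assumption made for this sketch, because the text leaves the multi-offset case unspecified.

```python
class MonitorTable:
    """Toy model of monitoring checkpoints keyed by (code, offset)."""

    def __init__(self):
        self._enabled = {}  # (code, offset) -> True (on) or False (off)

    def insert_monitors(self, code, *offsets):
        # Assumed return shape: one "was present" flag per offset.
        present = [(code, off) in self._enabled for off in offsets]
        for off in offsets:
            self._enabled.setdefault((code, off), True)
        return present

    def remove_monitors(self, code, *offsets):
        # pop() yields the stored flag, or None if nothing was there.
        return [self._enabled.pop((code, off), None) is not None
                for off in offsets]

    def monitor_on(self, code, offset):
        return self._set(code, offset, True)

    def monitor_off(self, code, offset):
        return self._set(code, offset, False)

    def _set(self, code, offset, state):
        key = (code, offset)
        if key not in self._enabled:
            return False  # non-existent checkpoint: no-op, no exception
        self._enabled[key] = state
        return True


table = MonitorTable()
code = MonitorTable.monitor_on.__code__
first = table.insert_monitors(code, 0, 2)
second = table.insert_monitors(code, 2, 4)
off_existing = table.monitor_off(code, 0)
off_missing = table.monitor_off(code, 99)
removed = table.remove_monitors(code, 0, 99)
print(first, second, off_existing, off_missing, removed)
```

In this model switching a monitor off keeps its entry, so a later ``monitor_on`` succeeds, while ``remove_monitors`` deletes the entry entirely.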

Comment on lines +154 to +155:
But presumably it would return False, right?

Reply:
Yes

To register a callable for monitoring function events call::

    instrumentation.monitor_register(func)

The callback function will be called with the code object and offset as arguments::

    func(code: CodeType, offset: int)

For optimizing virtual machines, such as future versions of CPython
(and ``PyPy`` should they choose to support this API), a call to
``insert_monitors`` and ``remove_monitors`` in a long running program
could be quite expensive, possibly taking 100s of milliseconds as it
triggers de-optimizations. Repeated calls to ``insert_monitors``
and ``remove_monitors``, as may be required in an interactive debugger,
should be relatively inexpensive.

Comment:
Where do you foresee these 100s of ms being spent? Tracking down all the affected code objects? Rewriting machine code? Clearing the inline cache data? What else?

Reply:
Yes those sort of things. We might need to scan all compiled code to see what needs to be de-optimized, then do quite a lot of clean up. I would expect this to only happen in an interactive debugger, where a 500ms pause is barely noticeable.

Reply:
(I realize I asked the same question again before reading your reply. But I'm still not sure of the scope of the deoptimization. All your APIs are specific to a code object, and I can't imagine deoptimizing a single code object taking more than a few msec. So there must be some kind of optimization you are planning that spans multiple code objects. What? Is there an issue in the faster-cpython/ideas tracker about this?)
|
||||||||||||||||||||||||||
Combining Checkpoints
---------------------

Only one instrumentation checkpoint and one monitoring checkpoint are allowed
per bytecode instruction. It is possible to have both a monitoring and
instrumentation checkpoint on the same instruction; they are independent.
Monitors will be called before instrumentation if both are present.

Comment on lines +176 to +179: This confused me, since the instrumentation API has no explicit mention of checkpoints (there are no offsets in the instrumentation API, only code objects). I take it that instrumentation works by replacing specific bytecodes (e.g. certain JUMP instructions for BRANCH and JUMPBACK), and monitoring also works by doing that? And the instruction has two flag bits indicating whether it is an instrumentation or monitoring checkpoint or both? It would seem that the APIs already imply that only one instrumentation checkpoint can exist per instruction, and only one monitoring checkpoint. So this paragraph only adds in which order they will be called.

Backwards Compatibility
=======================

This PEP is fully backwards compatible.

We may seek to remove ``sys.settrace`` in the future once the APIs provided
by this PEP have been widely adopted, but that is for another PEP.

Comment: It would make sense to specify whether the new callback hooks are invoked before or after the existing trace hooks.

Security Implications
=====================

Allowing modification of running code has some security implications,
but no more than the ability to generate and call new code.

All the functions listed above will trigger audit hooks.

Implementation
==============

The implementation of this PEP will be built on top of PEP 659 quickening.

Instrumentation or monitoring of a code object will cause it to be quickened.
Checkpoints will then be implemented by inserting one of several special
``CHECKPOINT`` instructions into the quickened code. These instructions
will call the registered callable before executing the original instruction.

Note that this can interfere with specialization, which will result in
performance degradation in addition to the overhead of calling the
registered callable.

Implementing tools
==================

It is the philosophy of this PEP that third-party tools should be able to
achieve high performance, not that it should be easy for them to do so.
This PEP provides the necessary API for tools, but does nothing to help
them determine when and where to insert instrumentation or monitors.

Debuggers
---------

Inserting breakpoints
'''''''''''''''''''''

Breakpoints should be implemented as monitors.
To insert a breakpoint at a given line, the matching instruction offsets
should be found from ``codeobject.co_lines()``.
Then a monitor should be added for each of those offsets.
To avoid excessive overhead, a single call should be made to
``instrumentation.insert_monitors`` passing all the offsets at once.

Comment on lines +227 to +229: I'm not super familiar with co_lines(). When does it return multiple offsets? Only when the code of that line is duplicated (e.g. by loop unrolling or similar optimizations)?

Breakpoints can be suspended with ``instrumentation.monitor_off``.

Comment: Does that re-specialize the instruction at that offset? Regardless of the answer it would seem this is deleting a breakpoint, not just suspending it. (If there was a difference, I'd think that suspending makes it easy to re-enable, but there doesn't seem to be a semantic difference in this case, so I prefer the simpler "deleting".)

Reply: Maybe, but probably not. Turning a monitor off and removing it have very different performance profiles. Turning a monitor off is cheap, but will interfere with optimization. It is up to the implementer of the debugger which to use, but if the user of the debugger turns a breakpoint off, rather than removing it, it implies they may want to turn it on again soon.

Debuggers can break on exceptions being raised by registering a callable
for ``RAISE``:

``instrumentation.register(RAISE, break_on_raise_handler)``

Comment: So the

Stepping
''''''''

Debuggers usually offer the ability to step execution by a
single instruction or line.

This can be implemented by inserting a new monitor at the required
offset(s) of the code to be stepped to,
and by removing or disabling the current monitor.

It is the job of the debugger to compute the relevant offset(s).

Coverage Tools
--------------

Coverage tools need to track which parts of the control flow graph have been
executed. To do this, they need to track most events and map those events
onto the control flow graph of the code object.
``BRANCH``, ``JUMPBACK``, ``START`` and ``RESUME`` events will indicate which
basic blocks have started to execute.
The ``RAISE`` event will mark any blocks that did not complete.

This can then be converted back into a line-based report after execution
has completed.

Profilers
---------

Simple profilers need to gather information about calls.
To do this profilers should register for the following events:

* ENTER
* EXIT
* UNWIND
* C_CALL
* C_RETURN
* RAISE

Comment on lines +271 to +276: Why is

Reply: To match

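A rough sketch of the bookkeeping a simple profiler might do for ``ENTER``, ``EXIT`` and ``UNWIND`` follows. It assumes the handlers receive ``(code, offset)`` like the monitor callback, which this section does not confirm. The class ``CallProfiler`` and its method names are hypothetical, and the handlers are driven by hand because the proposed ``instrumentation`` module is not available. Generators, threads and C calls are ignored.

```python
import time
from collections import defaultdict
from types import CodeType


class CallProfiler:
    """Counts calls and accumulates wall-clock time per function name."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.total_time = defaultdict(float)
        self._stack = []  # (name, start time) for each active call

    def on_enter(self, code: CodeType, offset: int) -> None:
        self.calls[code.co_name] += 1
        self._stack.append((code.co_name, time.perf_counter()))

    def on_exit(self, code: CodeType, offset: int) -> None:
        # Assumed to serve both EXIT (normal return) and UNWIND (exception).
        if self._stack:
            name, started = self._stack.pop()
            self.total_time[name] += time.perf_counter() - started


def work():
    pass


profiler = CallProfiler()
for _ in range(2):
    # Stand-in for the VM firing ENTER and then EXIT around a call to work().
    profiler.on_enter(work.__code__, 0)
    profiler.on_exit(work.__code__, 0)
```
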
Line based profilers
''''''''''''''''''''

Line based profilers will also need to handle ``BRANCH`` and ``JUMPBACK``
events.
Beware that handling these extra events will have a large performance impact.

.. note::

   Instrumenting profilers have a significant overhead and will distort the
   results of profiling. Unless you need exact call counts,
   consider using a statistical profiler.

Open Issues
===========

[Any points that are still being decided/discussed.]


References
==========

[A collection of URLs used as references through the PEP.]

Comment on lines +297 to +300: You can remove this if you're not planning to add any references.


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: