Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
GH-91048: Add utils for capturing async call stack for asyncio programs and enable profiling #124640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-91048: Add utils for capturing async call stack for asyncio programs and enable profiling #124640
Changes from 45 commits
1b01a91
0fc5511
1d20a51
c8be18e
abf2cb9
20ceab7
72d9321
c9475f6
e1099e9
817f88b
54386ac
98434f0
485c166
8802be7
1ddc9cf
bc9beb8
fd141d4
2d72f24
391defa
d6357fd
bb3b6df
54c99ec
c1a4f09
027d522
c2d5ec6
08d09eb
fe3113b
18ec26d
e4cc462
d5cdc36
83606f2
5edac41
8dc6d34
30884ea
1317658
81b0a31
258ce3d
b9ecefb
b77dcb0
8867946
87d2524
b47bef1
230b7ec
b1d6158
ac51364
c7e59eb
9eba5e1
59121f6
f8f48f0
74c5ad1
067c043
9f04911
0774805
3048493
1f42873
7799391
03ed5c1
21f9ea9
8a43dfa
b3fae68
d0aedf0
df0032a
0ce241b
8f126f6
966d84e
f56468a
404b88a
911fed8
ab511a4
c3c685a
785adeb
a577328
064129a
ce332d9
d6d943f
703ff46
e867863
9cb5b29
61b2b7b
9533ab9
596191d
ad9152e
066bf21
4caeec4
38f061d
a8dd667
cf8f5e5
eda9c7c
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is adding yet another field generators for the use of async.
Remember that generators are used as iterators, probably more often than they are used for coroutines.
For code like
any(expr for var in seq)
andtuple(expr for var in seq)
this is unwanted expense.Is there no way to add this tasks, not generators?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I don't see how it is safe for this to be a borrowed reference. What guarantee is there that the object passed to
_PyCoro_SetTask
will outlive the generator?If the answer is "_PyCoro_SetTask is an internal function", then why is it used by asyncio which is a module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT there is too much entanglement between generators and coroutines to split their implementation entirely to avoid this. How can we measure what this unwanted expense actually is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Splitting their implementations is desirable, but I appreciate that it won't happen in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cr_task
is a borrowed reference, and is exposed through a public API, which is unsafe.So it will need to made a strong reference.
That means that it not only does it takes space and needs initializing, it will also need to be cleared when the generator is destroyed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markshannon we agree with your feedback. I'll refactor the PR to remove the need for
cr_task
(and therefore thegi_task
debug offset). We'll maintain a list of running tasks in _asynciomodule.c since this is asyncio-specific anyway.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stack pointer is not always readable, nor is it always correct, nor is it part of the ABI, and when it is correct or not may change at a whim.
In fact, it didn't even exist in 3.13, and may not exist again in future versions.
Overall, I think it is better not to publish this offset lest it suggest that the value is meaningful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed because there is no other way to get the current running coroutine object from the frame with the current new layout from external profilers they can only read memory. Please check the tests.
If you have a better alternative I am happy to move to that but right now I don't see any other possibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you point me at (pun not intended) a situation where the stack pointer will be invalid in a way that will make the external introspection (done in the tests here) not work?
Is there a design that you have in mind that would result in the removal of this entirely?
AFAICT if we decided to drop deferred refcounting, we'd return to what
stacktop
was doing in the past, which was more-less equivalent for our purpose here. Moreover, @Fidget-Spinner claims the stack pointer is a better design regardless of deferred refcounting.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markshannon Mark, I think the contract here is that CPython can move freely with its refactorings and changes. If
stackpointer
breaks or becomes completely non-meaningful we'll just mark our test "xfail" and figure another way to introspect the particular future version of CPython. I wouldn't worry about these offsets at all, we're not saying that they are public API.Same as internal layout of Set/Dict/asyncio objects -- they are all subject to break from one version to another.
I really think this particular change is fine -- CPython doesn't care nor does guarantee anything whatsoever here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We experimented with a register VM. We ultimately decided against it, but if we had gone for it there would have been no stack pointer.
The problem is that optimizations may leave the tests working, but the stack pointer could easily be meaningless in other places.
I would strongly recommend that you avoid relying on the stack pointer in any way.
We expect that the stack will consist of zero or more values in registers followed by the remaining values in memory, with the stack pointer pointing to some offset from the in-memory stack top.
The offset and number of values in registers will be not be knowable at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pablogsal
Where? I don't see where it is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/python/cpython/pull/124640/files#diff-08adf6717de7c662b4e6960c5c428e6c46b1ef5370b562a4b0c92ecb071ce733R728
(Modules/_testexternalinspection.c in line 728)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will copy It here for your convenience:
as you can see is needed to go from the coroutine to the
gi_await
field.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an implementation detail, and may not exist in future versions of CPython
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything in the debug offsets is an implementation detail so I don't think this made this special. Once more, this is used to follow the chain of coroutines so if it it changes we will adapt but the test we add ensures that there is an alternative and we don't change this in a way that breaks all profilers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, see also my comment here. I don't think it's worth anyones time to debate these offsets. Yes they are subject to break and that's fine, they are not a public API. Profilers and debuggers expect this kind of stuff to drift.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I worry that when they do break, we'll be pressured to restore whatever field they were relying on.
I'd be happier if these offsets were in their own header, with a large clear warning about breakage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think which header file they live in should matter, we don't need a new header for this specific internal implementation detail, this already is such a header. Lets just add the desired code comment about stability expectations and breakage as instructions to our future selves that if these need to change, so be it, and that the internal implementation detail tests relying on these can be marked as expected failures at that time.