gh-127022: Simplify `PyStackRef_FromPyObjectSteal` #127024
Why do we use `PyStackRef_IsExactly` here (which doesn't mask out the deferred bit) but use `PyStackRef_IsFalse` (which does mask out the deferred bit) in `_POP_JUMP_IF_FALSE` above? Is this the rare case where it's safe?
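To make the distinction in the question concrete, here is a minimal sketch of the two kinds of comparison. It assumes a reference is a pointer with one low "deferred" tag bit; `ref_t`, `REF_TAG_DEFERRED`, `ref_make`, `ref_is_exactly`, and `ref_is_false` are hypothetical names for illustration, not CPython's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a tagged stack reference: a pointer whose
   low bit is used as a "deferred" tag. Not CPython's real layout. */
typedef struct { uintptr_t bits; } ref_t;

#define REF_TAG_DEFERRED ((uintptr_t)1)

/* Stands in for the False singleton; an int is aligned, so bit 0 is free. */
static int false_singleton;

static ref_t ref_make(void *p, bool deferred) {
    return (ref_t){ (uintptr_t)p | (deferred ? REF_TAG_DEFERRED : 0) };
}

/* Exact comparison: every bit must match, including the tag bit. */
static bool ref_is_exactly(ref_t a, ref_t b) {
    return a.bits == b.bits;
}

/* Masked comparison: the tag bit is cleared before comparing, so the
   result is the same whether or not the reference is deferred. */
static bool ref_is_false(ref_t a) {
    return (a.bits & ~REF_TAG_DEFERRED) == (uintptr_t)&false_singleton;
}
```

Under these assumptions, the exact comparison is only correct when both sides are known to carry the same tag, while the masked comparison accepts either form.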
Our codegen ensures that these ops only see True or False. That's often by adding a `TO_BOOL` immediately before, which may be folded into `COMPARE_OP`. The preceding `TO_BOOL`, including in `COMPARE_OP`, ensures the canonical representation of `PyStackRef_False` or `PyStackRef_True` with the deferred bit set.

However, there are two places in `codegen.c` that omit the `TO_BOOL` because they have other reasons to know that the result is exactly a boolean:

cpython/Python/codegen.c, lines 678 to 682 in 09c240f
cpython/Python/codegen.c, lines 5746 to 5749 in 09c240f

The `COMPARE_OP`s here still generate bools, but not always in the canonical representation. So we can either:

- Change `COMPARE_OP` to ensure the canonical representation, like https://github.com/colesbury/cpython/blob/5583ac0c311132e36ef458842e087945898ffdec/Python/bytecodes.c#L2409-L2416
- Use `PyStackRef_IsFalse` (instead of `PyStackRef_IsExactly`) in the `JUMP_IF_FALSE`
- Add `TO_BOOL` in those two spots.
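The trade-off between canonicalizing the comparison result and masking in the jump can be sketched as follows. This is an illustration under stated assumptions, not the code in `bytecodes.c`: `ref_t`, `TAG_DEFERRED`, `compare_lt_raw`, `compare_lt_canonical`, and the two `jump_if_false_*` helpers are hypothetical, and the sketch assumes the non-canonical form is simply the untagged pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tagged reference; bit 0 is the "deferred" tag. */
typedef struct { uintptr_t bits; } ref_t;
#define TAG_DEFERRED ((uintptr_t)1)

/* Stand-ins for the True and False singletons. */
static int true_obj, false_obj;

/* Canonical form assumed here: singleton pointer with the tag bit set. */
static ref_t canonical_true(void)  { return (ref_t){ (uintptr_t)&true_obj  | TAG_DEFERRED }; }
static ref_t canonical_false(void) { return (ref_t){ (uintptr_t)&false_obj | TAG_DEFERRED }; }

/* A comparison that yields a bool, but without the tag bit (non-canonical). */
static ref_t compare_lt_raw(int a, int b) {
    return (ref_t){ (uintptr_t)(a < b ? &true_obj : &false_obj) };
}

/* First option: the comparison itself produces the canonical form. */
static ref_t compare_lt_canonical(int a, int b) {
    return a < b ? canonical_true() : canonical_false();
}

/* Exact check: only correct if the input is canonical. */
static bool jump_if_false_exact(ref_t cond) {
    return cond.bits == canonical_false().bits;
}

/* Second option: the jump masks the tag, so any bool form works. */
static bool jump_if_false_masked(ref_t cond) {
    return (cond.bits & ~TAG_DEFERRED) == (uintptr_t)&false_obj;
}
```

In this sketch, `jump_if_false_exact(compare_lt_raw(2, 1))` fails to take the jump even though the comparison is false, which is the hazard described above; either canonicalizing the result or masking in the jump removes it.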
That makes sense, thanks for the explanation. Since using `PyStackRef_IsExactly` safely is sensitive to code generation changes, I might suggest using it only when we're sure it actually matters for performance, and defaulting to the variants that mask out the deferred bits everywhere else, since those are always safe. I'd guess that this wouldn't affect the performance improvement of this change much, since it should come from avoiding the tagging in `_PyStackRef_FromPyObjectSteal`. I don't feel super strongly though.
I'll switch to using `PyStackRef_IsFalse` and `PyStackRef_IsTrue`.

I'm no longer convinced that `PyStackRef_IsExactly` is actually a performance win (and I didn't see it in measurements). I think we have issues with code generation quality that we'll need to address later. Things like `POP_JUMP_IF_NONE` are composed of `_IS_NONE` and `_POP_JUMP_IF_TRUE` and we pack the intermediate result in a tagged `_PyStackRef`. Clang does a pretty good job of optimizing through it. GCC less so: https://gcc.godbolt.org/z/Ejs8c78qd.
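The shape of that composition can be sketched roughly as below. This is a simplified illustration, not the generated interpreter code: `ref_t`, `tagged`, `same_object`, `uop_is_none`, `uop_pop_jump_if_true`, and the two `jump_if_none_*` functions are hypothetical names, and the sketch says nothing about what any particular compiler emits for it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tagged reference; bit 0 is the "deferred" tag. */
typedef struct { uintptr_t bits; } ref_t;
#define TAG_DEFERRED ((uintptr_t)1)

/* Stand-ins for the None, True, and False singletons. */
static int none_obj, true_obj, false_obj;

static ref_t tagged(void *p) {
    return (ref_t){ (uintptr_t)p | TAG_DEFERRED };
}

/* Masked identity check: ignores the tag bit. */
static bool same_object(ref_t r, void *p) {
    return (r.bits & ~TAG_DEFERRED) == (uintptr_t)p;
}

/* First micro-op: packs the test result into a tagged bool reference. */
static ref_t uop_is_none(ref_t value) {
    return same_object(value, &none_obj) ? tagged(&true_obj) : tagged(&false_obj);
}

/* Second micro-op: unpacks the tagged bool to decide whether to jump. */
static bool uop_pop_jump_if_true(ref_t flag) {
    return same_object(flag, &true_obj);
}

/* Composed form: the intermediate result round-trips through a ref_t. */
static bool jump_if_none_composed(ref_t value) {
    return uop_pop_jump_if_true(uop_is_none(value));
}

/* What an optimizer would ideally reduce the composed form to. */
static bool jump_if_none_direct(ref_t value) {
    return same_object(value, &none_obj);
}
```

Both functions compute the same answer; the question raised in the comment is how reliably a compiler sees through the pack-and-unpack step in the composed form.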