GH-130415: Narrow str to "" based on boolean tests #130476

Merged
merged 9 commits into from
Mar 4, 2025

Conversation

fluhus
Contributor

@fluhus fluhus commented Feb 22, 2025

Narrow a str value to "" when an if test on it evaluates to false.

@brandtbucher

@ghost

ghost commented Feb 22, 2025

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-app

bedevere-app bot commented Feb 22, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@brandtbucher brandtbucher self-assigned this Feb 22, 2025
@brandtbucher brandtbucher added performance Performance or resource usage interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-JIT labels Feb 22, 2025
@bedevere-app

bedevere-app bot commented Feb 22, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@fluhus
Contributor Author

fluhus commented Feb 22, 2025

Added requested corrections. Thanks, @brandtbucher !

Member

@markshannon markshannon left a comment

Thanks for the contribution.
Unfortunately, I think there is a critical flaw in this approach as it could result in mis-optimizations in the future.

This would be a useful optimization, so if you're willing to pursue this further, it would be appreciated.

dummy = "aaa"
# Hopefully the optimizer can't guess what the value is.
# empty is always "", but we can only prove that it's a string:
empty = dummy[:0]
Member

I can easily see the optimizer turning "aaa"[:0] into "".
empty doesn't need to be a constant; for profiling, we just need it to be "" most of the time.
Use something like empty = "a"[:(n % 1000) == 0]

Member

@brandtbucher brandtbucher Mar 3, 2025

Since we check the actual path taken as part of the test, we need the value to always be "", not just mostly "". So maybe:

false = i == TIER2_THRESHOLD
empty = "X"[:false]

The optimizer can't prove false is False, so it's good enough for our purposes.
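
As a rough illustration of the difference between the two suggested inputs, here is a plain-Python check. It relies only on the fact that a bool is a valid slice bound (0 or 1). The value of TIER2_THRESHOLD below is a stand-in chosen for this sketch, not the constant the test suite uses, and the helper names are hypothetical.

```python
# Hypothetical stand-in; the real test suite defines its own threshold.
TIER2_THRESHOLD = 4096


def mostly_empty(n):
    # The comparison is True once every 1000 iterations, so the slice
    # is "a" on those iterations and "" on all the others.
    return "a"[:(n % 1000) == 0]


def always_empty(i):
    # Inside range(TIER2_THRESHOLD), i never equals the bound, so the
    # comparison is always False and the slice is always "".
    false = i == TIER2_THRESHOLD
    return "X"[:false]


mostly = [mostly_empty(n) for n in range(TIER2_THRESHOLD)]
always = [always_empty(i) for i in range(TIER2_THRESHOLD)]

print(mostly.count("a"), set(always))
```

The first form occasionally takes the other branch, which would break a test that checks the exact path taken; the second never does, yet its result still depends on a runtime comparison rather than on a literal.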

// *can't* narrow res, since that would cause the guard to be
// removed and the narrowed value to be invalid:
if (next_opcode == _GUARD_IS_FALSE_POP) {
    sym_set_const(value, Py_GetConstant(Py_CONSTANT_EMPTY_STR));
Member

This is strictly incorrect. We don't know that value is "" until after the _GUARD_IS_FALSE_POP.
This matters because once we start attaching type information to side exits, as we probably will in 3.15, it could lead us to infer that value is "" on both branches, which would be wrong.

There are two possible fixes for this.

  • Combine TO_BOOL_STR and _GUARD_IS_FALSE_POP/_GUARD_IS_TRUE_POP into a single (super)instruction, then optimize that.
  • Annotate the bool value resulting from the TO_BOOL with its input, then in _GUARD_IS_FALSE_POP narrow the input value of the TO_BOOL.

I prefer the second option, although it may be more work, as it is more flexible and can be extended more easily.
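
The ordering concern and the second option can be sketched with a toy model in Python. This is not the optimizer's code: Sym, the two to_bool_str variants, and guard_is_false_pop are hypothetical names invented for this sketch, and the "side exit" is modelled simply as a snapshot of what is known about value at the moment the guard is reached.

```python
UNKNOWN = object()  # sentinel meaning "no constant is known"


class Sym:
    """Toy abstract value: a type name, an optional constant, an optional source."""

    def __init__(self, type_name=None, source=None):
        self.type_name = type_name
        self.const = UNKNOWN
        self.source = source  # symbol whose truthiness this bool came from


def to_bool_str_eager(value):
    # Narrows value before the guard has run (the ordering being questioned).
    value.type_name = "str"
    value.const = ""
    return Sym("bool")


def to_bool_str_annotated(value):
    # Records only the type, and links the bool back to its input.
    value.type_name = "str"
    return Sym("bool", source=value)


def guard_is_false_pop(flag, value):
    # Snapshot what a side exit attached here would believe about value.
    side_exit_const = value.const
    # Past the guard the flag is known to be False, so a str input must be "".
    if flag.source is not None and flag.source.type_name == "str":
        flag.source.const = ""
    return side_exit_const


eager_value = Sym()
eager_exit = guard_is_false_pop(to_bool_str_eager(eager_value), eager_value)

annotated_value = Sym()
annotated_exit = guard_is_false_pop(
    to_bool_str_annotated(annotated_value), annotated_value
)

print(eager_exit == "", annotated_exit is UNKNOWN, annotated_value.const == "")
```

In the eager variant the side-exit snapshot already claims value is "", which is the wrong conclusion for the branch where the string is non-empty. In the annotated variant the snapshot knows nothing extra, and the narrowing takes effect only on the path past the guard.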

Member

Yeah, @Fidget-Spinner and I suggested something like the latter on the issue (new symbols like JitBoolOf(JitOptSymbol *source, bool inverted) and JitEqualTo(JitOptSymbol *lhs, JitOptSymbol *rhs, bool inverted)). That's probably the direction we're headed in longer term.

However, I don't think we should let perfect be the enemy of good here. We have nice, working optimizations in these PRs; just because we might sink info onto side exits in the future probably shouldn't prevent us from making changes like this now for 3.14, which are perfectly correct for the current optimizer (which doesn't sink anything).

I'm inclined to land these changes and other similar ones for ==/!= now, and make the symbolic representation of derived boolean values more complex later as an improvement (it will also be able to handle more uncommon cases like x = y == 42; if x: ...). I'm really worried that if we try to "future-proof" optimizations based on what we could do six months from now, it will prevent actual improvements in the near term.
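
To illustrate why a richer symbolic representation would cover a case like x = y == 42; if x: ..., here is a deliberately simplified Python sketch. The op names and the two functions are hypothetical and stand in for real uops only loosely: the first function narrows only when a guard directly follows the comparison, while the second remembers which comparison each boolean came from.

```python
def narrow_by_adjacency(trace):
    """Narrow only when the guard is the very next op after the comparison."""
    known = {}
    for (op, arg), nxt in zip(trace, trace[1:]):
        if op == "COMPARE_EQ" and nxt[0] == "GUARD_IS_TRUE_POP":
            name, const = arg
            known[name] = const
    return known


def narrow_by_symbol(trace):
    """Track the origin of each boolean through stores and loads."""
    known = {}
    stack = []
    bindings = {}
    for op, arg in trace:
        if op == "COMPARE_EQ":
            stack.append(arg)  # the bool remembers (name, constant)
        elif op == "STORE":
            bindings[arg] = stack.pop()
        elif op == "LOAD":
            stack.append(bindings.get(arg))
        elif op == "GUARD_IS_TRUE_POP":
            origin = stack.pop()
            if origin is not None:
                name, const = origin
                known[name] = const
    return known


# if y == 42: ...
direct = [("COMPARE_EQ", ("y", 42)), ("GUARD_IS_TRUE_POP", None)]
# x = y == 42; if x: ...
via_local = [
    ("COMPARE_EQ", ("y", 42)),
    ("STORE", "x"),
    ("LOAD", "x"),
    ("GUARD_IS_TRUE_POP", None),
]

print(narrow_by_adjacency(direct), narrow_by_adjacency(via_local))
print(narrow_by_symbol(direct), narrow_by_symbol(via_local))
```

Both approaches learn that y is 42 in the direct form, but only the origin-tracking one still learns it when the comparison result passes through a local variable first.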

But I'll defer to you here. If having value be narrowed one uop too early in the instruction stream is enough to block this PR, I can work with these new contributors on the more complex solution. But as-is, this has no bugs and works as intended. We don't sink value info onto side exits, so it's correct.

@bedevere-app

bedevere-app bot commented Feb 24, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@brandtbucher brandtbucher changed the title from GH-130415: Add JIT optimization path for _TO_BOOL_STR to GH-130415: Narrow str to "" based on boolean tests Mar 3, 2025
@brandtbucher brandtbucher merged commit 691354c into python:main Mar 4, 2025
58 checks passed
@fluhus fluhus deleted the hack-night2 branch March 21, 2025 20:23
seehwan pushed a commit to seehwan/cpython that referenced this pull request Apr 16, 2025
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage topic-JIT
3 participants