Skip to content

[syntax-errors] yield, yield from, and await outside functions #17298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 27 commits into from
Apr 11, 2025

Conversation

ntBre
Copy link
Contributor

@ntBre ntBre commented Apr 8, 2025

Summary

This PR reimplements yield-outside-function (F704) as a semantic syntax error. Despite the name, this rule covers yield from and await in addition to yield.

Test Plan

New linter tests, along with the existing F704 test.

@ntBre ntBre added the rule Implementing or modifying a lint rule label Apr 8, 2025
Copy link
Contributor

github-actions bot commented Apr 8, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@ntBre ntBre marked this pull request as ready for review April 8, 2025 18:13
ntBre added a commit that referenced this pull request Apr 8, 2025
Summary
--

This PR reimplements
[return-outside-function (F706)](https://docs.astral.sh/ruff/rules/return-outside-function/)
as a semantic syntax error.

These changes are very similar to those in #17298, which this is currently
stacked on.

Test Plan
--

Simple new inline tests, as well as updates to existing tests in the second
commit.
ntBre added a commit that referenced this pull request Apr 9, 2025
## Summary

Based on the discussion in
#17298 (comment), we
decided to move the scope handling out of the `SemanticSyntaxChecker`
and into the `SemanticSyntaxContext` trait. This PR implements that
refactor by:

- Reverting all of the `Checkpoint` and `in_async_context` code in the
`SemanticSyntaxChecker`
- Adding four new methods to the `SemanticSyntaxContext` trait
- `in_async_context`: matches `SemanticModel::in_async_context` and only
detects the nearest enclosing function
- `in_sync_comprehension`: uses the new `is_async` tracking on
`Generator` scopes to detect any enclosing sync comprehension
  - `in_module_scope`: reports whether we're at the top-level scope
  - `in_notebook`: reports whether we're in a Jupyter notebook
- In-lining the `TestContext` directly into the
`SemanticSyntaxCheckerVisitor`
- This allows modifying the context as the visitor traverses the AST,
which wasn't possible before

One potential question here is "why not add a single method returning a
`Scope` or `Scopes` to the context?" The main reason is that the `Scope`
type is defined in the `ruff_python_semantic` crate, which is not
currently a dependency of the parser. It also doesn't appear to be used
in red-knot. So it seemed best to use these more granular methods
instead of trying to access `Scope` in `ruff_python_parser` (and
red-knot).

## Test Plan

Existing parser and linter tests.
@ntBre ntBre force-pushed the brent/syn-yield-outside-function branch from a761f7b to 9ed05a1 Compare April 9, 2025 20:30
@ntBre
Copy link
Contributor Author

ntBre commented Apr 9, 2025

The PR is now rebased onto the context changes, and I think I addressed the other review feedback, but it's still a huge number of changes to existing snapshots. Those are mostly just noise, but I will say they alerted me to one issue in the test context (how it visited generators) at least.

One other way of avoiding these snapshot diffs is just to rely on linter integration tests instead of inline tests at all. It feels a bit unreliable having to update the test context to pass these tests anyway, so fully reusing the new test infrastructure in the linter could solve several problems at once.

To be more concrete, the inline tests are still great for cases that don't rely on information from the Context, but I'm suggesting moving the Context-dependent tests to the linter and restoring the SemanticSyntaxCheckerVisitor to a very, very simple visitor like it was before and leaving it with dummy context methods:

fn visit_stmt(&mut self, stmt: &ast::Stmt) {
	self.with_semantic_checker(|semantic, context| semantic.visit_stmt(stmt, context));
	ast::visitor::walk_stmt(self, stmt);
}

I'm interested in hearing y'all's thoughts on that.

@MichaReiser
Copy link
Member

This PR is in my review queue, but my afternoon is rather busy. I'm not sure if I get to reviewing it today

@ntBre
Copy link
Contributor Author

ntBre commented Apr 10, 2025

No rush, I was thinking about playing with one of these test options (the one I mentioned or Dhruv's idea for a separate inline test mechanism) to make this more reviewable anyway.

Oh, I also hadn't noticed the new ecosystem report. I'm guessing those are a lot of false positives, so really no rush on reviewing this.

ntBre added 13 commits April 10, 2025 13:04
Summary
--

This PR reimplements
[yield-outside-function (F704)](https://docs.astral.sh/ruff/rules/yield-outside-function/)
as a semantic syntax error. Despite the name, this rule covers `yield from` and
`await` in addition to `yield`.

Test Plan
--

Simple, new inline tests, along with the existing F704 test.

A huge number of existing parser tests also had to be updated. These are all in
the second commit.

For `test_err` cases, I mostly just accepted the new semantic errors alongside
the existing syntax errors. For `test_ok` cases, I either deleted the lines with
new errors, or nested the offending lines in a function. I think I only deleted
lines from the first two inline tests I updated, and all of the other fixtures
should be nested.

The `await` cases should actually be nested inside of `async` functions, but
I've intentionally left them as synchronous functions for now because I'm
planning to implement that check next.
it looks like it should still be wrong, but we're intentionally leaving the
"await outside _async_ function" for a separate rule
@ntBre ntBre force-pushed the brent/syn-yield-outside-function branch from e9c9ed8 to 44a043b Compare April 10, 2025 17:04
@ntBre
Copy link
Contributor Author

ntBre commented Apr 10, 2025

I hope this should be much more manageable to review now. I reverted to dummy methods on the test visitor, which also reverted all of the unrelated snapshot changes. All of the tests are now in the linter and rely on the real visitor.

The false positives should also be addressed.

Copy link
Member

@dhruvmanila dhruvmanila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I think there are couple of follow-ups and minor changes but otherwise looks good.

5 | [(yield x) for x in range(3)] # error
|

resources/test/fixtures/syntax_errors/yield_scope.py:4:1: F704 `await` statement outside of a function
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this PR and does not need to be done in this PR but maybe we should update the diagnostic message for await keyword to "await statement outside of an async function".

false
}

fn in_function_context(&self) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should start adding some documentation for these methods. I'd find it confusing because there are both in_function_scope and in_function_context but I don't have a better suggestion for those methods. One solution would be to inline in_function_scope which a lot of the linter code does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I see that the documentation would be part of the trait.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I put it on the trait in case there were more of these later, but this is the only one for now, so I can just move it down.

Unfortunately we can't really inline in_function_scope (unless I'm missing something) because it would require access to the Scope type, which I tried to avoid in #17314. That would actually remove the need for a lot of these little methods, but I think it would add complications of its own, especially once we integrate with red-knot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately we can't really inline in_function_scope (unless I'm missing something) because it would require access to the Scope type, which I tried to avoid in #17314. That would actually remove the need for a lot of these little methods, but I think it would add complications of its own, especially once we integrate with red-knot.

Ah right, ofcourse. Sorry for the noise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, it would be nice to have!

fn in_function_context(&self) -> bool {
for scope in self.semantic.current_scopes() {
match scope.kind {
ScopeKind::Class(_) => return false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for special casing the class scope? Should this be bundled with the final match arm instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is probably my least favorite part of the PR and one of the reasons that I wish we could just access the Scope directly. This is not really a general in_function_context method, it's more like an await_allowed_in_this_scope method, but that seemed too specific of a name. Basically, only classes break an async scope, unlike generators, which is also different from how yield and yield from work (they are broken by generators).

Should I just rename it? Or do you have another suggestion here?

Python REPL session
>>> async def f():
...     [await x for x in y]
...
>>> async def f():
...     class C:
...         [await x for x in y]
...
  File "<python-input-1>", line 3
    [await x for x in y]
    ^^^^^^^^^^^^^^^^^^^^
SyntaxError: asynchronous comprehension outside of an asynchronous function
>>> def f():
...     yield 1
...
>>> def f():
...     [(yield 1) for x in y]
...
  File "<python-input-3>", line 2
    [(yield 1) for x in y]
      ^^^^^^^
SyntaxError: 'yield' inside list comprehension

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolving this will also help to clear up the docs and relationship between the other methods, I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah that isn't obvious from the function name 😅

In that case, it's a lot similar to in_async_context which is why I guess you choose the name in_function_context.

It would be useful to rename it and add relevant documentation.

@@ -1345,6 +1489,8 @@ pub trait SemanticSyntaxContext {
/// Returns `true` if the visitor is currently in an async context, i.e. an async function.
fn in_async_context(&self) -> bool;

fn in_function_context(&self) -> bool;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's document this method especially w.r.t in_function_scope (mainly a follow-up from my other comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that there's a writeup regarding this in the trait documentation, I think it'd be more useful to have it here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the trait level documentation but I agree, that it would be useful to have a short summary of what it is. It could also refer to the trait documentation for more details.

Comment on lines +614 to +617
// We are intentionally not inspecting the async status of the scope for now to mimic F704.
// await-outside-async is PLE1142 instead, so we'll end up emitting both syntax errors for
// cases that trigger F704
if kind.is_await() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your plan here to emit a different SemanticSyntaxErrorKind for await outside of async so that they can be re-coded to PLE1142?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly! It seems a bit redundant, but it will preserve the current behavior.

That's the last error I'm planning to handle after this and #17300.

// await-outside-async is PLE1142 instead, so we'll end up emitting both syntax errors for
// cases that trigger F704
if kind.is_await() {
if ctx.in_function_context() || ctx.in_module_scope() && ctx.in_notebook() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd prefer to have the notebook branch be its own if expression and have some documentation explaining why we skip the module scope for notebooks (as this isn't something that's very obvious why)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I'll bring back the comment from the deleted F704 implementation.

@@ -1345,6 +1489,8 @@ pub trait SemanticSyntaxContext {
/// Returns `true` if the visitor is currently in an async context, i.e. an async function.
fn in_async_context(&self) -> bool;

fn in_function_context(&self) -> bool;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the trait level documentation but I agree, that it would be useful to have a short summary of what it is. It could also refer to the trait documentation for more details.

Copy link
Member

@dhruvmanila dhruvmanila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks for updating the notebook tests, I think test_async_comprehension_notebook would also need to be updated but can be done in a follow-up PR as well.

@ntBre
Copy link
Contributor Author

ntBre commented Apr 11, 2025

Thank you both for the reviews! I think I've addressed everything.

I went with in_await_allowed_context instead of in_function_context and added docs on the method itself. My other name candidate was in_await_context, but that seemed way too close to in_async_context. I also considered await_allowed, but I did still want to point back to the trait-level docs, which refer to the related methods as in_*_context methods. Happy to iterate further here if you have any suggestions.

At some point in the future, we may want to get rid of the await behavior here and only emit await outside async function (PLE1142) anyway.

@ntBre ntBre merged commit ffef71d into main Apr 11, 2025
22 checks passed
@ntBre ntBre deleted the brent/syn-yield-outside-function branch April 11, 2025 14:16
dcreager added a commit that referenced this pull request Apr 11, 2025
* main:
  [red-knot] Specialize `str.startswith` for string literals (#17351)
  [syntax-errors] `yield`, `yield from`, and `await` outside functions (#17298)
  [red-knot] Refresh diagnostics when changing related files (#17350)
  Add `Checker::import_from_typing` (#17340)
  Don't add chaperone space after escaped quote in triple quote (#17216)
  [red-knot] Silence errors in unreachable type annotations / class bases (#17342)
ntBre added a commit that referenced this pull request Apr 11, 2025
Summary
--

This PR reimplements [return-outside-function
(F706)](https://docs.astral.sh/ruff/rules/return-outside-function/) as a
semantic syntax error.

These changes are very similar to those in
#17298.

Test Plan
--

New linter tests, plus existing F706 tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants