[syntax-errors] Async comprehension in sync comprehension #17177

ntBre · 2025-04-03T14:21:21Z

Summary

Detect async comprehensions nested in sync comprehensions in async functions before Python 3.11, when this was changed.

The actual logic of this rule is very straightforward, but properly tracking the async scopes took a bit of work. An alternative to the current approach is to offload the in_async_context check into the SemanticSyntaxContext trait, but that actually required much more extensive changes to the TestContext and also to ruff's semantic model, as you can see in the changes up to 31554b4. This version has the benefit of mostly centralizing the state tracking in SemanticSyntaxChecker, although there was some subtlety around deferred function body traversal that made the changes to Checker more intrusive too (hence the new linter test).

The Checkpoint struct/system is obviously overkill for now since it's only tracking a single bool, but I thought it might be more useful later.

Test Plan

New inline tests and a new linter integration test.

Summary -- Detect async comprehensions nested in sync comprehensions in async functions before Python 3.11, when this was [changed]. [changed]: python/cpython#77527 Test Plan -- New inline tests. I also inlined `TestContext` into the `SemanticSyntaxCheckerVisitor` to make it possible to modify the context as it traverses the AST without adding mutating methods to the `SemanticSyntaxContext` trait. I also added a new linter integration test to make sure its `in_async_context` method has the behavior I expected.

but the checkpoint isn't wired up in the ruff_python_ast checker, so it seems it's not needed?

I think I've convinced myself that this error can be detected by a single level of nesting, which is in line with not needing this checkpoint. the issue is in a structure like this: sync async and it doesn't matter if that occurs at the top level or somewhere down the stack like async async sync async it also doesn't matter that a pattern like this sync async async prevents the innermost comprehension from raising an error because we will have already reported an error for its parent

This reverts commit 375e4e1.

MichaReiser

Do you think it would simplify the implementation if we only tracked whether we're in a sync comprehension rather than tracking if we're in an async context? That seems easier than correctly tracking if we're also in an async function. It has the downside that we might emit one extra diagnostic if someone uses an async comprehension in a sync function, but I'd expect this to be rare. On the other hand, it doesn't add that much complexity. It's just a bit tricky to get the handling right when to unset the async flag. Do you know if it's necessary to unset the flag in a sync context manager (in an async function)?

MichaReiser · 2025-04-03T16:27:30Z

crates/ruff_python_parser/src/semantic_errors.rs

+        // check for errors
+        self.check_stmt(stmt, ctx);
+
+        self.save_checkpoint();


Do we need to store the checkpoint for every statement or could we only store and restore it for some statements (e.g functions, lambdas, classes?)

MichaReiser · 2025-04-03T16:31:59Z

crates/ruff_python_parser/src/semantic_errors.rs

+    /// because the parent context is what matters.
+    in_async_context: bool,
+
+    checkpoint: Vec<Checkpoint>,


I wonder if we could avoid having to store the checkpoints inside and instead move it to the stack by changing SemanticSyntaxChecker::enter_stmt to return a state which the caller needs to pass to exit_stmt (consumes the state, the state can't be created externally).

The checker would always store the current state and enter_stmt would return the previous state from enter_stmt (but update its internal state). The exit_stmt call (or expression) then restores the internal state.

The downside of this is that it requires some more bookkeeping at the call site, but it seems fairly reasonable.

This approach would mirror the Checker's approach with updating SemanticModelFlags, where it updates the flags before walking the children and then restores the flags once it's done.

If you decide to keep the stack solution, we should then change the implementation to only snapshot and restore when necessary. For example, it isn't necessary to store the snapshot for most expressions because the state won't be changed (and, therefore, there isn't anything to restore either).

Good idea, it really wasn't bad to plumb this through at the call sites.

crates/ruff_python_parser/src/semantic_errors.rs

MichaReiser · 2025-04-03T16:36:43Z

crates/ruff_python_parser/src/semantic_errors.rs

+            Stmt::FunctionDef(ast::StmtFunctionDef { is_async, .. }) => {
+                self.in_async_context = *is_async;
+                self.seen_futures_boundary = true;
+            }
            _ => {


Do we need to set in_async_context to false when entering a Class, lambda, or a sync with context?

Oh good catch. It looks like class and lambda do cause problems even inside of an outer async function, but with doesn't seem to add a sync scope:

Python 3.13.2 (main, Feb 5 2025, 08:05:21) [GCC 14.2.1 20250128] on linux Type "help", "copyright", "credits" or "license" for more information. >>> async def foo(): lambda: [_ async for n in range(3)] File "<python-input-4>", line 1 async def foo(): lambda: [_ async for n in range(3)] ^^^^^^^^^^^^^^^^^^^^^^^^^^^ SyntaxError: asynchronous comprehension outside of an asynchronous function >>> async def foo(): ... class C: [_ async for n in range(3)] ... File "<python-input-7>", line 2 class C: [_ async for n in range(3)] ^^^^^^^^^^^^^^^^^^^^^^^^^^^ SyntaxError: asynchronous comprehension outside of an asynchronous function >>> async def foo(): ... with open("foo.txt", "w") as f: ... [_ async for n in range(3)] ... >>>

However, these are both still errors on 3.13, so this might be a separate, version-independent rule.

This is making me think we should go back to the trait implementation and defer to the full visitor's scope-based implementation. Or at least try tracking nested scopes here.

We do have a lint rule that detects these, PLE1142, but not an error.

https://play.ruff.rs/f0be27e4-9cb1-4bdb-9ebe-b095d81815e5

ntBre · 2025-04-03T17:26:48Z

Do you think it would simplify the implementation if we only tracked whether we're in a sync comprehension rather than tracking if we're in an async context? That seems easier than correctly tracking if we're also in an async function. It has the downside that we might emit one extra diagnostic if someone uses an async comprehension in a sync function, but I'd expect this to be rare. On the other hand, it doesn't add that much complexity. It's just a bit tricky to get the handling right when to unset the async flag. Do you know if it's necessary to unset the flag in a sync context manager (in an async function)?

This sounded very promising at first, but I think we would still need the checkpoints, and it would only remove the function-specific code, which I don't think adds that much "complexity" per se, like you said. Although I guess we could avoid statement checkpoints and exit_stmt entirely since everything else is an expression.

On the other hand, it sounds like PLE1142 should also be a syntax error and needs even more careful scope tracking, so it might be worth setting up for that here.

I'll work on your suggestions about removing the checkpoint stack and see how it looks after that.

As I said in a reply above, we could also give the trait approach another try. That just gets annoying for inline testing. It requires a lot of new test code, and that test code doesn't really guarantee anything about the actual users. However, that's also somewhat the case here because of the tricky enter/exit calls in Checker.

github-actions · 2025-04-03T18:19:32Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

ntBre · 2025-04-03T19:50:11Z

The ecosystem check is showing a false positive in a notebook cell, which I think should have an implicit async scope. It's probably also picking up a default python version since it mentions 3.9.

dhruvmanila · 2025-04-03T21:10:20Z

crates/ruff_python_parser/src/semantic_errors.rs

 #[derive(Debug)]
+pub struct Checkpoint {
+    in_async_context: bool,
+}


We could potentially include a DebugDropBomb to make sure that the consumer of the enter API don't forgot to call the exit_stmt / exit_expr methods which will defuse the bomb.

Oh interesting, I think I saw some mention of this around the new diagnostics. I could also mark enter_stmt as #[must_use], which could help a bit in the same direction.

MichaReiser · 2025-04-04T06:49:53Z

crates/ruff_linter/src/checkers/ast/mod.rs

+        // For functions, defer semantic syntax error checks until the body of the function is
+        // visited
+        let checkpoint = if stmt.is_function_def_stmt() {
+            None
+        } else {
+            Some(self.with_semantic_checker(|semantic, context| semantic.enter_stmt(stmt, context)))
+        };


I don't understand the reasoning for skipping function definitions here. Won't this result in stale in_async_context flags because we don't update the state until after we visited the entire body?

If there's a need for deferred visiting, then I'd prefer to have a enter_deferred_stmt or, better, move the necessary checks into exit_stmt and also pass the statement and context because exit is exactly the hook called after visiting the node's children

I think I might be missing something here, but my reasoning for this was that the ast::Checker doesn't actually visit the body of the function until visit_deferred_function:

ruff/crates/ruff_linter/src/checkers/ast/mod.rs

Lines 2588 to 2591 in 5cee346

self.visit_parameters(parameters);

// Set the docstring state before visiting the function body.

self.docstring_state = DocstringState::Expected(ExpectedDocstringKind::Function);

self.visit_body(body);

Before I added this logic, the function body was being visited twice by the SemanticSyntaxChecker but only once by the ast::Checker and this integration test was failing even though the inline version passed:

async def test(): return [[x async for x in elements(n)] async for n in range(3)]

which pointed to an issue in the visit order because the test visitor is much simpler:

fn visit_stmt(&mut self, stmt: &'_ Stmt) { let checkpoint = self.checker.enter_stmt(stmt, &self.context); ruff_python_ast::visitor::walk_stmt(self, stmt); self.checker.exit_stmt(checkpoint); }

Again I might be misunderstanding, but I don't think an enter_deferred_stmt helps here because we still need this logic in visit_stmt itself to avoid duplicating the visit. Unless you mean hiding this logic inside of SemanticSyntaxChecker::enter_stmt. I guess that could work if I also update the test visitor to defer function bodies like the real ast::Checker.

Oh I see. This seems fragile. I wouldn't be aware of this out-of-order visiting when working on the SemanticSyntaxChecker. We should at least document the constraints in which the enter methods are called. I assumed it would be in semantic visiting order but it seems its in semantic visiting order except for functions which may be deferred (or not, depending on the caller)

Yes, very fragile. I was so confused when the integration test was failing but the inline test was fine, until I realized the function bodies were deferred. I'll work on expanding the docs for this.

ntBre · 2025-04-07T12:46:26Z

I made a few improvements here:

documented when enter_stmt should be called. I didn't add anything for enter_expr because it doesn't have any of these fragile cases yet
added #[must_use] to both enter_ methods to require using the returned Checkpoint. We could add some kind of drop bomb if we want to be even more sure
add a PySourceType argument to SemanticSyntaxChecker::new to avoid notebook false positives (the top level scope should allow async code in notebooks)

crates/ruff_python_parser/src/semantic_errors.rs

dhruvmanila

Looks good to me, might want to wait on @MichaReiser as he has the most context for this one.

MichaReiser

Looks good to me. It would be great if we could move some of the CLI tests to parser tests (maybe not the jupyter one because that would be very noisy unless the out-of-bound parser test have options support)

MichaReiser · 2025-04-08T06:54:08Z

crates/ruff/tests/lint.rs

@@ -5564,3 +5564,206 @@ fn semantic_syntax_errors() -> Result<()> {

    Ok(())
 }
+
+#[test]
+fn async_comprehension_in_sync_comprehension() {


It's not quiet clear to me why we need those cli tests. Could some of those tests be parser tests instead? The parser tests already support # parse_options: {"target-version": "3.7"}

We probably don't need all of these, but the third and fourth helped to detect bugs in the enter/exit pairs and the deferred function body traversal. The first two are probably less valuable, if you want me to trim it down a bit.

It would be great if we could narrow them down and maybe document why it's important that they're CLI tests.

An alternative would be to implement them as custom tests in ruff_linter (that instantiate checker). Maybe there's something you can reuse from the linter test infra?

I adapted the pyflakes test runner (as suggested on Discord) and moved all of the CLI tests to ruff_linter. They're just in the top-level linter::tests module for now, but I'm happy to move them around if there's a better place. I also considered nesting them under rules/syntax_errors, for example.

I think having them in linter::tests is fine. They are integration tests after all

Sounds good, thanks! I'll merge then

* origin/main: [red-knot] Default `python-platform` to current platform (#17183) [red-knot] Add new 'unreachable code' test case (#17306) [red-knot] mypy_primer: Run on `async-utils` (#17303) [red-knot] Add custom `__setattr__` support (#16748) [red-knot] Add `__init__` arguments check when doing `try_call` on a class literal (#16512) [`flake8-pie`] Avoid false positive for multiple assignment with `auto()` (`PIE796`) (#17274) [syntax-errors] Async comprehension in sync comprehension (#17177) [`airflow`] Expand module path check to individual symbols (`AIR302`) (#17278) [syntax-errors] Check annotations in annotated assignments (#17283) [syntax-errors] Extend annotation checks to `await` (#17282) [red-knot] Add support for `assert_never` (#17287) [`flake8-pytest-style`] Avoid false positive for legacy form of `pytest.raises` (`PT011`) (#17231) [red-knot] Do not show types for literal expressions on hover (#17290) [red-knot] Fix dead-code clippy warning (#17291) [red-knot] Reachability analysis (#17199) [red-knot] Don't use latency-sensitive for handlers (#17227)

* dcreager/special-class: (26 commits) lint Add TODO about property test data Better todos Narrow type(generic) better More Python-like displays for specializations Add xfail for generic method inside generic class Comment other non-specializations Explain self_instance not being specialized Generic aliases are literals in type display Better TODO fallback type [red-knot] Default `python-platform` to current platform (#17183) [red-knot] Add new 'unreachable code' test case (#17306) [red-knot] mypy_primer: Run on `async-utils` (#17303) [red-knot] Add custom `__setattr__` support (#16748) [red-knot] Add `__init__` arguments check when doing `try_call` on a class literal (#16512) [`flake8-pie`] Avoid false positive for multiple assignment with `auto()` (`PIE796`) (#17274) [syntax-errors] Async comprehension in sync comprehension (#17177) [`airflow`] Expand module path check to individual symbols (`AIR302`) (#17278) [syntax-errors] Check annotations in annotated assignments (#17283) [syntax-errors] Extend annotation checks to `await` (#17282) ...

…17177) Summary -- Detect async comprehensions nested in sync comprehensions in async functions before Python 3.11, when this was [changed]. The actual logic of this rule is very straightforward, but properly tracking the async scopes took a bit of work. An alternative to the current approach is to offload the `in_async_context` check into the `SemanticSyntaxContext` trait, but that actually required much more extensive changes to the `TestContext` and also to ruff's semantic model, as you can see in the changes up to 31554b4. This version has the benefit of mostly centralizing the state tracking in `SemanticSyntaxChecker`, although there was some subtlety around deferred function body traversal that made the changes to `Checker` more intrusive too (hence the new linter test). The `Checkpoint` struct/system is obviously overkill for now since it's only tracking a single `bool`, but I thought it might be more useful later. [changed]: python/cpython#77527 Test Plan -- New inline tests and a new linter integration test.

ntBre added 20 commits April 3, 2025 10:09

in-line TestContext into SemanticSyntaxCheckerVisitor

336e6a3

refactor some Default impls to make clippy happier

0835fa2

add test_ok all-async case on 3.10 and track in_async_context

3785e4b

add failing linter test, this should be an error

76dd5d5

track async status for generator scopes and pass test

30a56a8

add 3.11 test, don't reuse command, can't pass target-version twice

7462cd6

track async context on SemanticSyntaxChecker, with checkpoint

04c57f0

but the checkpoint isn't wired up in the ruff_python_ast checker, so it seems it's not needed?

revert other unused context changes

50e0bc8

revert unnecessary linter tests

f811c1a

tidy up and add docs

ffcd250

add more test cases, finally find a test_ok that needs checkpoints

d2127e8

restore checkpoints, convert to stack to pass test

5b440de

restore linter tests

a96ae02

This reverts commit 375e4e1.

add a test case for exit_stmt and some commentary

d49c50c

wire up exit methods and add test, but earlier test is failing

624f671

isolate the inline test that fails in the linter

a21be7e

defer visits to function bodies

bf69ef9

rename visit_{stmt,expr} to enter_{stmt,expr} and add docs

cc42a38

ntBre added rule Implementing or modifying a lint rule preview Related to preview mode features labels Apr 3, 2025

ntBre requested review from MichaReiser and dhruvmanila as code owners April 3, 2025 14:21

ntBre mentioned this pull request Apr 3, 2025

☂️ Syntax errors raised by the compiler #11934

Closed

18 tasks

fix doc links and mention exit_* methods

d1e1cc4

MichaReiser reviewed Apr 3, 2025

View reviewed changes

ntBre added 2 commits April 3, 2025 12:48

Merge branch 'main' into brent/syn-async-comprehensions

7f99f5f

return early on recent python version

0f95773

use the call stack instead of Vec<Checkpoint> stack

48f1e52

dhruvmanila reviewed Apr 3, 2025

View reviewed changes

Merge branch 'main' into brent/syn-async-comprehensions

970fe3d

MichaReiser reviewed Apr 4, 2025

View reviewed changes

ntBre added 2 commits April 7, 2025 08:05

add must_use and some docs

7d3b4f6

pass source_type to SemanticSyntaxChecker

2dd610f

dhruvmanila reviewed Apr 7, 2025

View reviewed changes

crates/ruff_python_parser/src/semantic_errors.rs Show resolved Hide resolved

dhruvmanila approved these changes Apr 7, 2025

View reviewed changes

add notebook CLI test

6db55c6

MichaReiser approved these changes Apr 8, 2025

View reviewed changes

ntBre added 2 commits April 8, 2025 11:13

move non-notebook tests to ruff_linter

f730e85

refactor check_syntax_errors to handle notebooks, move notebook test

83d3a70

ntBre merged commit 058439d into main Apr 8, 2025
22 checks passed

ntBre deleted the brent/syn-async-comprehensions branch April 8, 2025 16:50

ntBre mentioned this pull request Apr 8, 2025

[syntax-errors] yield, yield from, and await outside functions #17298

Merged

BrewTestBot mentioned this pull request Apr 10, 2025

ruff 0.11.5 Homebrew/homebrew-core#219299

Merged

	self.visit_parameters(parameters);
	// Set the docstring state before visiting the function body.
	self.docstring_state = DocstringState::Expected(ExpectedDocstringKind::Function);
	self.visit_body(body);

[syntax-errors] Async comprehension in sync comprehension #17177

[syntax-errors] Async comprehension in sync comprehension #17177

Uh oh!

Conversation

ntBre commented Apr 3, 2025

Summary

Test Plan

Uh oh!

MichaReiser left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ntBre commented Apr 3, 2025

Uh oh!

github-actions bot commented Apr 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

ruff-ecosystem results

Linter (stable)

Linter (preview)

Formatter (stable)

Formatter (preview)

Uh oh!

ntBre commented Apr 3, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ntBre commented Apr 7, 2025

Uh oh!

Uh oh!

dhruvmanila left a comment

Choose a reason for hiding this comment

Uh oh!

MichaReiser left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MichaReiser Apr 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

MichaReiser left a comment •

edited

Loading

github-actions bot commented Apr 3, 2025 •

edited

Loading

`ruff-ecosystem` results

MichaReiser Apr 8, 2025 •

edited

Loading