Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[syntax-errors] Invalid syntax in annotations #17101

Merged
merged 14 commits into from
Apr 3, 2025
Merged

Conversation

ntBre
Copy link
Contributor

@ntBre ntBre commented Mar 31, 2025

Summary

This PR detects the use of invalid syntax in annotation scopes, including
yield and yield from expressions and named expressions. I combined a few
different types of CPython errors here, but I think the resulting error messages
still make sense and are even preferable to what CPython gives. For example, we
report yield expression cannot be used in a type annotation for both of these:

>>> def f[T](x: (yield 1)): ...
  File "<python-input-26>", line 1
    def f[T](x: (yield 1)): ...
                 ^^^^^^^
SyntaxError: yield expression cannot be used within the definition of a generic
>>> def foo() -> (yield x): ...
  File "<python-input-28>", line 1
    def foo() -> (yield x): ...
                  ^^^^^^^
SyntaxError: 'yield' outside function

Fixes #11118.

Test Plan

New inline tests, along with some updates to existing tests.

ntBre added 8 commits March 31, 2025 15:27
Summary
--

This PR detects the use of invalid syntax in annotation scopes, including
`yield` and `yield from` expressions and named expressions. I combined a few
different types of CPython errors here, but I think the resulting error messages
still make sense and are even preferable to what CPython gives. For example, we
report `yield expression cannot be used in a type annotation` for both of these:

```pycon
>>> def f[T](x: (yield 1)): ...
  File "<python-input-26>", line 1
    def f[T](x: (yield 1)): ...
                 ^^^^^^^
SyntaxError: yield expression cannot be used within the definition of a generic
>>> def foo() -> (yield x): ...
  File "<python-input-28>", line 1
    def foo() -> (yield x): ...
                  ^^^^^^^
SyntaxError: 'yield' outside function
```

Fixes #11118.

Test Plan
--

New inline tests, along with some updates to existing tests. I also combined the
`TestContext` code with the `SemanticSyntaxCheckerVisitor` because we now need
to set the `in_annotation` field on the context while visiting expressions.
```pycon
>>> def foo() -> (yield x): ...
  File "<python-input-32>", line 1
    def foo() -> (yield x): ...
                  ^^^^^^^
SyntaxError: 'yield' outside function
```
@ntBre ntBre added rule Implementing or modifying a lint rule preview Related to preview mode features labels Mar 31, 2025
Copy link
Contributor

github-actions bot commented Mar 31, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

let mut visitor = SemanticSyntaxCheckerVisitor::new(
TestContext::default().with_python_version(options.target_version()),
);
if parsed.has_valid_syntax() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this check to make the ParseError and SemanticSyntaxError mutually exclusive in the snapshot output? What's the reason for doing this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought that would lead to a cleaner diff and be more similar to how the linter reports errors. 9 snapshots change without this, which isn't as bad as I thought, so I'm happy to revert this if you'd prefer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be fine to have both ParseError and SemanticSyntaxError in the snapshots as it also gives us test coverage for semantic syntax errors in source code containing parse errors.

Comment on lines 183 to 186
let mut visitor = AnnotationVisitor {
allow_named_expr: false,
ctx,
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check type_params.is_none() in this context?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is actually invalid even without type parameters:

>>> type Y = (x := 1)
  File "<python-input-95>", line 1
    type Y = (x := 1)
              ^^^^^^
SyntaxError: named expression cannot be used within a type alias

But I did add this as a test case now, thanks!

Comment on lines +137 to +142
for param in parameters
.iter()
.filter_map(ast::AnyParameterRef::annotation)
{
visitor.visit_expr(param);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this rule is specific to annotations but I was wondering if we need to run the same check for the default value? I'm mainly asking because this PR also checks the default value in type params (type X[T = (yield 1)] = int).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that this is a syntax error, but it looks like we already catch these for yield expressions: playground, and they seem to be okay for named expressions:

>>> def foo(x=(y := [])): ...
... def bar(x=(y := [])): ...
...
>>>

I guess these are currently caught in the parser, so we could move the detection here if we wanted and use visitor.visit_parameters, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you specify which errors are caught in the parser? The playground link that you've provided suggest that the yield expressions are part of the lint rule while the named expressions are fine.

Regardless, we also visit the type param default in here which makes me think whether we should instead split the rule as "InvalidYieldExpressionUsage" and "InvalidNamedExpressionUsage" similar to the variants on ParseErrorType?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you specify which errors are caught in the parser? The playground link that you've provided suggest that the yield expressions are part of the lint rule while the named expressions are fine.

Oh you're right, I just assumed it was a ParseError when I saw a diagnostic, but I see now that it's lint rule F704. I guess that's another rule I should reimplement, like F404, but I'll save that for a follow-up.

Regardless, we also visit the type param default in here which makes me think whether we should instead split the rule as "InvalidYieldExpressionUsage" and "InvalidNamedExpressionUsage" similar to the variants on ParseErrorType?

Ah, I see what you mean. I guess I thought type parameter defaults were close enough to "type annotations" that the current error message made sense, but I see now that they are pretty different. That's also in line with CPython's different error message:

>>> type X[T= (yield 1)] = int
  File "<python-input-112>", line 1
    type X[T= (yield 1)] = int
               ^^^^^^^
SyntaxError: yield expression cannot be used within a TypeVar default

Is it enough to split them into one variant for yield and one for named expressions, or should we try to split them up more like CPython?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it enough to split them into one variant for yield and one for named expressions, or should we try to split them up more like CPython?

I think what you have right now looks great. I only suggested to split them up if we didn't want to provide granular context but I think providing them seems more useful.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overall looks good to me but I think we should improve the error message because some usages of yield (or named) aren't in type annotations

///
/// ```python
/// type X[T: (yield 1)] = int
/// type Y = (yield 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure but using the term type annotation in a type alias seems misleading because the value isn't a type annotation. The same applies to when using a yield expression in a type var default. We could use a more generic term (e.g. yield expression cannot be used in a type expression) or we distinguish the different contexts. I'm leaning towards the latter.


|
1 | class F[T](y := list): ...
| ^^^^^^^^^ Syntax Error: named expression cannot be used in a type annotation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the error here is misleading because y := list) isn't a type annotation.

@ntBre
Copy link
Contributor Author

ntBre commented Apr 1, 2025

I'm not sure the names I picked are ideal, but I think the error messages are much more accurate now. The only divergence from CPython, as far as I know, is in cases like this:

>>> def foo(arg: yield int): ...
  File "<python-input-154>", line 1
    def foo(arg: yield int): ...
                 ^^^^^
SyntaxError: invalid syntax

where CPython reports a very generic "invalid syntax" error. I instead call these "type annotations" like before, which I think is accurate here.

I also followed CPython in having separate messages for generic classes and functions:

>>> class F[T]((yield 1)): ...
  File "<python-input-158>", line 1
    class F[T]((yield 1)): ...
                ^^^^^^^
SyntaxError: yield expression cannot be used within the definition of a generic
>>> class F((yield 1)): ...
  File "<python-input-159>", line 1
    class F((yield 1)): ...
             ^^^^^^^
SyntaxError: 'yield' outside function

but it might make more sense to use the message "yield expression cannot be used within a base class" for both of these.

Copy link
Member

@dhruvmanila dhruvmanila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I like the new error messages.

All of the review comments does not need to be addressed, they're more of a suggestion, so feel free to ignore them.

Comment on lines +137 to +142
for param in parameters
.iter()
.filter_map(ast::AnyParameterRef::annotation)
{
visitor.visit_expr(param);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it enough to split them into one variant for yield and one for named expressions, or should we try to split them up more like CPython?

I think what you have right now looks great. I only suggested to split them up if we didn't want to provide granular context but I think providing them seems more useful.

@ntBre ntBre merged commit c2b2e42 into main Apr 3, 2025
22 checks passed
@ntBre ntBre deleted the brent/syn-annotation-scopes branch April 3, 2025 21:56
maxmynter pushed a commit to maxmynter/ruff that referenced this pull request Apr 3, 2025
Summary
--

This PR detects the use of invalid syntax in annotation scopes,
including
`yield` and `yield from` expressions and named expressions. I combined a
few
different types of CPython errors here, but I think the resulting error
messages
still make sense and are even preferable to what CPython gives. For
example, we
report `yield expression cannot be used in a type annotation` for both
of these:

```pycon
>>> def f[T](x: (yield 1)): ...
  File "<python-input-26>", line 1
    def f[T](x: (yield 1)): ...
                 ^^^^^^^
SyntaxError: yield expression cannot be used within the definition of a generic
>>> def foo() -> (yield x): ...
  File "<python-input-28>", line 1
    def foo() -> (yield x): ...
                  ^^^^^^^
SyntaxError: 'yield' outside function
```

Fixes astral-sh#11118.

Test Plan
--

New inline tests, along with some updates to existing tests.
@JelleZijlstra
Copy link
Contributor

but it might make more sense to use the message "yield expression cannot be used within a base class" for both of these

That would be inaccurate; you can have a yield expression (or indeed any expression) in a class base.

>>> def f():
...     class X((yield 1)):
...         pass
...     yield X
... gen = f()
... gen.send(None)
... x = gen.send(int)
... print(x)
... 
<class '__main__.f.<locals>.X'>

It needs to be parenthesized for grammar reasons though.

(Sorry for the post-merge comment; only saw this when the issue was closed.)

@ntBre
Copy link
Contributor Author

ntBre commented Apr 4, 2025

No worries on the post-merge review, I appreciate it!

That makes a lot of sense, now that you mention it... I think this needs more work in that case, and we likely need to implement the 'yield' outside function error from CPython separately rather than folding it in here.

ntBre added a commit that referenced this pull request Apr 4, 2025
Summary
--

This PR fixes the issue pointed out by @JelleZijlstra in
#17101 (comment). Namely, I
conflated two very different errors from CPython:

```pycon
>>> def m[T](x: (yield from 1)): ...
  File "<python-input-310>", line 1
    def m[T](x: (yield from 1)): ...
                 ^^^^^^^^^^^^
SyntaxError: yield expression cannot be used within the definition of a generic
>>> def m(x: (yield from 1)): ...
  File "<python-input-311>", line 1
    def m(x: (yield from 1)): ...
              ^^^^^^^^^^^^
SyntaxError: 'yield from' outside function
>>> def outer():
...     def m(x: (yield from 1)): ...
...
>>>
```

I thought the second error was the same as the first, but `yield` (and `yield
from`) is actually valid in this position when inside a function scope. The same
is true for base classes, as pointed out in the original comment.

We don't currently raise an error for `yield` outside of a function, but that
should be handled separately.

On the upside, this had the benefit of removing the
`InvalidExpressionPosition::BaseClass` variant, the
`InvalidExpressionPosition::TypeAnnotation` variant, and the `allow_named_expr`
field from the visitor because they were all no longer used.

Test Plan
--

Updated inline tests.
ntBre added a commit that referenced this pull request Apr 4, 2025
Summary
--

This PR fixes the issue pointed out by @JelleZijlstra in
#17101 (comment).
Namely, I conflated two very different errors from CPython:

```pycon
>>> def m[T](x: (yield from 1)): ...
  File "<python-input-310>", line 1
    def m[T](x: (yield from 1)): ...
                 ^^^^^^^^^^^^
SyntaxError: yield expression cannot be used within the definition of a generic
>>> def m(x: (yield from 1)): ...
  File "<python-input-311>", line 1
    def m(x: (yield from 1)): ...
              ^^^^^^^^^^^^
SyntaxError: 'yield from' outside function
>>> def outer():
...     def m(x: (yield from 1)): ...
...
>>>
```

I thought the second error was the same as the first, but `yield` (and
`yield from`) is actually valid in this position when inside a function
scope. The same is true for base classes, as pointed out in the original
comment.

We don't currently raise an error for `yield` outside of a function, but
that should be handled separately.

On the upside, this had the benefit of removing the
`InvalidExpressionPosition::BaseClass` variant and the
`allow_named_expr` field from the visitor because they were both no
longer used.

Test Plan
--

Updated inline tests.
dcreager added a commit that referenced this pull request Apr 4, 2025
* main:
  [red-knot] Empty tuple is always-falsy (#17213)
  Run fuzzer with `--preview` (#17210)
  Bump 0.11.4 (#17212)
  [syntax-errors] Allow `yield` in base classes and annotations (#17206)
  Don't skip visiting non-tuple slice in `typing.Annotated` subscripts (#17201)
  [red-knot] mypy_primer: do not specify Python version (#17200)
  [red-knot] Add `Type.definition` method (#17153)
  Implement `Invalid rule provided` as rule RUF102 with `--fix` (#17138)
  [red-knot] Add basic on-hover to playground and LSP (#17057)
  [red-knot] don't remove negations when simplifying constrained typevars (#17189)
  [minor] Fix extra semicolon for clippy (#17188)
  [syntax-errors] Invalid syntax in annotations (#17101)
  [syntax-errors] Duplicate attributes in match class pattern (#17186)
  [syntax-errors] Fix multiple assignment for class keyword argument (#17184)
  use astral-sh/cargo-dist instead (#17187)
  Enable overindented docs lint (#17182)
ntBre added a commit that referenced this pull request Apr 7, 2025
Summary
--

This PR extends the changes in #17101 to include `await` in the same positions.

I also renamed the `valid_annotation_function` test to include `_py313` and
explicitly passed a Python version to contrast it with the `_py314` version.

Test Plan
--

New test cases added to existing files.
ntBre added a commit that referenced this pull request Apr 7, 2025
Summary
--

This PR extends the checks in #17101 and #17282 to annotated assignments after
Python 3.13.

Currently stacked on #17282 to include `await`.

Test Plan
--

New inline tests. These are simpler than the other cases because there's no
place to put generics.
ntBre added a commit that referenced this pull request Apr 7, 2025
Summary
--

This PR extends the checks in #17101 and #17282 to annotated assignments after
Python 3.13.

Currently stacked on #17282 to include `await`.

Test Plan
--

New inline tests. These are simpler than the other cases because there's no
place to put generics.
ntBre added a commit that referenced this pull request Apr 8, 2025
Summary
--

This PR extends the changes in #17101 to include `await` in the same
positions.

I also renamed the `valid_annotation_function` test to include `_py313`
and explicitly passed a Python version to contrast it with the `_py314`
version.

Test Plan
--

New test cases added to existing files.
ntBre added a commit that referenced this pull request Apr 8, 2025
Summary
--

This PR extends the checks in #17101 and #17282 to annotated assignments
after Python 3.13.

Currently stacked on #17282 to include `await`.

Test Plan
--

New inline tests. These are simpler than the other cases because there's
no place to put generics.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preview Related to preview mode features rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ruff fails to reject invalid syntax in annotation scopes
4 participants