-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[syntax-errors] Start detecting compile-time syntax errors #16106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't done an in depth review (as this is also a draft PR) but I left a few comments that hopefully help clarify things.
Is the idea to emit version specific syntax errors (not semantic syntax errors) as part of the SyntaxChecker
or do you still plan on emitting those as part of the parser?
I think we have to explore some alternative designs to decide on whether we want a rule code but it should definetely not block you from prototyping.
let _span = tracing::trace_span!("check_types", file=?file.path(db)).entered(); | ||
|
||
tracing::debug!("Checking file '{path}'", path = file.path(db)); | ||
|
||
let index = semantic_index(db, file); | ||
let mut diagnostics = TypeCheckDiagnostics::default(); | ||
let mut syntax_diagnostics = SyntaxDiagnostics::default(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should use another collection here. We should either push directly into TypeCheckDiagnostics
or, what I'd prefer, move this to the semantic index phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh okay, I knew something felt wrong about where I was putting this in red-knot. I will try moving it to the semantic index phase!
Thanks Micha! Our conversation earlier was very helpful, and I think these comments clear up some places where I was still confused.
In this draft, I tried emitting all of the errors, including version specific ones, in the Detecting
I partially updated the design doc with your rule code suggestions from earlier, but I still need to add a more concrete proposal to help with this discussion. I'll do that tonight or first thing in the morning. |
3807ac5
to
6394094
Compare
I haven't handled all of the review comments yet, but I rebased to get the new |
|
Exactly. My motivation for this is that:
|
For most of these, I think we could get away without documentation. There's not that much to say about the For some of them, though, I think docs would be really useful for our users. For example, the syntax differences between Python 3.8 and 3.9 due to PEP 614 are pretty subtle. And the details on when exactly you're able to parenthesize context managers in This is invalid syntax on Python 3.8: with (
foo() as x,
bar() as y,
):
pass But these are all valid syntax: Python 3.8.18 (default, Feb 15 2024, 19:36:58)
[Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> with (x, y) as foo:
... pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>> with (x,
... y) as foo:
... pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>> with (x,
... y):
... pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>> with (
... x,
... ):
... pass
...
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
NameError: name 'x' is not defined
>>> with (
... x,
... y
... ) as foo:
... pass
...
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
NameError: name 'x' is not defined
>>> with x, (
... y
... ): pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>> with x as foo, (
... y
... ) as bar:
... pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>> with x() as foo, (
... y()
... ) as bar:
... pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined |
That's fair. But how is it different from any other syntax error. E.g what if you get a comprehension wrong? Either way. Having them in the parser doesn't mean that we can never have unique error codes. Red Knot would allow for it. It might just not be possible today and I think that's fine. |
Well, as we discussed earlier, we also probably want version-dependent syntax errors to be suppressible. If an error is suppressible and has documentation, it ends up looking very much like a lint rule, no? ;) And that's not to say that I want them detected using the same infrastructure as a lint rule. I think you're right that it'll be more performant to detect them in the parser where possible, and I think you make a great point when you say that it would be great for the formatter tests to be able to assert that they don't introduce any new version-dependent syntax errors. But I think we should be aware that these in quite a few ways are going to end up looking a lot more like our existing linter rules than our existing syntax rules.
Sure, it might be nice to have better docs for our existing syntax errors too ;-) but I do think it's especially confusing and subtle for users if whether or not they get a syntax error depends on the target Python version we've inferred for their project. And documenting clearly which Python version the new syntax was added in could really help users.
I don't have a strong opinion on whether we should detect these errors in the parser or elsewhere. But if we are going to detect them in the parser, I do think we should have a plan for how we'll provide docs for these errors. It doesn't have to be part of the initial PR adding these syntax diagnostics, but it is important to me. |
I don't think this is something we have an agreement on. |
Oh, sorry about that. I think I misunderstood our earlier conversation on Monday in that case -- that's on me :-) |
Summary -- WIP currently I just added all of the valid cases from Alex's comment here #16106 (comment) Test Plan -- Inline tests
8ea3e35
to
5bc9a44
Compare
) Summary -- I thought this was very complicated based on the comment here: #16106 (comment) and on some of the discussion in the CPython issue here: python/cpython#56991. However, after a little bit of experimentation, I think it boils down to this example: ```python with (x as y): ... ``` The issue is parentheses around a `with` item with an `optional_var`, as we (and [Python](https://docs.python.org/3/library/ast.html#ast.withitem)) call the trailing variable name (`y` in this case). It's not actually about line breaks after all, except that line breaks are allowed in parenthesized expressions, which explains the validity of cases like ```pycon >>> with ( ... x, ... y ... ) as foo: ... pass ... ``` even on Python 3.8. I followed [pyright]'s example again here on the diagnostic range (just the opening paren) and the wording of the error. Test Plan -- Inline tests [pyright]: https://pyright-play.net/?pythonVersion=3.7&strict=true&code=FAdwlgLgFgBAFAewA4FMB2cBEAzBCB0EAHhJgJQwCGAzjLgmQFwz6tA
Summary -- This PR updates `check_path` in the `ruff_linter` crate to return a `Vec<Message>` instead of a `Vec<Diagnostic>`. The main motivation for this is to make it easier to convert semantic syntax errors directly into `Message`s rather than `Diagnostic`s in #16106. However, this also has the benefit of keeping the preview check on unsupported syntax errors in `check_path`, as suggested in https://github.com/astral-sh/ruff/pull/16429/files#r1974748024. Test Plan -- Existing tests. I also tested the playground and server manually.
Summary -- This PR updates `check_path` in the `ruff_linter` crate to return a `Vec<Message>` instead of a `Vec<Diagnostic>`. The main motivation for this is to make it easier to convert semantic syntax errors directly into `Message`s rather than `Diagnostic`s in #16106. However, this also has the benefit of keeping the preview check on unsupported syntax errors in `check_path`, as suggested in #16429 (comment). All of the interesting changes are in the first commit. The second commit just renames variables like `diagnostics` to `messages`, and the third commit is a tiny import fix. I also updated the `ExpandedMessage::location` field name, which caused a few extra commits tidying up the playground code. I thought it was nicely symmetric with `end_location`, but I'm happy to revert that too. Test Plan -- Existing tests. I also tested the playground and server manually.
Thanks @dhruvmanila! I did test I also added a test for Otherwise, I think I've handled the other comments too. Thank you and @MichaReiser for the reviews! This push should get the updated ecosystem check as well, just in case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. A few nits around how to work around the context mutability problem.
Co-authored-by: Micha Reiser <[email protected]>
this also includes one very small update to an existing test to remove a semantic error from an otherwise valid case
Summary
This PR implements the "greeter" approach for checking the AST for syntax errors emitted by the CPython compiler. It introduces two main infrastructural changes to support all of the compile-time errors:
semantic_errors
module to the parser crate with publicSemanticSyntaxChecker
andSemanticSyntaxError
typesSemanticSyntaxChecker
in theruff_linter::Checker
for checking these errors in ruffAs a proof of concept, it also implements detection of two syntax errors:
late-future-import
(F404
)[(a := ...) for a in b]
#14395)Test plan
Existing F404 tests, new inline tests in the
ruff_python_parser
crate, and a linter CLI test showing an example of theMessage
output.I also tested in VS Code, where
preview = false
and turning off syntax errors both disable the new errors:And on the playground, where
preview = false
also disables the errors:Fixes #14395