
[syntax-errors] Start detecting compile-time syntax errors #16106

Merged (52 commits, Mar 21, 2025)

Commits
c66f37a
try SourceOrderVisitor and hack into red-knot
ntBre Feb 11, 2025
825f971
ignore syntax diagnostics in red-knot tests
ntBre Feb 11, 2025
4e702dd
integrate with ruff, accept snap changes
ntBre Feb 11, 2025
877a108
implement F404 too
ntBre Feb 11, 2025
af9d9dd
just use SyntaxChecker::enter_stmt instead of SourceOrderVisitor
ntBre Feb 11, 2025
6f90709
update doc comment
ntBre Feb 11, 2025
ab2db92
tidy enter_stmt
ntBre Feb 11, 2025
c4a5cc7
move red_knot_python_semantic::PythonVersion to ruff_db
ntBre Feb 12, 2025
34b5829
add docs to VersionSyntaxError
ntBre Feb 12, 2025
6cc9198
revert red-knot integration
ntBre Mar 7, 2025
508d785
revert MatchBeforePy310 detection
ntBre Mar 7, 2025
9c38e2d
revert version-related diagnostic kind
ntBre Mar 7, 2025
da6d31d
add a failing test for comprehension rebinding
ntBre Mar 17, 2025
33b5381
start detecting list comprehensions but can't display messages yet
ntBre Mar 17, 2025
419b5cd
restore testing code, add a TestVisitor
ntBre Mar 17, 2025
662ffc3
factor out check_generator_expr, check other comprehensions
ntBre Mar 17, 2025
63276ef
delete linter test, at least for now
ntBre Mar 17, 2025
ca582a5
use a Visitor to traverse generator elts, add more test cases
ntBre Mar 18, 2025
f461718
emit new errors in ruff
ntBre Mar 19, 2025
67eb290
preview gate
ntBre Mar 19, 2025
38ee75d
add ruff test
ntBre Mar 19, 2025
dc1a94a
remove now-unused FUTURES_BOUNDARY flag and shift others down
ntBre Mar 19, 2025
edbc46c
fix crate version
ntBre Mar 19, 2025
6e131f6
update docs
ntBre Mar 19, 2025
659788a
remove unused as_str method
ntBre Mar 19, 2025
7d00d50
Merge branch 'main' into brent/syntax-error-source-order
ntBre Mar 20, 2025
c3417be
update LinterResult flag name and docs
ntBre Mar 20, 2025
d6d3437
prepend new types with Semantic
ntBre Mar 20, 2025
7144a82
make rebound_variable a Vec, add a test with multiple violations
ntBre Mar 20, 2025
ffdb7cb
rename enter_ methods to visit_
ntBre Mar 20, 2025
16479d4
don't build a HashSet
ntBre Mar 20, 2025
ef11801
target_version -> python_version
ntBre Mar 20, 2025
4cfaad0
report semantic errors even when AST-based rules are not selected
ntBre Mar 20, 2025
f1b2222
add context trait and pass to visit_stmt
ntBre Mar 20, 2025
ae17231
move python version into context trait too
ntBre Mar 20, 2025
a363c6b
move preview check to check_ast to avoid collecting at all
ntBre Mar 20, 2025
d48afb7
remove separate crate, put SemanticSyntaxChecker in the parser
ntBre Mar 20, 2025
7ffa680
expand inline tests to semantic_errors module
ntBre Mar 20, 2025
c482a02
test --statistics for all three kinds of errors
ntBre Mar 20, 2025
88cbb9b
fix outdated docs links
ntBre Mar 20, 2025
5f7de2a
move interior mutability to Checker
ntBre Mar 21, 2025
e8d44f1
update checker field name and add docs (also fix an old missing `.`)
ntBre Mar 21, 2025
7171e20
fix LinterResult docs
ntBre Mar 21, 2025
077bd37
reorder SemanticSyntaxChecker code to navigate more easily
ntBre Mar 21, 2025
f484a85
combine semantic error tests with existing inline tests
ntBre Mar 21, 2025
de0914b
move inline tests just above add_error call
ntBre Mar 21, 2025
3ccea07
delete getter/setter without refcell
ntBre Mar 21, 2025
0b62bde
restore docs from SemanticModel flag, delete todo
ntBre Mar 21, 2025
452283a
move semantic_errors to single-file module
ntBre Mar 21, 2025
b5f828b
delete outdated comment
ntBre Mar 21, 2025
ab89b04
update variable name to match LinterResult field
ntBre Mar 21, 2025
79ddb2f
delete now-unused inline/semantic directory
ntBre Mar 21, 2025
14 changes: 14 additions & 0 deletions Cargo.lock


1 change: 1 addition & 0 deletions Cargo.toml
@@ -30,6 +30,7 @@ ruff_python_index = { path = "crates/ruff_python_index" }
ruff_python_literal = { path = "crates/ruff_python_literal" }
ruff_python_parser = { path = "crates/ruff_python_parser" }
ruff_python_semantic = { path = "crates/ruff_python_semantic" }
ruff_python_syntax_errors = { path = "crates/ruff_python_syntax_errors" }
Member commented:

Maybe:

Suggested change
ruff_python_syntax_errors = { path = "crates/ruff_python_syntax_errors" }
ruff_python_syntax_errors = { path = "crates/ruff_python_semantic_errors" }

To clarify that it doesn't implement other syntax errors? We could also use `semantic_syntax_errors`, but it feels very long.

Another alternative is to implement the checks inside the parser crate. Or does the crate have dependencies that would prevent us from doing that?

Contributor Author replied:

Moving it to the parser makes a lot of sense to me. I moved it to a public `ruff_python_parser::semantic_errors` module. Another possible location would be a private submodule of the error module (`ruff_python_parser::error::semantic_errors`, or maybe just `semantic` at the end?) and then we could re-export the new related types like we do for `ParseError` and `UnsupportedSyntaxError`. Maybe `UnsupportedSyntaxError` should have had its own `error` submodule too.

Member replied:

I'm fine with any of these, but I would go with `semantic_errors` because it's short and simple.

ruff_python_stdlib = { path = "crates/ruff_python_stdlib" }
ruff_python_trivia = { path = "crates/ruff_python_trivia" }
ruff_server = { path = "crates/ruff_server" }
44 changes: 44 additions & 0 deletions crates/ruff/tests/lint.rs
@@ -5491,3 +5491,47 @@ fn cookiecutter_globbing_no_project_root() -> Result<()> {

Ok(())
}

/// Test that semantic syntax errors (1) are emitted, (2) are not cached, and (3) don't affect the
/// reporting of normal diagnostics.
#[test]
fn semantic_syntax_errors() -> Result<()> {
let tempdir = TempDir::new()?;
fs::write(tempdir.path().join("main.py"), "[(x := 1) for x in foo]")?;

let mut cmd = Command::new(get_cargo_bin(BIN_NAME));
// inline STDIN_BASE_OPTIONS to remove --no-cache
cmd.args(["check", "--output-format", "concise"])
.arg("--preview")
.arg("--quiet") // suppress `debug build without --no-cache` warnings
.current_dir(&tempdir);

assert_cmd_snapshot!(
cmd,
@r"
success: false
exit_code: 1
----- stdout -----
main.py:1:3: SyntaxError: assignment expression cannot rebind comprehension variable
main.py:1:20: F821 Undefined name `foo`

----- stderr -----
"
);

// this should *not* be cached, like normal parse errors
assert_cmd_snapshot!(
cmd,
@r"
success: false
exit_code: 1
----- stdout -----
main.py:1:3: SyntaxError: assignment expression cannot rebind comprehension variable
main.py:1:20: F821 Undefined name `foo`

----- stderr -----
"
);

Ok(())
}
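For context on the test above: the error it exercises comes from PEP 572, which forbids an assignment expression from rebinding a comprehension's iteration variable. CPython rejects this at compile time, and the PR teaches ruff to report the same error. A minimal Python sketch of the rejected and the allowed forms (the source string and the names here are illustrative, not taken from the PR):

```python
# CPython rejects rebinding the iteration variable `x` with `:=`
# inside the comprehension, so `compile` raises SyntaxError.
bad = "[(x := 1) for x in range(3)]"
try:
    compile(bad, "<example>", "eval")
    rebind_allowed = True
except SyntaxError:
    rebind_allowed = False

# Binding a *different* name inside the comprehension is fine.
good = [(y := x + 1) for x in range(3)]

print(rebind_allowed)  # False
print(good)            # [1, 2, 3]
```

This is the same distinction the snapshot test checks: the first form is reported as a `SyntaxError` at `1:3`, while ordinary walrus bindings are untouched.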
1 change: 1 addition & 0 deletions crates/ruff_linter/Cargo.toml
@@ -25,6 +25,7 @@ ruff_python_index = { workspace = true }
ruff_python_literal = { workspace = true }
ruff_python_semantic = { workspace = true }
ruff_python_stdlib = { workspace = true }
ruff_python_syntax_errors = { workspace = true }
ruff_python_trivia = { workspace = true }
ruff_python_parser = { workspace = true }
ruff_source_file = { workspace = true, features = ["serde"] }
8 changes: 0 additions & 8 deletions crates/ruff_linter/src/checkers/ast/analyze/statement.rs
@@ -853,14 +853,6 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
if checker.enabled(Rule::FutureFeatureNotDefined) {
pyflakes::rules::future_feature_not_defined(checker, alias);
}
if checker.enabled(Rule::LateFutureImport) {
if checker.semantic.seen_futures_boundary() {
checker.report_diagnostic(Diagnostic::new(
pyflakes::rules::LateFutureImport,
stmt.range(),
));
}
}
} else if &alias.name == "*" {
if checker.enabled(Rule::UndefinedLocalWithNestedImportStarUsage) {
if !matches!(checker.semantic.current_scope().kind, ScopeKind::Module) {
44 changes: 38 additions & 6 deletions crates/ruff_linter/src/checkers/ast/mod.rs
@@ -26,6 +26,7 @@ use std::path::Path;

use itertools::Itertools;
use log::debug;
use ruff_python_syntax_errors::{SyntaxChecker, SyntaxError, SyntaxErrorKind};
use rustc_hash::{FxHashMap, FxHashSet};

use ruff_diagnostics::{Diagnostic, IsolationLevel};
@@ -63,6 +64,7 @@ use crate::importer::Importer;
use crate::noqa::NoqaMapping;
use crate::package::PackageRoot;
use crate::registry::Rule;
use crate::rules::pyflakes::rules::LateFutureImport;
use crate::rules::{flake8_pyi, flake8_type_checking, pyflakes, pyupgrade};
use crate::settings::{flags, LinterSettings};
use crate::{docstrings, noqa, Locator};
@@ -225,6 +227,9 @@ pub(crate) struct Checker<'a> {
docstring_state: DocstringState,
/// The target [`PythonVersion`] for version-dependent checks
target_version: PythonVersion,

#[allow(clippy::struct_field_names)]
syntax_checker: SyntaxChecker,
}

impl<'a> Checker<'a> {
@@ -272,6 +277,7 @@
last_stmt_end: TextSize::default(),
docstring_state: DocstringState::default(),
target_version,
syntax_checker: SyntaxChecker::new(target_version),
}
}
}
@@ -516,6 +522,8 @@

impl<'a> Visitor<'a> for Checker<'a> {
fn visit_stmt(&mut self, stmt: &'a Stmt) {
self.syntax_checker.enter_stmt(stmt);

// Step 0: Pre-processing
self.semantic.push_node(stmt);

@@ -550,17 +558,13 @@
{
self.semantic.flags |= SemanticModelFlags::FUTURE_ANNOTATIONS;
}
} else {
self.semantic.flags |= SemanticModelFlags::FUTURES_BOUNDARY;
}
}
Stmt::Import(_) => {
self.semantic.flags |= SemanticModelFlags::MODULE_DOCSTRING_BOUNDARY;
self.semantic.flags |= SemanticModelFlags::FUTURES_BOUNDARY;
}
_ => {
self.semantic.flags |= SemanticModelFlags::MODULE_DOCSTRING_BOUNDARY;
self.semantic.flags |= SemanticModelFlags::FUTURES_BOUNDARY;
if !(self.semantic.seen_import_boundary()
|| stmt.is_ipy_escape_command_stmt()
|| helpers::is_assignment_to_a_dunder(stmt)
@@ -1131,6 +1135,8 @@
}

fn visit_expr(&mut self, expr: &'a Expr) {
self.syntax_checker.enter_expr(expr);

// Step 0: Pre-processing
if self.source_type.is_stub()
&& self.semantic.in_class_base()
@@ -2674,7 +2680,7 @@ pub(crate) fn check_ast(
cell_offsets: Option<&CellOffsets>,
notebook_index: Option<&NotebookIndex>,
target_version: PythonVersion,
) -> Vec<Diagnostic> {
) -> (Vec<Diagnostic>, Vec<SyntaxError>) {
let module_path = package
.map(PackageRoot::path)
.and_then(|package| to_module_path(package, path));
@@ -2739,5 +2745,31 @@
checker.analyze.scopes.push(ScopeId::global());
analyze::deferred_scopes(&checker);

checker.diagnostics.take()
let Checker {
diagnostics,
syntax_checker,
settings,
..
} = checker;

// partition the semantic syntax errors into Diagnostics and regular errors
let mut diagnostics = diagnostics.take();
let mut semantic_syntax_errors = Vec::new();
for semantic_syntax_error in syntax_checker.finish() {
match semantic_syntax_error.kind {
SyntaxErrorKind::LateFutureImport => {
if settings.rules.enabled(Rule::LateFutureImport) {
diagnostics.push(Diagnostic::new(
LateFutureImport,
semantic_syntax_error.range,
));
}
}
SyntaxErrorKind::ReboundComprehensionVariable => {
semantic_syntax_errors.push(semantic_syntax_error);
}
}
}

(diagnostics, semantic_syntax_errors)
}
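The `LateFutureImport` arm above corresponds to pyflakes' F404: a `from __future__` import is only valid before any other statement (a module docstring may precede it), and CPython itself rejects a late one at compile time. A small Python illustration of both cases (the source strings are invented for this example):

```python
def compiles(src: str) -> bool:
    """Return True if `src` compiles without a SyntaxError."""
    try:
        compile(src, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# A __future__ import after another statement is a compile-time error...
late = compiles("import os\nfrom __future__ import annotations\n")
# ...but at the top of the module (even after a docstring) it is fine.
early = compiles('"""docstring"""\nfrom __future__ import annotations\nimport os\n')

print(late, early)  # False True
```

This is why the diff above routes `LateFutureImport` through the existing F404 rule as a `Diagnostic` when that rule is selected, while other kinds are surfaced as hard semantic syntax errors.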
27 changes: 23 additions & 4 deletions crates/ruff_linter/src/linter.rs
@@ -6,6 +6,7 @@ use std::path::Path;
use anyhow::{anyhow, Result};
use colored::Colorize;
use itertools::Itertools;
use ruff_python_syntax_errors::SyntaxError;
use rustc_hash::FxHashMap;

use ruff_diagnostics::Diagnostic;
@@ -77,6 +78,9 @@ pub fn check_path(
// Aggregate all diagnostics.
let mut diagnostics = vec![];

// Aggregate all semantic syntax errors.
let mut semantic_syntax_errors = vec![];

let tokens = parsed.tokens();
let comment_ranges = indexer.comment_ranges();

@@ -148,7 +152,7 @@
let cell_offsets = source_kind.as_ipy_notebook().map(Notebook::cell_offsets);
let notebook_index = source_kind.as_ipy_notebook().map(Notebook::index);
if use_ast {
diagnostics.extend(check_ast(
let (new_diagnostics, new_semantic_syntax_errors) = check_ast(
parsed,
locator,
stylist,
@@ -162,7 +166,9 @@
cell_offsets,
notebook_index,
target_version,
));
);
diagnostics.extend(new_diagnostics);
semantic_syntax_errors.extend(new_semantic_syntax_errors);
}
if use_imports {
let import_diagnostics = check_imports(
@@ -327,10 +333,17 @@ pub fn check_path(
&[]
};

let semantic_syntax_errors = if settings.preview.is_enabled() {
semantic_syntax_errors.as_slice()
} else {
&[]
};

Member commented:

Let's move this check into `Checker` so that we avoid collecting the semantic syntax errors if preview mode is off. (Ideally, we'd skip the entire `SemanticSyntaxVisitor`, but I think that would make the code unnecessarily awkward. It might still be worth it if you're concerned about panics, since it gives users a way to opt out of the new semantic syntax visitor by disabling preview.)

Contributor Author replied:

Ah, I see what you mean about it being awkward. I thought I could just do something like:

if self.settings.preview.is_enabled() {
    self.syntax_checker.visit_stmt(stmt);
}

But this loses F404 detection when preview is disabled. For now, I moved the check into `check_ast` to avoid pushing onto the `semantic_syntax_errors` Vec at all when preview is off.

I don't think there's any panic-prone code in here, but I'll look back over it with that in mind. If we keep the context trait, the `RefCell` stuff is the most panicky, I think, but the encapsulated usage here should be about as safe as it gets.


diagnostics_to_messages(
diagnostics,
parsed.errors(),
syntax_errors,
semantic_syntax_errors,
path,
locator,
directives,
@@ -445,8 +458,8 @@ pub fn lint_only(
);

LinterResult {
has_syntax_error: messages.iter().any(Message::is_syntax_error),
messages,
has_syntax_error: parsed.has_syntax_errors(),
}
}

@@ -455,6 +468,7 @@
diagnostics: Vec<Diagnostic>,
parse_errors: &[ParseError],
unsupported_syntax_errors: &[UnsupportedSyntaxError],
semantic_syntax_errors: &[SyntaxError],
path: &Path,
locator: &Locator,
directives: &Directives,
Expand All @@ -476,6 +490,11 @@ fn diagnostics_to_messages(
.chain(unsupported_syntax_errors.iter().map(|syntax_error| {
Message::from_unsupported_syntax_error(syntax_error, file.deref().clone())
}))
.chain(
semantic_syntax_errors
.iter()
.map(|error| Message::from_semantic_syntax_error(error, file.deref().clone())),
)
.chain(diagnostics.into_iter().map(|diagnostic| {
let noqa_offset = directives.noqa_line_for.resolve(diagnostic.start());
Message::from_diagnostic(diagnostic, file.deref().clone(), noqa_offset)
Expand Down Expand Up @@ -547,7 +566,7 @@ pub fn lint_fix<'a>(
);

if iterations == 0 {
is_valid_syntax = parsed.has_no_syntax_errors();
is_valid_syntax = !messages.iter().any(Message::is_syntax_error);
} else {
// If the source code was parseable on the first pass, but is no
// longer parseable on a subsequent pass, then we've introduced a
13 changes: 13 additions & 0 deletions crates/ruff_linter/src/message/mod.rs
@@ -3,6 +3,7 @@ use std::collections::BTreeMap;
use std::io::Write;
use std::ops::Deref;

use ruff_python_syntax_errors::SyntaxError;
use rustc_hash::FxHashMap;

pub use azure::AzureEmitter;
@@ -135,6 +136,18 @@ impl Message {
})
}

/// Create a [`Message`] from the given [`SyntaxError`].
pub fn from_semantic_syntax_error(
semantic_syntax_error: &SyntaxError,
file: SourceFile,
) -> Message {
Message::SyntaxError(SyntaxErrorMessage {
message: format!("SyntaxError: {semantic_syntax_error}"),
range: semantic_syntax_error.range,
file,
})
}

pub const fn as_diagnostic_message(&self) -> Option<&DiagnosticMessage> {
match self {
Message::Diagnostic(m) => Some(m),