Skip to content

[syntax-errors] Detect duplicate keys in match mapping patterns #17129

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 3, 2025
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion crates/ruff_linter/src/checkers/ast/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -552,13 +552,18 @@ impl SemanticSyntaxContext for Checker<'_> {
| SemanticSyntaxErrorKind::MultipleCaseAssignment(_)
| SemanticSyntaxErrorKind::IrrefutableCasePattern(_)
| SemanticSyntaxErrorKind::SingleStarredAssignment
| SemanticSyntaxErrorKind::WriteToDebug(_) => {
| SemanticSyntaxErrorKind::WriteToDebug(_)
| SemanticSyntaxErrorKind::DuplicateMatchKey(_) => {
if self.settings.preview.is_enabled() {
self.semantic_errors.borrow_mut().push(error);
}
}
}
}

fn source(&self) -> &str {
self.source()
}
}

impl<'a> Visitor<'a> for Checker<'a> {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
match x:
case {"x": 1, "x": 2}: ...
case {b"x": 1, b"x": 2}: ...
case {0: 1, 0: 2}: ...
case {1.0: 1, 1.0: 2}: ...
case {1.0 + 2j: 1, 1.0 + 2j: 2}: ...
case {True: 1, True: 2}: ...
case {None: 1, None: 2}: ...
case {
"""x
y
z
""": 1,
"""x
y
z
""": 2}: ...
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
match x:
case {x.a: 1, x.a: 2}: ...
93 changes: 93 additions & 0 deletions crates/ruff_python_parser/src/semantic_errors.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ use std::fmt::Display;

use ruff_python_ast::{
self as ast,
comparable::ComparableExpr,
visitor::{walk_expr, Visitor},
Expr, ExprContext, IrrefutablePatternKind, Pattern, PythonVersion, Stmt, StmtExpr,
StmtImportFrom,
Expand Down Expand Up @@ -65,6 +66,7 @@ impl SemanticSyntaxChecker {
Stmt::Match(match_stmt) => {
Self::irrefutable_match_case(match_stmt, ctx);
Self::multiple_case_assignment(match_stmt, ctx);
Self::duplicate_match_mapping_keys(match_stmt, ctx);
}
Stmt::FunctionDef(ast::StmtFunctionDef { type_params, .. })
| Stmt::ClassDef(ast::StmtClassDef { type_params, .. })
Expand Down Expand Up @@ -240,6 +242,56 @@ impl SemanticSyntaxChecker {
}
}

fn duplicate_match_mapping_keys<Ctx: SemanticSyntaxContext>(stmt: &ast::StmtMatch, ctx: &Ctx) {
for mapping in stmt
.cases
.iter()
.filter_map(|case| case.pattern.as_match_mapping())
{
let mut seen = FxHashSet::default();
for key in mapping
.keys
.iter()
// complex numbers (`1 + 2j`) are allowed as keys but are not literals
// because they are represented as a `BinOp::Add` between a real number and
// an imaginary number
.filter(|key| key.is_literal_expr() || key.is_bin_op_expr())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could be a bit stricter here if we want (checking that the BinOp is an Add and the arguments are numbers), but we already report syntax errors for non-complex literals like 1 + 2, so I thought this might be sufficient.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that's fine, the parser should catch invalid complex literals.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also have to allow f-string expressions. They're allowed for as long as they contain no placeholders and they should compare equal to their string equivalent (I think this is already handled by ComparableExpr).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think f-strings are not allowed here by CPython, even without placeholders:

>>> match x:
...     case {f"x": 1}: ...
...
  File "<python-input-2>", line 2
    case {f"x": 1}: ...
         ^^^^^^^^^
SyntaxError: mapping pattern keys may only match literals and attribute lookups

In that case, I think our is_literal_expr is doing the right thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, f-strings are not allowed as literal patterns.

{
if !seen.insert(ComparableExpr::from(key)) {
let duplicate_key = ctx.source()[key.range()].to_string();
// test_ok duplicate_match_key_attr
// match x:
// case {x.a: 1, x.a: 2}: ...

// test_err duplicate_match_key
// match x:
// case {"x": 1, "x": 2}: ...
// case {b"x": 1, b"x": 2}: ...
// case {0: 1, 0: 2}: ...
// case {1.0: 1, 1.0: 2}: ...
// case {1.0 + 2j: 1, 1.0 + 2j: 2}: ...
// case {True: 1, True: 2}: ...
// case {None: 1, None: 2}: ...
// case {
// """x
// y
// z
// """: 1,
// """x
// y
// z
// """: 2}: ...
Self::add_error(
ctx,
SemanticSyntaxErrorKind::DuplicateMatchKey(duplicate_key),
mapping.range,
);
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should avoid breaking here and report all duplicate keys within a single mapping pattern. Or, do you see any limitation or challenges in that approach? What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just copying CPython, but that sounds reasonable too!

>>> match x:
...     case {"x": 1, "x": 2}: ...
...
  File "<python-input-223>", line 2
    case {"x": 1, "x": 2}: ...
         ^^^^^^^^^^^^^^^^
SyntaxError: mapping pattern checks duplicate key ('x')

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in general, we should prefer to surface all errors whenever possible which is one of the main motivation to have an error resilient parser :)

}
}
}
}

fn irrefutable_match_case<Ctx: SemanticSyntaxContext>(stmt: &ast::StmtMatch, ctx: &Ctx) {
// test_ok irrefutable_case_pattern_at_end
// match x:
Expand Down Expand Up @@ -462,6 +514,9 @@ impl Display for SemanticSyntaxError {
write!(f, "cannot delete `__debug__` on Python {python_version} (syntax was removed in 3.9)")
}
},
SemanticSyntaxErrorKind::DuplicateMatchKey(key) => {
write!(f, "mapping pattern checks duplicate key `{key}`")
}
}
}
}
Expand Down Expand Up @@ -575,6 +630,41 @@ pub enum SemanticSyntaxErrorKind {
///
/// [BPO 45000]: https://github.com/python/cpython/issues/89163
WriteToDebug(WriteToDebugKind),

/// Represents a duplicate key in a `match` mapping pattern.
///
/// The [CPython grammar] allows keys in mapping patterns to be literals or attribute accesses:
///
/// ```text
/// key_value_pattern:
/// | (literal_expr | attr) ':' pattern
/// ```
///
/// But only literals are checked for duplicates:
///
/// ```pycon
/// >>> match x:
/// ... case {"x": 1, "x": 2}: ...
/// ...
/// File "<python-input-160>", line 2
/// case {"x": 1, "x": 2}: ...
/// ^^^^^^^^^^^^^^^^
/// SyntaxError: mapping pattern checks duplicate key ('x')
/// >>> match x:
/// ... case {x.a: 1, x.a: 2}: ...
/// ...
/// >>>
/// ```
Comment on lines +709 to +721
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting!

Also, I think you meant "python" or "py" and not "pycon" xD

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I thought pycon was for the Python console [1]. That's what I use here on GitHub, although it doesn't usually do much syntax highlighting anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh lol, I had no idea about that. I don't think that matters as those are going to be rendered in an editor or crates.io.

///
/// ## Examples
///
/// ```python
/// match x:
/// case {"x": 1, "x": 2}: ...
/// ```
///
/// [CPython grammar]: https://docs.python.org/3/reference/grammar.html
DuplicateMatchKey(String),
}

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
Expand Down Expand Up @@ -705,6 +795,9 @@ pub trait SemanticSyntaxContext {
/// The target Python version for detecting backwards-incompatible syntax changes.
fn python_version(&self) -> PythonVersion;

/// Returns the source text under analysis.
fn source(&self) -> &str;

fn report_semantic_error(&self, error: SemanticSyntaxError);
}

Expand Down
25 changes: 19 additions & 6 deletions crates/ruff_python_parser/tests/fixtures.rs
Original file line number Diff line number Diff line change
Expand Up @@ -89,7 +89,7 @@ fn test_valid_syntax(input_path: &Path) {
let parsed = parsed.try_into_module().expect("Parsed with Mode::Module");

let mut visitor = SemanticSyntaxCheckerVisitor::new(
TestContext::default().with_python_version(options.target_version()),
TestContext::new(&source).with_python_version(options.target_version()),
);

for stmt in parsed.suite() {
Expand Down Expand Up @@ -185,7 +185,7 @@ fn test_invalid_syntax(input_path: &Path) {
let parsed = parsed.try_into_module().expect("Parsed with Mode::Module");

let mut visitor = SemanticSyntaxCheckerVisitor::new(
TestContext::default().with_python_version(options.target_version()),
TestContext::new(&source).with_python_version(options.target_version()),
);

for stmt in parsed.suite() {
Expand Down Expand Up @@ -462,21 +462,30 @@ impl<'ast> SourceOrderVisitor<'ast> for ValidateAstVisitor<'ast> {
}
}

#[derive(Debug, Default)]
struct TestContext {
#[derive(Debug)]
struct TestContext<'a> {
diagnostics: RefCell<Vec<SemanticSyntaxError>>,
python_version: PythonVersion,
source: &'a str,
}

impl TestContext {
impl<'a> TestContext<'a> {
fn new(source: &'a str) -> Self {
Self {
diagnostics: RefCell::default(),
python_version: PythonVersion::default(),
source,
}
}

#[must_use]
fn with_python_version(mut self, python_version: PythonVersion) -> Self {
self.python_version = python_version;
self
}
}

impl SemanticSyntaxContext for TestContext {
impl SemanticSyntaxContext for TestContext<'_> {
fn seen_docstring_boundary(&self) -> bool {
false
}
Expand All @@ -488,4 +497,8 @@ impl SemanticSyntaxContext for TestContext {
fn report_semantic_error(&self, error: SemanticSyntaxError) {
self.diagnostics.borrow_mut().push(error);
}

fn source(&self) -> &str {
self.source
}
}
Loading
Loading