[ruff] Stabilize unraw-re-pattern (RUF039) #16644

Closed
ntBre wants to merge 1 commit into micha/ruff-0.10 from brent/ruf039-0.10

Conversation

ntBre
Contributor

@ntBre ntBre commented Mar 11, 2025

Summary

Stabilizes RUF039 and moves its tests out of `preview_rules`.

Test Plan

Two issues, both closed 3 days after the rule was added; otherwise none. However, we may need a better autofix for regexes that contain escapes. The ecosystem check should show whether that is the case.

@ntBre ntBre added the rule Implementing or modifying a lint rule label Mar 11, 2025
@ntBre ntBre added this to the v0.10 milestone Mar 11, 2025
codspeed-hq bot commented Mar 11, 2025

CodSpeed Performance Report

Merging #16644 will degrade performances by 12.73%

Comparing brent/ruf039-0.10 (b107666) with micha/ruff-0.10 (85f7871)

Summary

❌ 1 regressions
✅ 31 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
|---|---|---|---|
| `red_knot_check_file[incremental]` | 4.8 ms | 5.5 ms | -12.73% |

Contributor

ruff-ecosystem results

Linter (stable)

ℹ️ ecosystem check detected linter changes. (+223 -0 violations, +0 -0 fixes in 14 projects; 41 projects unchanged)

RasaHQ/rasa (+10 -0 violations, +0 -0 fixes)

+ rasa/utils/io.py:222:9: RUF039 [*] First argument to `re.compile()` is not raw string
+ rasa/utils/io.py:223:9: RUF039 First argument to `re.compile()` is not raw string
+ rasa/utils/io.py:224:9: RUF039 First argument to `re.compile()` is not raw string
+ rasa/utils/io.py:225:9: RUF039 First argument to `re.compile()` is not raw string
+ rasa/utils/io.py:226:9: RUF039 First argument to `re.compile()` is not raw string
+ rasa/utils/io.py:227:9: RUF039 First argument to `re.compile()` is not raw string
+ rasa/utils/io.py:228:9: RUF039 First argument to `re.compile()` is not raw string
+ rasa/utils/io.py:229:9: RUF039 First argument to `re.compile()` is not raw string
+ rasa/utils/io.py:230:9: RUF039 First argument to `re.compile()` is not raw string
+ rasa/utils/io.py:231:9: RUF039 [*] First argument to `re.compile()` is not raw string

apache/airflow (+42 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --no-preview --select ALL

+ airflow/api_fastapi/core_api/routes/public/providers.py:35:19: RUF039 [*] First argument to `re.sub()` is not raw string
+ airflow/cli/commands/remote_commands/provider_command.py:33:19: RUF039 [*] First argument to `re.sub()` is not raw string
+ dev/breeze/src/airflow_breeze/params/build_prod_params.py:93:21: RUF039 [*] First argument to `re.match()` is not raw string
+ dev/breeze/src/airflow_breeze/utils/run_tests.py:125:19: RUF039 [*] First argument to `re.sub()` is not raw string
+ dev/perf/dags/elastic_dag.py:73:19: RUF039 [*] First argument to `re.sub()` is not raw string
+ docs/exts/docs_build/lint_checks.py:49:46: RUF039 [*] First argument to `re.findall()` is not raw string
+ helm_tests/airflow_aux/test_pod_template_file.py:358:26: RUF039 [*] First argument to `re.search()` is not raw string
+ helm_tests/airflow_aux/test_pod_template_file.py:370:26: RUF039 [*] First argument to `re.search()` is not raw string
+ helm_tests/airflow_aux/test_pod_template_file.py:407:26: RUF039 [*] First argument to `re.search()` is not raw string
+ helm_tests/airflow_aux/test_pod_template_file.py:59:26: RUF039 [*] First argument to `re.search()` is not raw string
... 32 additional changes omitted for project

apache/superset (+67 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --no-preview --select ALL

+ RELEASING/changelog.py:275:26: RUF039 [*] First argument to `re.match()` is not raw string
+ superset/db_engine_specs/athena.py:30:5: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/bigquery.py:81:5: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/bigquery.py:82:5: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/bigquery.py:95:5: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:29:46: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:30:48: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:32:9: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:35:9: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:37:46: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:39:9: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:41:43: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:43:9: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:46:9: RUF039 [*] First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:49:9: RUF039 [*] First argument to `re.compile()` is not raw string
... 52 additional changes omitted for project

binary-husky/gpt_academic (+11 -0 violations, +0 -0 fixes)

+ crazy_functions/agent_fns/python_comment_agent.py:274:42: RUF039 First argument to `re.compile()` is not raw string
+ crazy_functions/agent_fns/python_comment_agent.py:275:45: RUF039 First argument to `re.compile()` is not raw string
+ crazy_functions/vector_fns/general_file_loader.py:18:27: RUF039 First argument to `re.sub()` is not raw string
+ crazy_functions/vector_fns/general_file_loader.py:20:39: RUF039 [*] First argument to `re.compile()` is not raw string
+ crazy_functions/vector_fns/general_file_loader.py:32:27: RUF039 First argument to `re.sub()` is not raw string
+ crazy_functions/vector_fns/general_file_loader.py:33:27: RUF039 First argument to `re.sub()` is not raw string
+ crazy_functions/vector_fns/general_file_loader.py:53:51: RUF039 [*] First argument to `re.sub()` is not raw string
+ multi_language.py:131:32: RUF039 First argument to `re.compile()` is not raw string
+ shared_utils/text_mask.py:76:36: RUF039 First argument to `re.compile()` is not raw string
+ tests/test_python_auto_docstring.py:200:42: RUF039 First argument to `re.compile()` is not raw string
... 1 additional changes omitted for project

bokeh/bokeh (+9 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --no-preview --select ALL

+ src/bokeh/util/strings.py:91:19: RUF039 [*] First argument to `re.sub()` is not raw string
+ src/bokeh/util/strings.py:92:19: RUF039 First argument to `re.sub()` is not raw string
+ tests/unit/bokeh/core/test_templates.py:48:19: RUF039 First argument to `re.sub()` is not raw bytes literal
+ tests/unit/bokeh/io/test_export.py:203:9: RUF039 [*] First argument to `re.compile()` is not raw string
+ tests/unit/bokeh/io/test_export.py:204:13: RUF039 [*] First argument to `re.compile()` is not raw string
+ tests/unit/bokeh/io/test_export.py:205:13: RUF039 [*] First argument to `re.compile()` is not raw string
+ tests/unit/bokeh/io/test_export.py:206:9: RUF039 [*] First argument to `re.compile()` is not raw string
+ tests/unit/bokeh/server/test_server__server.py:211:28: RUF039 [*] First argument to `re.compile()` is not raw string
+ tests/unit/bokeh/server/test_server__server.py:219:36: RUF039 [*] First argument to `re.compile()` is not raw string

ibis-project/ibis (+9 -0 violations, +0 -0 fixes)

+ ibis/backends/__init__.py:1592:17: RUF039 [*] First argument to `re.match()` is not raw string
+ ibis/backends/athena/__init__.py:298:37: RUF039 [*] First argument to `re.search()` is not raw string
+ ibis/backends/flink/__init__.py:323:17: RUF039 [*] First argument to `re.search()` is not raw string
+ ibis/backends/sql/compilers/pyspark.py:360:27: RUF039 [*] First argument to `re.sub()` is not raw string
+ ibis/backends/tests/test_client.py:1062:13: RUF039 [*] First argument to `re.search()` is not raw string
+ ibis/backends/tests/test_client.py:1083:13: RUF039 [*] First argument to `re.search()` is not raw string
+ ibis/common/tests/test_patterns.py:246:31: RUF039 [*] First argument to `re.compile()` is not raw string
+ ibis/common/tests/test_patterns.py:246:65: RUF039 [*] First argument to `re.compile()` is not raw string
+ ibis/tests/expr/test_selectors.py:116:39: RUF039 [*] First argument to `re.compile()` is not raw string

latchbio/latch (+4 -0 violations, +0 -0 fixes)

+ src/latch_cli/centromere/ctx.py:479:26: RUF039 [*] First argument to `re.match()` is not raw string
+ src/latch_cli/services/init/init.py:304:18: RUF039 [*] First argument to `re.search()` is not raw string
+ src/latch_cli/services/init/init.py:311:18: RUF039 [*] First argument to `re.search()` is not raw string
+ src/latch_cli/services/register/register.py:59:36: RUF039 [*] First argument to `re.compile()` is not raw string

lnbits/lnbits (+1 -0 violations, +0 -0 fixes)

+ lnbits/db.py:151:34: RUF039 [*] First argument to `re.compile()` is not raw string

pandas-dev/pandas (+21 -0 violations, +0 -0 fixes)

+ pandas/io/formats/excel.py:421:35: RUF039 [*] First argument to `re.search()` is not raw string
+ pandas/io/formats/style_render.py:2523:28: RUF039 First argument to `re.findall()` is not raw string
+ pandas/io/formats/style_render.py:2525:28: RUF039 First argument to `re.findall()` is not raw string
+ pandas/io/formats/style_render.py:2528:32: RUF039 First argument to `re.findall()` is not raw string
+ pandas/io/formats/style_render.py:2530:32: RUF039 First argument to `re.findall()` is not raw string
+ pandas/tests/dtypes/test_inference.py:462:44: RUF039 [*] First argument to `re.compile()` is not raw string
+ pandas/tests/dtypes/test_inference.py:473:43: RUF039 [*] First argument to `re.compile()` is not raw string
+ pandas/tests/extension/test_arrow.py:1803:25: RUF039 [*] First argument to `re.compile()` is not raw string
+ pandas/tests/frame/methods/test_replace.py:1368:28: RUF039 [*] First argument to `re.compile()` is not raw string
+ pandas/tests/indexes/datetimes/test_date_range.py:787:29: RUF039 [*] First argument to `re.split()` is not raw string
... 11 additional changes omitted for project

pypa/build (+2 -0 violations, +0 -0 fixes)

+ tests/test_integration.py:31:21: RUF039 [*] First argument to `re.compile()` is not raw string
+ tests/test_integration.py:32:21: RUF039 [*] First argument to `re.compile()` is not raw string

python-poetry/poetry (+16 -0 violations, +0 -0 fixes)

+ src/poetry/console/logging/formatters/builder_formatter.py:11:26: RUF039 [*] First argument to `re.sub()` is not raw string
+ src/poetry/console/logging/formatters/builder_formatter.py:13:26: RUF039 [*] First argument to `re.sub()` is not raw string
+ src/poetry/console/logging/formatters/builder_formatter.py:15:26: RUF039 [*] First argument to `re.sub()` is not raw string
+ src/poetry/console/logging/formatters/builder_formatter.py:18:17: RUF039 [*] First argument to `re.sub()` is not raw string
+ src/poetry/mixology/solutions/providers/python_requirement_solution_provider.py:22:13: RUF039 [*] First argument to `re.match()` is not raw string
+ src/poetry/mixology/solutions/providers/python_requirement_solution_provider.py:23:13: RUF039 [*] First argument to `re.match()` is not raw string
+ src/poetry/puzzle/provider.py:736:21: RUF039 [*] First argument to `re.sub()` is not raw string
+ src/poetry/utils/dependency_specification.py:192:13: RUF039 [*] First argument to `re.sub()` is not raw string
+ tests/config/test_config.py:58:46: RUF039 [*] First argument to `re.sub()` is not raw string
+ tests/conftest.py:378:20: RUF039 [*] First argument to `re.compile()` is not raw string
... 6 additional changes omitted for project

... Truncated remaining completed project reports due to GitHub comment length restrictions

Changes by rule (1 rules affected)

| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| RUF039 | 223 | 223 | 0 | 0 | 0 |

Linter (preview)

✅ ecosystem check detected no linter changes.

@ntBre
Contributor Author

ntBre commented Mar 11, 2025

The first rasa case is pretty annoying (it looks like @MichaReiser flagged this in the original PR too):
https://github.com/RasaHQ/rasa/blob/b8de3b231126747ff74b2782cb25cb22d2d898d7/rasa/utils/io.py#L221-L231

It involves an implicitly-concatenated string, so it gets 10 instances of RUF039 in the same snippet. 8 of them involve Unicode escapes (\U...) and won't be auto-fixable.
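
To make the per-part reporting concrete, here is a minimal sketch that lists the non-raw literal parts of an implicitly concatenated pattern using the standard `tokenize` module. It is not how Ruff implements RUF039; the helper name `non_raw_parts` and the sample source are hypothetical and exist only for illustration.

```python
import io
import tokenize


def non_raw_parts(source: str) -> list[str]:
    """Return every string-literal token in `source` that lacks an r/R prefix."""
    parts = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        # The prefix is everything before the first quote character.
        quote_at = min(i for i, ch in enumerate(tok.string) if ch in "'\"")
        if "r" not in tok.string[:quote_at].lower():
            parts.append(tok.string)
    return parts


# Hypothetical snippet: one call, three literal parts, two of them non-raw.
SAMPLE = r'''
import re
pattern = re.compile(
    "[a-z]+"
    r"\d+"
    "\\s*"
)
'''

for part in non_raw_parts(SAMPLE):
    print("non-raw part:", part)
```

Each part that comes back would correspond to one diagnostic under per-part reporting, which is why a single multi-line pattern can produce many RUF039 hits.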

superset has some implicitly-concatenated strings with multiple diagnostics too. I'm not sure if we can handle this better, and these are auto-fixable at least.

Almost all of the gpt_academic cases have escapes and no auto-fixes.

bokeh is mostly auto-fixable but has one concatenated string with 4 diagnostics.

Summary

I looked at all of the results in the comment, and I would say that the majority have an auto-fix. I think getting multiple diagnostics for implicitly-concatenated strings is probably the most annoying aspect, but I don't think we can do much about that, short of joining the string, which doesn't seem right at all.

@ntBre ntBre mentioned this pull request Mar 12, 2025
2 tasks
@InSyncWithFoo
Contributor

I think RUF039 can still be made quite a bit smarter. Let's not stabilize it yet.

@MichaReiser
Member

> superset has some implicitly-concatenated strings with multiple diagnostics too. I'm not sure if we can handle this better, and these are auto-fixable at least.

We could create a single diagnostic with a single fix for all parts. The downside is that we couldn't provide a fix if any of the parts isn't autofixable, but that's probably desired to avoid creating strings with mixed normal and raw string parts.
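
A rough sketch of that all-or-nothing behaviour, written in Python purely for illustration. The `Part` and `Diagnostic` types and the `merge` function are hypothetical and do not correspond to anything in Ruff's codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Part:
    """One literal of an implicitly concatenated pattern (hypothetical model)."""
    text: str                # literal as written in the source
    raw_fix: Optional[str]   # replacement literal, or None if not auto-fixable


@dataclass
class Diagnostic:
    message: str
    fix: Optional[list[str]]  # one replacement per part, or None for no fix


def merge(parts: list[Part]) -> Diagnostic:
    """Report the whole pattern once; offer a fix only if every part has one."""
    fixes = [p.raw_fix for p in parts]
    all_fixable = all(f is not None for f in fixes)
    return Diagnostic(
        message="First argument to `re.compile()` is not raw string",
        fix=fixes if all_fixable else None,
    )


# Every part has a raw replacement, so the merged diagnostic carries a fix.
fixable = merge([Part(r'"\d+"', r'r"\d+"'), Part(r'"\s*"', r'r"\s*"')])

# One part has no replacement, so no fix is offered for the whole pattern.
mixed = merge([Part(r'"\d+"', r'r"\d+"'), Part(r'"\U0001F600"', None)])
```

Withholding the fix in the `mixed` case is what prevents a result where some parts are raw and others are not.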

We could also make the raw string autofix understand which escape sequences are supported in `re.compile`, so that the majority of regex patterns become autofixable.
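
As background for this suggestion: Python's `re` module interprets several escapes itself in `str` patterns (for example `\n`, `\xhh`, and `\Uxxxxxxxx`), so for those a raw literal matches the same text as the non-raw one. `\b` is a counterexample where the meaning changes. The snippet below is a small standard-library illustration of that point, not a description of Ruff's fix logic.

```python
import re

# Each pair is (non-raw literal, raw literal written with the same characters).
same_behaviour = [
    ("\n", r"\n"),                  # newline
    ("\x41", r"\x41"),              # hex escape for "A"
    ("\U0001F600", r"\U0001F600"),  # Unicode escape (str patterns only)
]

for plain, raw in same_behaviour:
    # `plain` doubles as the text the pattern is expected to match.
    assert re.fullmatch(plain, plain) is not None
    assert re.fullmatch(raw, plain) is not None

# Counterexample: "\b" is a backspace character, but r"\b" is a word boundary.
assert re.search("\b", "a\bc") is not None
assert re.search(r"\b", "\b") is None
```

An autofix that knows this set of escapes could safely add the `r` prefix in the first group and leave cases like `\b` alone.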

@InSyncWithFoo what are the improvements you have in mind?

Holding back on this rule and creating a follow-up issue does make sense to me.

@InSyncWithFoo
Contributor

> what are the improvements you have in mind?

Not anything concrete yet, but I think it is possible to merge the diagnostics for implicitly concatenated string parts when they are the same (fixes, applicability). The fix could use some work too.

@ntBre
Contributor Author

ntBre commented Mar 12, 2025

Sounds good, I can close this for now. @InSyncWithFoo do you want to open the issue to track follow-up work? I'm happy to if not.

@ntBre ntBre closed this Mar 12, 2025
@InSyncWithFoo
Contributor

InSyncWithFoo commented Mar 12, 2025

I think I'll do that tomorrow, once I'm done with what I'm currently doing.

@ntBre ntBre deleted the brent/ruf039-0.10 branch May 1, 2025 15:31
Labels
rule Implementing or modifying a lint rule
3 participants