-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Escape template filenames in glob patterns #16407
Conversation
|
349a56f
to
11917c4
Compare
I'm not sure what's going wrong on Windows. It looks like the |
Yeah, the filters are brutal on windows. Can you try what we have in ruff/crates/red_knot/tests/cli.rs Lines 863 to 865 in 3e840ca
|
Err(e) if *e.kind() == globset::ErrorKind::NestedAlternates => { | ||
let new = s.replace("{{", "[{][{]").replace("}}", "[}][}]"); | ||
log::warn!( | ||
"Error parsing original glob: `{s:?}`, trying with escaped braces: `{new:?}`" | ||
); | ||
Ok(Glob::new(&new)?) | ||
} | ||
glob => Ok(glob?), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This, indeed, feels somewhat hacky :D
I think my main concern with this is that it replaces nested alternates anywhere in the glob -- even in the part that the user provided.
What I understand is that the problem is mainly about {{
and }}
in the base path up to the project. Could we escape the {{
and }}
in the project's base-path only instead of applying this fix to the entire glob?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, although the nested alternates will always cause an error even if the user provided them. I thought it might be nice to allow them to avoid manual escaping in that case, but at least they can control that directly, unlike the base paths.
We could also add the escapes to the whole path just before creating the glob. I think that version would also look less hacky than catching this error and retrying, but then we also do the work of trying to replace each time. That was the main reason I tried to keep the replacement in the error case, but we're still only compiling the globs once, so it's probably not a big deal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, although the nested alternates will always cause an error even if the user provided them. I thought it might be nice to allow them to avoid manual escaping in that case, but at least they can control that directly, unlike the base paths.
The problem I see with this is that they may have wanted to use nested alternates and are now surprised that the glob doesn't match anything.
Just pushing a test for Windows, I haven't addressed the hacky path stuff yet. |
It looks like the regex escapes in
OH, maybe it's the cookiecutter causing problems again. The escapes for the braces might be getting in the way of the tempdir replacement. Let me try nesting it one level deeper. |
You can try to use |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a look at this!
I think the thing I'm not understanding here is why nested alternates are the only problem here? Isn't the problem that we're using arbitrary input to build a glob? And shouldn't that imply the input (in this case, a file path) should always be escaped? Like, the nested alternates is just one way for this problem to manifest. But what if a file path had a *
or a [
in it? We would want to escape it there too.
So I guess, can we fix this by just always escaping the literal inputs to a glob?
pub fn try_glob_new(s: &str) -> Result<Glob> { | ||
match Glob::new(s) { | ||
Err(e) if *e.kind() == globset::ErrorKind::NestedAlternates => { | ||
let new = s.replace("{{", "[{][{]").replace("}}", "[}][}]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also not technically right in all cases I think, e.g., [{{]
, although that's pretty pathological.
Thanks for the review! I failed to notice the |
This reverts commit 35cd606.
This reverts commit 1f778f0.
This reverts commit 931ff96.
as I saw in one of the early commits, these are a bit redundant with the context provided [here](https://github.com/astral-sh/ruff/blob/764aa0e6a1b16284c1c939332c19d742dae8aee1/crates/ruff_workspace/src/configuration.rs#L179-L182), but I think it's good to have for other uses
This is now a much simpler fix, thanks again for the reviews and suggestions! I also removed the warning, which was the source of the path in the snapshot, I realized it was also possible to trigger this for the non-absolute path, although in a pretty niche case. The test I added extends |
I was checking for the other uses of ruff/crates/ruff_linter/src/settings/types.rs Lines 162 to 166 in cf83584
I'll start adding the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this looks great
Ok(Self::User(pattern, absolute)) | ||
Ok(Self::User( | ||
s.to_string(), | ||
GlobPath::normalize(s, &*path_dedot::CWD), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we introduce a cwd
helper function in fs
. WASM doesn't like path_dedot::CWD
ruff/crates/ruff_linter/src/fs.rs
Lines 36 to 48 in e7a6c19
pub fn relativize_path<P: AsRef<Path>>(path: P) -> String { | |
let path = path.as_ref(); | |
#[cfg(target_arch = "wasm32")] | |
let cwd = Path::new("."); | |
#[cfg(not(target_arch = "wasm32"))] | |
let cwd = path_absolutize::path_dedot::CWD.as_path(); | |
if let Ok(path) = path.strip_prefix(cwd) { | |
return format!("{}", path.display()); | |
} | |
format!("{}", path.display()) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh good idea, I factored this out into a get_cwd
function and call it everywhere in the linter. I left the calls in the ruff
and ruff_workspace
crates alone for now. Should I replace those too?
None => fs::normalize_path(path), | ||
}; | ||
|
||
let project_root = project_root.unwrap_or(&path_dedot::CWD); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, let's avoid spreading path_dedot::CWD
over our code base and instead use a helper to maximize WASM compatibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Summary
Fixes #9381. This PR fixes errors like
caused by glob special characters in filenames like
{{cookiecutter.project_dirname}}
. When the user is matching that directory exactly, they can use the workaround given by #7959 (comment), but that doesn't work for a nested config file with relative paths. For example, the directory tree in the reproduction repo linked here:where the inner
pyproject.toml
contains a relative glob:Test Plan
A new CLI test in both the linter and formatter. The formatter test may not be necessary because I didn't have to modify any additional code to pass it, but the original report mentioned both
check
andformat
, so I wanted to be sure both were fixed.