Speed up pattern parsing in pathlib.Path.glob()
#126363
barneygale added a commit to barneygale/cpython that referenced this issue on Nov 3, 2024:
The implementation of `Path.glob()` does rather a hacky thing: it calls `self.with_segments()` to convert the given pattern to a `Path` object, and then peeks at the private `_raw_path` attribute to see if pathlib removed a trailing slash from the pattern. In this patch, we make `glob()` use a new `_parse_pattern()` classmethod that splits the pattern into parts while preserving information about any trailing slash. This skips the cost of creating a `Path` object, and avoids some path anchor normalization, which makes `Path.glob()` slightly faster. But mostly it's about making the code less naughty.
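The splitting step described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: `parse_pattern` is a hypothetical stand-alone function (the patch adds a private `_parse_pattern()` classmethod), it handles POSIX-style separators only, and it treats a leading separator as the sole form of anchor.

```python
def parse_pattern(pattern, sep="/"):
    """Split a glob pattern into parts, keeping trailing-slash information.

    Hypothetical helper for illustration; POSIX-style separators only.
    """
    if pattern.startswith(sep):
        # Assumption: anchored (absolute) patterns are rejected up front,
        # so no anchor normalization is needed.
        raise NotImplementedError("Non-relative patterns are unsupported")
    # Drop empty segments (from repeated separators) and "." segments.
    parts = [part for part in pattern.split(sep) if part and part != "."]
    if not parts:
        raise ValueError(f"Unacceptable pattern: {pattern!r}")
    if pattern.endswith(sep):
        # Record the trailing slash as a final empty part, so the caller
        # does not need to inspect the raw pattern string again.
        parts.append("")
    return parts


relative_parts = parse_pattern("src/*.py")
dir_only_parts = parse_pattern("docs/**/")
```

Here `relative_parts` is `["src", "*.py"]` and `dir_only_parts` is `["docs", "**", ""]`. No `Path` object is constructed along the way, and the trailing slash survives as an explicit empty final part.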
barneygale added a commit that referenced this issue on Nov 4, 2024, with the same commit message as above plus the trailer "Co-authored-by: Tomas R."
picnixz pushed a commit ("…ython#126364)") to picnixz/cpython that referenced this issue on Dec 8, 2024, with the same commit message as above.
ebonnal pushed a commit ("…ython#126364)") to ebonnal/cpython that referenced this issue on Jan 12, 2025, with the same commit message as above.
`pathlib.Path.glob()` converts a given string pattern to a `Path` object:

cpython/Lib/pathlib/_local.py, lines 644 to 645 in dcae5cd

Which has the following drawbacks:

- It creates a `Path` object, which is slow
- […] `NotImplementedError` to be raised.
- The trailing slash is lost, so the code has to peek at `pattern._raw_path` to restore it

It would be better to add a variant of `PurePath._parse_path()` for dealing specifically with glob patterns.

Linked PRs

- […] `pathlib.Path.glob()` #126364
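The trailing-slash drawback mentioned in the issue can be observed with public pathlib APIs alone. The snippet below is a small demonstration rather than the code in `glob()`; the pattern string and variable names are illustrative.

```python
from pathlib import PurePosixPath

pattern = "docs/*/"

# Converting the pattern to a path object normalizes away the trailing slash.
as_path = PurePosixPath(pattern)
parts_from_path = as_path.parts   # ('docs', '*')
text_from_path = str(as_path)     # 'docs/*'

# The only remaining evidence of the slash is the original string itself,
# which is why glob() had to look back at the raw pattern.
had_trailing_slash = pattern.endswith("/")
```

Since a trailing slash in a glob pattern is meaningful (it restricts matches to directories), losing it during conversion means the information has to be recovered from somewhere else.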