-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Fix bad-string-format-character
when a placeholder occurs within a format spec part
#6773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/// ``` | ||
/// "x{foo}y{bar}z" | ||
/// ``` | ||
fn consume_all_placeholders<'a>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following implementation is a little clunky. I'm guessing there's a more idiomatic way to do this?
PR Check ResultsEcosystem✅ ecosystem check detected no changes. BenchmarkLinux
Windows
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I'm playing around with this a bit, and I'm starting to wonder if attempting to parse these format strings (when nested replacements are present) is sound.
Consider FormatSpec::parse("{a}30")
. In this case, 30
could either be the width or the precision depending on whether a
is >
or .
. So assigning it to the width (as we do now) is incorrect.
Similarly, for FormatSpec::parse("{a}s")
, this could either resolve to !s
or :s
depending on the value of a
, and so treating s
as either the format type or the conversion would be wrong.
In general, I just don't know that we can soundly analyze these when the contents are dynamic, which leads me to think we should use a different representation for them and avoid raising errors for them... What do you think?
@charliermarsh are you saying that the format string gets evaluated at runtime? Like, |
@MichaReiser - Yes that's correct. |
I do think that we have a possibility of false diagnostics with the current approach, but it seems better than giving up entirely. I don't think replacements are used to replace large parts of the format specification in practice and are typically used for values rather than clause identifiers. I could be convinced to just go to my initial approach which was an entirely different data structure. If we're not going to bother writing any diagnostics though we might as well just add a |
(I am the original reporter of this bug in: #6442 (comment).)
Maybe I am missing something here, but it seems the colon or exclamation mark must be included statically in the format string, as otherwise it produces a ValueError:
Adding the implicit zero does not make it work either:
|
@tdulcet - Sorry, you're right, it seems that it doesn't behave that way in that case. The first example does hold though (of width vs. precision), in my retesting. |
…re present (#6858) Closes #6767 Replaces #6773 (this cherry-picks some parts from there) Alternative to the approach introduced in #6616 which added support for placeholders in format specifications while retaining parsing of other format specification parts. The idea is that if there are placeholders in a format specification we will not attempt to glean semantic meaning from the other parts of the format specification we'll just extract all of the placeholders ignoring other characters. The dynamic content of placeholders can drastically change the meaning of the format specification in ways unknowable by static analysis. This change prevents false analysis and will ensure safety if we build other rules on top of this at the cost of missing detection of some bad specifications. Minor note: I've use "replacements" and "placeholders" interchangeably but am trying to go with "placeholder" as I think it's a better term for the static analysis concept here
Closes #6767
The presented false positive
"{0:.{prec}g}".format(1.23, prec=15)
occurs because we treated the entire format specification of.{prec}g
as the format type (which is typically at the end of a format specification). The nested replacement was not detected because a leading{
was not seen. Instead,.
remained the leading character — it's the leading character for a precision clause but was not consumed since the replacement was not a valid numberOne way to resolve this is to update parsing of every clause in a format specification to be replacement aware e.g. precision parsing would accept a replacement as well as a number. I determined that this is too complicated and provides little value for our current rules around format specifications.
Instead, after we've parsed most of the format specification parts, we will check for any remaining replacements (i.e. the presence of
{
anywhere in the remaining string) and consume them and any intermediate characters. This matches our existing intention of basically ignoring any replacements while retaining as much of the normal format specification linting as we can.