-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
gh-131421: fix ASDL grammar for Dict
to have an expr?*
keys field
#131419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
dcc5610
2a19020
cbccd92
da764a3
9406ddb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,7 +12,7 @@ | |
# type ::= product | sum | ||
# product ::= fields ["attributes" fields] | ||
# fields ::= "(" { field, "," } field ")" | ||
# field ::= TypeId ["?" | "*"] [Id] | ||
# field ::= TypeId { "?" | "*" } [Id] | ||
# sum ::= constructor { "|" constructor } ["attributes" fields] | ||
# constructor ::= ConstructorId [fields] | ||
# | ||
|
@@ -64,34 +64,45 @@ def __init__(self, name, fields=None): | |
def __repr__(self): | ||
return 'Constructor({0.name}, {0.fields})'.format(self) | ||
|
||
class Quantifier: | ||
class _Optional: pass | ||
class _Sequence: pass | ||
OPTIONAL = _Optional() | ||
SEQUENCE = _Sequence() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could this just be an enum? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, I just prefer these to have unique/singleton values (as opposed to I can change it if its outside the expected coding style. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If you wanted singletons you could also just do There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (one reason to do the class trick is to have a bit of a nicer repr(), but OTOH, I don't think we need to worry about importing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it worth importing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes let's use an enum here, it's the right tool. |
||
|
||
class Field(AST): | ||
def __init__(self, type, name=None, seq=False, opt=False): | ||
def __init__(self, type, name=None, quantifiers=None): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a need to support more than two qualifiers? and/or shouldn't we check the consistency of those qualifiers as well? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Currently there are no more than two quantifiers on any type, as per my change to Python.asdl. I'm just implementing the EBNF grammar described at the top of the asdl.py file. And I don't think this files does any amount of extra validation so far anyway. |
||
self.type = type | ||
self.name = name | ||
self.seq = seq | ||
self.opt = opt | ||
self.seq = False | ||
self.opt = False | ||
self.quantifiers = quantifiers or [] | ||
if len(self.quantifiers) > 0: | ||
self.seq = self.quantifiers[-1] is Quantifier.SEQUENCE | ||
self.opt = self.quantifiers[-1] is Quantifier.OPTIONAL | ||
|
||
def __str__(self): | ||
if self.seq: | ||
extra = "*" | ||
elif self.opt: | ||
extra = "?" | ||
else: | ||
extra = "" | ||
extra = "" | ||
for mod in self.quantifiers: | ||
if mod is Quantifier.SEQUENCE: | ||
extra += "*" | ||
elif mod is Quantifier.OPTIONAL: | ||
extra += "?" | ||
|
||
return "{}{} {}".format(self.type, extra, self.name) | ||
|
||
def __repr__(self): | ||
if self.seq: | ||
extra = ", seq=True" | ||
elif self.opt: | ||
extra = ", opt=True" | ||
else: | ||
extra = "" | ||
extra = "" | ||
for mod in self.quantifiers: | ||
if mod is Quantifier.SEQUENCE: | ||
extra += ", SEQUENCE" | ||
elif mod is Quantifier.OPTIONAL: | ||
extra += ", OPTIONAL" | ||
|
||
if self.name is None: | ||
return 'Field({0.type}{1})'.format(self, extra) | ||
return 'Field({0.type}, quantifiers=[{1}])'.format(self, extra) | ||
else: | ||
return 'Field({0.type}, {0.name}{1})'.format(self, extra) | ||
return 'Field({0.type}, {0.name}, quantifiers=[{1}])'.format(self, extra) | ||
|
||
class Sum(AST): | ||
def __init__(self, types, attributes=None): | ||
|
@@ -314,10 +325,10 @@ def _parse_fields(self): | |
self._match(TokenKind.LParen) | ||
while self.cur_token.kind == TokenKind.TypeId: | ||
typename = self._advance() | ||
is_seq, is_opt = self._parse_optional_field_quantifier() | ||
quantifiers = self._parse_optional_field_quantifier() | ||
id = (self._advance() if self.cur_token.kind in self._id_kinds | ||
else None) | ||
fields.append(Field(typename, id, seq=is_seq, opt=is_opt)) | ||
fields.append(Field(typename, id, quantifiers=quantifiers)) | ||
if self.cur_token.kind == TokenKind.RParen: | ||
break | ||
elif self.cur_token.kind == TokenKind.Comma: | ||
|
@@ -339,14 +350,15 @@ def _parse_optional_attributes(self): | |
return None | ||
|
||
def _parse_optional_field_quantifier(self): | ||
is_seq, is_opt = False, False | ||
if self.cur_token.kind == TokenKind.Asterisk: | ||
is_seq = True | ||
self._advance() | ||
elif self.cur_token.kind == TokenKind.Question: | ||
is_opt = True | ||
quantifiers = [] | ||
while self.cur_token.kind in (TokenKind.Asterisk, TokenKind.Question): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't we check that we don't have There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe. But both ?? and ** are legitimate types, ** perhaps more legitimate than ?? is. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah yes, actually the current parser doesn't care. I thought that it would have warned the user. And should I have no strong opinion now on whether we should warn There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is essentially no change to the C code at all at this point. I don't know how this was handled before, since expr?* was always there implicitly, but this didn't seem to matter. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah I think this is fine as is |
||
if self.cur_token.kind == TokenKind.Asterisk: | ||
quantifiers.append(Quantifier.SEQUENCE) | ||
elif self.cur_token.kind == TokenKind.Question: | ||
quantifiers.append(Quantifier.OPTIONAL) | ||
self._advance() | ||
return is_seq, is_opt | ||
|
||
Demonstrandum marked this conversation as resolved.
Show resolved
Hide resolved
|
||
return quantifiers | ||
|
||
def _advance(self): | ||
""" Return the value of the current token and read the next one into | ||
|
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kind of change makes me wonder whether we should also add '+' to indicate 1 or more items. For instance
nonlocal
requires at least one name, same forimport
,global
and others.Similarly, there is no explanation of what
*
and?
actually mean. While they can be inferred, I would expect a caption for them. WDYT @JelleZijlstra?