Skip to content

BSV Language Reference Grammar Issues #764

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
robertszafa opened this issue Mar 10, 2025 · 4 comments
Open

BSV Language Reference Grammar Issues #764

robertszafa opened this issue Mar 10, 2025 · 4 comments

Comments

@robertszafa
Copy link

While implementing a tree-sitter parser for BSV, I noticed a few minor issues with the grammar description in the BSV Language Reference Guide. Some examples:

  1. The packageStmt rule is missing the externCImport choice.
  2. The methodDef rule does not allow an empty methodFormals list.
  3. The actionStmt and actionValueStmt misses a ; at the end of the regWrite choice.

Besides such minor bugs, there is another question if the BSV Language Reference should be changed to align more closely with the actual bsc compiler implementation. For example:

  1. The spec requires parens in the methodDef rule, but bsc does not. This method declaration is accepted without warning by bsc: method a getOutput;
  2. bsc allows structured bindings, but the spec is missing this. This is accepted by bsc: {b, i, .*} <- mkSub;

If there is interest, I can put together a PR with fixes to the BSV Language Reference Guide, either just the first type of issue or both.

@yuyuranium
Copy link

Hi,

I also implemented a tree-sitter parser for BSV and noticed these issues in the reference guide!

There are undocumented or incorrect grammar rules/specifications in the reference guide, such as elements like function types and namespaced identifiers

While my implementation lacks unit tests for the grammars, I use the tree-sitter parser for syntax highlighting in my Neovim, which works almost perfectly.

I thought you might find my implementation useful.

@quark17
Copy link
Collaborator

quark17 commented Mar 14, 2025

Very cool to hear about these parsers. And thank you for bringing discrepancies to our attention! Yes, corrections to the grammar would be great. (Or potentially corrections to BSC.)

For the two questions you have, I think I would need to see some examples, to understand what we're talking about.

For (2): BSC does support pattern matching of bindings using the match keyword, but I think what you're pointing out is that BSC seems to have special handling of pattern matches for tuples. For example, these are the expected syntax for pattern binding:

match ST {f1: .b, f2: .v} <- mkM;  // match a struct

match {.b,.v} <- mkM;  // match a tuple

These declare new variable names; if you tried to precede the binding with a type declaration for the names, BSC would report an error:

Bool b;
Bit#(8) v;
match {.b, .v} <- mkM;

Error: "Example.bsv", line 8, column 9: (T0011)
  Declaration of `b' conflicts with previous declaration at
  "Example.bsv", line 6, column 8

However, in the testsuite directory bsc/testsuite/bsc.syntax/bsv05/ containing examples for testing the BSV parser, there are a series of examples ImperativeTupleBind*.bsv that seem to test a special syntax for binding tuples of predeclared variables. There is no match keyword and the left-hand side is not a pattern, it can only be open/close curly braces with a comma-separated list of variable names. The names do not start with . because they are not pattern bindings, but .* is allowed in place of a variable name, where you want to ignore that element. Nested tuples are not allowed and patterns are not allowed (whether just binding a new name with .v or a pattern for a struct or union). Because it is not declaring a new variable name, the variables must have been previously declared (and possibly even previously assigned).

Bool b;
Bit#(8) v;
{b, v} <- mkM;

I was not aware of this syntax! It may be a vestige of the first days of BSV, when the parser was being written and the language (and style of coding) was being invented, by taking the BH code base and seeing how one would write in BSV. I don't have a sense of whether anyone writes BSV code with it now?

A limitation of match statements is that you can't specify the types of the new variables, so this syntax has that benefit. I think I'd rather find a way to add types to the match syntax, than use this syntax. The other thing that it does is allow binding to an existing variable -- to re-write that using match, you'd need a match statement (binding new names) plus assignment statements (assigning back to the existing names). Here again, I'd prefer to add this as a feature of match (for example, use b<- or @b or some syntax to indicate where a variable in the pattern should be assigned, not declared or referenced). The feature also seems incomplete, since it doesn't generalize to structs or to nested tuples/structs. I would be tempted to remove it from BSC. But I'll bring it up with the people who designed the language.

I haven't yet looked into question 1 -- if you have more detail on that, that'd be helpful. I do often use the syntax method v = expr and I seem to recall that a syntax without types is allowed (method f(x); ... endmethod and probably method f(x) = expr), but I'm less clear on what partial types can be provided. You're pointing out that BSC seems to be allowing no parentheses when also specifying a return type? I don't know if that's intentional or just a result of how parsing was decomposed (perhaps the parsing of the argument list is independent from the parsing of the return type, resulting in this mix being accepted). Again, I'll bring it up with the language folks, if we have some examples of what is and isn't allowed. Thanks!

@robertszafa
Copy link
Author

These are some examples of the discrepancies I found in the grammar, along with some proposed fixes. @yuyuranium, feel free to add to this list if you found something I hadn't. It is possible that I have misunderstood the spec grammar in some places and that some of the below issues are not actually issues, in which case please point this out.

Rule Issue Possible fix
1. sourceFile 'package' and 'endpackage' tokens are required in spec, but optional in bsc. Make 'package' and 'endpackage' tokens optional in spec.
2. packageStmt varAssign choice allows let statements (varAssign ::= let identifier <- expression ;) which bsc does not accept at package scope. Change varAssign to lValue '=' rValue, i.e. do not allow varAssign that uses match or let.
3. packageStmt externCImport is missing from the choice. Add externCImport to the packageStmt choice.
4. type bsc accepts qualified types, e.g., Prelude::List in bsc/testsuite/bsc.syntax/bsv05/QualifiedType.bsv. The spec type rule does not. Add optional Identifier:: qualifier to type.
5. typePrimary bsc sometimes accepts ? instead of a type. E.g. mkReg#(?) the_x(x); in bsc/testsuite/bsc.syntax/bsv05/GCD.bsv. Add ? choice to typePrimary.
6. typePrimary bsc allows function prototypes as types. E.g. bsc/testsuite/bsc.syntax/bsv05/functionBit.bsv. Decouple the relevant bits from the functionProto rule into a standalone functionType rule. Add functionType as choice to typePrimary. Remove semicolon from functionProto to allow it to be used as a function parameter. Add the semicolon to any rules that use functionProto, e.g., functionDef.
7. functionFormal bsc allows to pass a functionProto as a parameter. E.g. bsc/testsuite/bsc.syntax/bsv05/functionBit.bsv. Add functionProto choice to functionFormal.
8. methodProtoFormal Same as above. Add functionProto choice to methodProtoFormal.
9. new subFunctionType bsc performs (partial) type inference for parameters and return values in subfunctions. E.g. bsc/testsuite/bsc.syntax/bsv05/functionBit.bsv. Add new subFunctionType, subFunctionDef, and subFunctionFormals rules where types are optional. Add subFunctionDef choice to the functionBodyStmt rule.
10. binOp The spec is mising the ** binary operator Add ** choice to binOp.
11. exprPrimary The spec is mising a qualified identifier, e.g., a List::nil rvalue is not allowed by the spec. Add qualified identifier choice to exprPrimary.
12. exprPrimary bsc allows type hinst in the form of hint'exprPrimary. E.g. in bsc/testsuite/bsc.syntax/bsv05/LookupU.bsv. Add type hint choice to exprPrimary.
13. actionStmt bsc accepts the special 'noAction' keyword here, but the spec does not. Add 'noAction' choice to actionStmt.
14. interfaceExpr The spec requires a semicolon, i.e., 'interface' Identifier ';', but bsc accepts code without the semicolon. Make the semicolon optional in the grammar spec interfaceExpr rule, or make bsc error out when the semicolon is missing.
15. rulesExpr The spec allows only one rulesStmt. Allow one, or more rulesStmt.
16. pattern bsc allows patterns to be nested in parentheses, but the spec does not. Allow parens around pattern.
17. taggedUnionPattern bsc allows to match a structPattern here, but the spec does not. Add 'tagged' structPattern choice to taggedUnionPattern.
18. fsmStmt The exprFsmStmt seems wrong here. E.g., it can match expression ;, which can be matched by actionBlock ;, but bsc does not allow semicolons after an ction block. Replace the exprFsmStmt choice in fsmStmt with actionStmt.
19. typeFormal The spec rule is [ 'numeric' ] 'type' typeIde, which does not allow nested types. The 'type' string should also be optional, I think. Change to [ 'numeric', 'string', 'parameter' ] ['type'] typePrimary.
20. subinterfaceDecl The spec rule is missing an identifier token before the semicolon. Add an identifier token.
21. moduleStmt The spec has a choice for varDo and varDeclDo, which I don't think are alowed at module scope. Remove varDo and varDeclDo choices.
22. moduleStmt functionCall and ( expression ) choices are missing semicolons at the end. Add semicolons at the end of functionCall and ( expression ) choices.
23. moduleStmt bsc allows naked case expressions at module scope, e.g. as seen in bsc/testsuite/bsc.syntax/bsv05/NakedExpr_Stmt.bsv. Add condExpr ; choice to moduleStmt.
24. moduleApp bsc allows module instances without parentheses, e.g. Reg#(Int) t <- mkRegU;. Make parentheses optional in moduleApp.
25. methodDef bsc allows method definitions (both in interfaces and mosules) without parentheses, e.g. method Action whatIsTheAnswer;. Make parentheses optional in methodDef.
26. methodDef The spec has a bug where it requires at least one methodFormal in the parentheses. Make methodFormals optional.
27. typedefTaggedUnion The spec allows an empty body, i.e., no union members at all, but bsc does require at least one member. Require at least one tagged union member in the spec.
28. overloadedDef The spec rule misses choices for moduleDef and moduleProto. Add moduleDef and moduleProto choices.
29. typeclassInstanceDef The spec has an erroneous semicolon after varAssign. Remove semicolon.
30. typeclassInstanceDef bsc performs partial type inference inside typeclass instances. Change functionDef to the new rule subFunctionDef which allows to omit some types.

I think section 9 on variable declarations and statements would benefit from a cleanup. For example, the regWrite rule does not require a semicolon at the end, which is a mistake. I did not include rules from this section in the above table, because I think the fixes here require a bigger change. I propose to introduce a new rValue rule which would capture expressions on the right of an assignment (be it =, <=, or whatever), and which would be terminated by a semicolon. I also propose to extend the existing lValue rule to capture more patterns that can appear on the left, e.g., array access lValue [ expression : expression ], or the discussed above tupleBind syntax without match, or a functionCall which seems to be allowed as an lValue (e.g., a function returning a Reg). After these changes, regWrite can be simply expressed as lValue <= rValue. varDecl, varInit, varAssign, varDeclDo, and varDo can be similarly simplified.

Finally, just a comment about function/method syntax; the following is not a grammar bug. In functionCall and methodCall, both bsc and the language spec allow to omit parentheses. Making parentheses mandatory here is probably too late, but I think it would make code a bit easier read and would be closer to the behaviour of most software languages. It would also make parsing a bit easier, since we would not have to disambiguate function and method calls with other identifiers. But I can understand if the parentheses are kept optional to align with SystemVerilog.

@ChienTeLi
Copy link

ChienTeLi commented Apr 9, 2025

Hi, @quark17
Yes, I happened to use this syntax:

Bool b;
Bit#(8) v;
{b, v} <- mkM;

It could even be written as:

let {b, v} <- mkM;

As you mentioned, the functionality is not complete. Besides structs or nested tuples/structs, I've also discovered that [] cannot be used within {}. I would greatly appreciate if this feature could be preserved or if an alternative syntax might be considered in future updates.
This is my current usage example. I haven't found a better approach yet.

Vector#(2, Reg#(Bool)) reg_inst;
StructType cfg;
let register;
List#(RegisterOperator) operators = Nil;
{register, operators} <- mkWoscReg ( 0*4, operators); 
reg_inst[0] = register;
{register, operators} <- mkRoReg   ( 1*4, operators); 
cfg.someting = register;
let reg_handler <- mkRegHandler(operators);

I would like to be able to write something similar to the following, or have corresponding syntax implemented in the future.

Vector#(2, Reg#(Bool)) reg_inst;
StructType cfg;
List#(RegisterOperator) operators = Nil;
{reg_inst[0] , operators} <- mkWoscReg ( 0*4, operators); // syntax error now
{cfg.someting, operators} <- mkRoReg   ( 1*4, operators); // syntax error now
let reg_handler <- mkRegHandler(operators);

In this example, using "match" would require mkWoscReg and mkRoReg to have fixed types, when I actually need flexibility to change the reg_inst type. Additionally, operators are repeatedly passed with appended information. Although we could work around this by using different variable names each line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants