Skip to content

LLVM verification #356

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 42 commits into from
Apr 24, 2025
Merged

LLVM verification #356

merged 42 commits into from
Apr 24, 2025

Conversation

AFOliveira
Copy link
Collaborator

Following up on #258.

I've been doing work on what @lenary proposed as a first approach, I think this is an ok first mock-up. Still a WIP since many instructions still have bugs, but if you have any comments or recommendations, please LMK :).

I did not add LLVM as a submodule yet because it may be easier licensing wise to just point at it in the script? Usage is python3 riscv_parser.py <tablegen_json_file> <arch_inst_directory>"). I've also used jq to enhance readibility on the output of llvm-tblgen -I llvm/include -I llvm/lib/Target/RISCV llvm/lib/Target/RISCV/RISCV.td --dump-json -o <path-to-json-output>.

Copy link
Collaborator

@lenary lenary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good start, and is heading in the right direction. Some immediate comments before I look again in the next few days.

Is there anything specific you want feedback on?

Which instructions are you running into issues with so far?

Another overall thought I have is: if this is a test/validation suite, could this be structured using pytest? For instance, you could parse the YAML and the JSON to match up instruction descriptions, and use that to create parameterized fixtures (one instance per matched description) - the advantage of this is that you get all of the niceness of pytest's asserts, and pytest's test suite reports, without having to reimplement all the testcase management. This also makes it easier to split the test code that checks the encoding matches from tests that check the assembly strings match (for example, but we can think of others).

@AFOliveira
Copy link
Collaborator Author

I think this is a good start, and is heading in the right direction. Some immediate comments before I look again in the next few days.

Thanks for your feedback!

Is there anything specific you want feedback on?

Not yet, the point was just some general considerations about the initial approach.

Which instructions are you running into issues with so far?

I still didnt find a pattern, but I'll try to fix this soon and if I run into any issue I'm not able to solve by myself, I'll try to bring it up, thanks!

Another overall thought I have is: if this is a test/validation suite, could this be structured using pytest? For instance, you could parse the YAML and the JSON to match up instruction descriptions, and use that to create parameterized fixtures (one instance per matched description) - the advantage of this is that you get all of the niceness of pytest's asserts, and pytest's test suite reports, without having to reimplement all the testcase management. This also makes it easier to split the test code that checks the encoding matches from tests that check the assembly strings match (for example, but we can think of others).

I'll take a look into pytest and see how to port this, thanks for the suggestion!

@AFOliveira
Copy link
Collaborator Author

AFOliveira commented Dec 19, 2024

current known bugs are:

  • Can't parse yaml complex imm encodings (i.e. 31|7|30-25|11-8)
  • Some instructions need to be fixed in the UDB, since this already caught some errors.
  • Need to split pytest into different unit tests rather than one big test.
  • Use ASMString from JSON.

I'm working on both now.

@lenary
Copy link
Collaborator

lenary commented Dec 19, 2024

Somewhere, a __pycache__ and *.pyc is missing from a gitignore. :)

@AFOliveira
Copy link
Collaborator Author

Unit testing is now working.

TODO:
Solve complex immediates
Add the optimization @lenary proposed on code structure
Add this to the validation process

This was referenced Dec 23, 2024
Signed-off-by: Afonso Oliveira <[email protected]>
@AFOliveira
Copy link
Collaborator Author

Can I insert LLVM as a submodule or would this be another licensing problem?

@AFOliveira
Copy link
Collaborator Author

AFOliveira commented Dec 23, 2024

This is functional. Things missing is how to address LLVM and adding it the test to Rakefile. I will also see how I can optimize code, e.g. what @lenary proposed on one review.

@lenary
Copy link
Collaborator

lenary commented Dec 23, 2024

Can I insert LLVM as a submodule or would this be another licensing problem?

I highly suggest using environment variables to point to LLVM, and skipping these tests if those environment variables are not set.

Signed-off-by: Afonso Oliveira <[email protected]>
@AFOliveira
Copy link
Collaborator Author

Sorry for the mess with force pushes, but I had some trouble finding what I was messing up with GH actions caching. However, I believe this PR is ready for merge after #428 is solved and after @dhower-qc and @lenary reviews.

Copy link
Collaborator

@lenary lenary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I've taken too long to get back to this.

I have one actionable comment, but let's just land this and fix-forward the verification/validation (i.e., new PRs)

Hopefully that makes sense?

@AFOliveira
Copy link
Collaborator Author

Sorry, I've taken too long to get back to this.

I have one actionable comment, but let's just land this and fix-forward the verification/validation (i.e., new PRs)

Hopefully that makes sense?

Yes, that makes total sense. @dhower-qc mentioned that this was on queue depending on #386, so likely I'll have to adapt to that as well. I guess I'll change both things at the time.

Thank for the review!

@AFOliveira
Copy link
Collaborator Author

@dhower-qc should be working now. Remember that the test is optional, so if the users don't have LLVM tablegen generated in their computers, it will just skip all tests.

@AFOliveira
Copy link
Collaborator Author

@dhower-qc Can I get an approval on this so I don't need to consistently come back and solve merge conflicts?

@dhower-qc
Copy link
Collaborator

@dhower-qc Can I get an approval on this so I don't need to consistently come back and solve merge conflicts?

Yep, thanks for the bump

@dhower-qc dhower-qc enabled auto-merge April 18, 2025 17:17
@AFOliveira
Copy link
Collaborator Author

@dhower-qc @kbroch-rivosinc the pre-commit is complaining about copyright licenses, should I just insert BSD-3 license?

@AFOliveira
Copy link
Collaborator Author

AFOliveira commented Apr 24, 2025

@kbroch-rivosinc Sorry for bothering about this, but is it now a requirement that all files introduced into UDB have an explicit LICENSE somewhere? To fix this I should introduce the BSD-3 reference on the top of each of the files I am adding?

reuse lint...............................................................Failed
- hook id: reuse
- exit code: 1

# MISSING COPYRIGHT AND LICENSING INFORMATION

The following files have no copyright and licensing information:
* ext/auto-inst/parsing.py
* ext/auto-inst/test_parsing.py

@AFOliveira AFOliveira disabled auto-merge April 24, 2025 14:24
@AFOliveira AFOliveira added this pull request to the merge queue Apr 24, 2025
Merged via the queue into main with commit 914e289 Apr 24, 2025
16 checks passed
@AFOliveira AFOliveira deleted the AFOliveira/LLVM branch April 24, 2025 16:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants