Investigate if/how "Verifying Peephole Rewriting In SSA Compiler IRs" can be used in HEIR #817

AlexanderViand-Intel · 2024-07-23T20:34:13Z

As some of you might be aware a group of researchers around Prof. Tobias Grosser from the University of Edinburgh/Cambridge University have been using (an earlier version of) our polynomial dialect as an example for their work on verifying MLIR in LEAN .

I'm happy to report that their paper¹ has been accepted at ITP 2024 and is now available: https://doi.org/10.4230/LIPIcs.ITP.2024.9 or https://arxiv.org/abs/2407.03685 🎉

While this is already a cool result in itself, the natural question would be to see if we can somehow make use of this in HEIR (or MLIR, if it concerns up streamed polynomial things). For example, I wonder if something like this could have helped us catch #749 and/or might be useful as we go into more advanced modular arithmetic optimizations and transitions between poly and mod arith. The paper also implements reasoning about bit vectors, so there's a good chance it could capture such lowerings/rewrites.

Siddharth Bhat, Alex Keizer, Chris Hughes, Andrés Goens, and Tobias Grosser.
Verifying Peephole Rewriting in SSA Compiler IRs.
In 15th International Conference on Interactive Theorem Proving (ITP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 309, pp. 9:1-9:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ITP.2024.9 ↩

The text was updated successfully, but these errors were encountered:

AlexanderViand-Intel · 2024-09-09T10:24:27Z

Adding examples of issues/bugs that motivate why formal verification might be interesting for polynomial->arith lowerings:

PR ArithExt Improvements: Arbitrary Coefficient Bitwidth + lowerings of modular add/sub/mul to arith #731 needed some fixing before merging, because I was not aware of arith restrictions on remui/remsi, meaning that we cannot fit, e.g., numbers modulo $2^{31}-1$ into i31 (artificial example) and need an extra bit to ensure things work as expected when lowered to LLVM.
PR [mod_arith] fix sub lowering #928 fixes an issue where "the current implementation would have incorrect behaviour if the arith.sub resulted in a negative signed number that was then interpreted as an unsigned int with arith.remui" which "is fixed by adding the modulus to the result before applying remui such that the range after sub shifts from (-mod, mod) -> [0, mod)"
Issue polynomial-to-standard: error on integer width for poly.add when ring coefficientModulus width is less than coefficient width #990 is another example of pre-‘mod_artih’ code not correctly handling lowering modular arithmetic to ‘arith’
PR mod_arith.reduce: require modulus width to be smaller than underlying type width #998 fixes an issue with bit-widths not being correct for reduce (technically during the review of PR [mod_arith] add reduce op #927 this was already noticed, but the fixes didn't make it in there, exposing this bug)

Maokami · 2025-03-19T18:05:33Z

I would like to share that, although it may seem straightforward, the correctness of the canonicalization passes for the ModArith dialect (google/heir#1506) has been successfully verified through translation validation using the lean-mlir project:

https://github.com/Maokami/lean-mlir/blob/main/SSA/Projects/ModArith/Examples.lean

(Note: I relied heavily on existing code within lean-mlir.)

Though this is currently an elementary application, I found this tool very useful for defining and verifying the semantics/transformations of our compiler, and I look forward to exploring how we can extend it to cover more aspects of our project!

j2kun · 2025-03-20T21:08:59Z

@Maokami That is fantastic! Do you think it would make sense to incorporate the lean proofs into this repo, so that we can try to keep them in sync with the actual pattern implementations? (Or if this is part of your active research, maybe you would want to wait to publish?)

In regards to future work: At HACS a few years ago @AlexanderViand-Intel and another Alex tried hacking together a formal verification of Halevi-Shoup packing, and I feel like that (formally-verified the linalg kernels) would be a super valuable way to incorporate formal verification into HEIR. IIRC, the limitation with their demo was that z3 didn't have a good language for modeling vectors (e.g., a cyclic rotate), and so they had to bit-blast everything and it blew up the verification time.

I wonder if lean would handle that much easier. Given all the work that has gone into the linear algebra part of lean's mathlib, it seems like it should be able to handle simple vector operations in a dimension-independent manner.

Maokami · 2025-03-21T10:02:58Z

Thanks! :)

it would make sense to incorporate the lean proofs into this repo, so that we can try to keep them in sync with the actual pattern implementations

That sounds great, and I’m fully in favor of incorporating these Lean proofs into the repo. At this stage, this is more of a personal exploration driven by my interest in formal verification rather than an active research project, so I’d be happy to integrate them.

And do you have any thoughts on how we might structure this integration? One idea I’ve been thinking about is that, in the long term, it would be great to provide a translation validation tool (similar to how Alive2 works for LLVM). This would enable easy verification whenever someone develops a new compiler pass. (For instance, an alive proof in llvm/llvm-project#129947)

Also, thanks a lot for mentioning that future work! It sounds interesting, and I’d love to explore it further.
If you happen to have any pointers or recommended resources, I’d love to check them out!

j2kun · 2025-03-24T04:53:37Z

TIL about alive.

I think before figuring out an integration I need to understand a bit more about how the overall system is intended to function. Is it a completely parallel spec to the HEIR implementation, defining ops and semantics and the patterns in Lean? Or maybe the input-output examples are generated by running HEIR passes, and then the Lean proofs assert the results are equivalent according to the Lean-defined semantics?

If the latter, I could see us running the proofs against test cases in the repository as part of CI, using the actual HEIR passes to produce the input-output pairs to verify. I think the most robust integration would somehow tie the proof system to the implementation, and without having the rewrite patterns in lean be derived from the HEIR codebase, I don't see another easy way to do that. (By the way, do you know if there is a plan for lean-mlir to derive patterns from, say, tablegen or PDLL? I'd be happy to rewrite tablegen-based patterns in HEIR to PDLL if that would help)

I also like the idea of having proofs as part of an issue when creating a new pattern. Maybe lean-mlir.grosser.es could take the place of the alive2.llvm.org page as a first step in that direction?

AlexanderViand-Intel · 2025-03-24T06:21:42Z

This is so cool!! Thanks @Maokami!

I'm currently at HACS and will be trying to investigate with Tobias Grosser/some of the lean folks here how we could go about automating the extraction of rewrite rules into Lean-MLIR PeepholeRewrites, as a step towards getting lean-mlir integrated into CI, similar to Alive. Unfortunately, I won't be able to attend the HACS Hack day as it collides with the FHE.org event, but we can always do a "virtual hack day" with the lean-mlir folks some time after HACS/FHE/RWC.

AlexanderViand-Intel · 2025-03-24T09:20:44Z

Ok, so the path forward for automatic extraction does indeed seem to be to express the Dialect in IRDL (upstream now has IRDL to C++) and the rewrite patterns in PDL

Maokami · 2025-03-24T14:29:53Z

@j2kun @AlexanderViand-Intel

Is it a completely parallel spec to the HEIR implementation, defining ops and semantics and the patterns in Lean? Or maybe the input-output examples are generated by running HEIR passes, and then the Lean proofs assert the results are equivalent according to the Lean-defined semantics?

It's the latter. Specifically, in example.lean, each MLIR program (test_*_RHS) is generated by applying heir-opt to test_*_LHS. I manually proved in Lean that these input-output pairs have equivalent semantics. More precisely, by extending and utilizing lean-mlir, I encoded the denotational semantics of operations and types from the mod_arith dialect into Lean's Zp mathematical structure. Using this encoding, I manually verified the equivalence Semantics{test_LHS} = Semantics{test_RHS} within Lean.

I could see us running the proofs against test cases in the repository as part of CI, using the actual HEIR passes to produce the input-output pairs to verify.

I completely agree. Currently, manually writing these proofs in Lean and running them against the test programs within the CI seems to be the best approach.

I think the most robust integration would somehow tie the proof system to the implementation, and without having the rewrite patterns in Lean be derived from the HEIR codebase, I don't see another easy way to do that. (By the way, do you know if there is a plan for lean-mlir to derive patterns from, say, tablegen or PDLL? I'd be happy to rewrite tablegen-based patterns in HEIR to PDLL if that would help.)

I wasn't aware of any plans to derive patterns into lean-mlir, but the automated extraction of rewrite rules into lean-mlir PeepholeRewrites, as mentioned by @AlexanderViand-Intel, seems promising for addressing this issue. I'm also highly interested in this topic and would gladly contribute if there's an opportunity! :)

I also like the idea of having proofs as part of an issue when creating a new pattern. Maybe lean-mlir.grosser.es could take the place of the alive2.llvm.org page as a first step in that direction?

Yes, that would be great! To make it happen, we’ll need to get in touch with Professor Grosser to see if we can extend the site to support custom dialect Lean semantics.

AlexanderViand-Intel · 2025-03-24T16:20:30Z

@Maokami I just showed @tobiasgrosser your ModArith lean mlir code and he said "Yes! This is amazing" so I think the next logical step would be to open up a PR against the lean-mlir repo to get this reviewed by someone from his group?

Tobias and I are also continuing to brainstorm on how to do automated extraction/CI stuff. I don't think we need a parallel definiton of the ops and rewrite rules, we just need to switch from using TableGen (ODS/DRR) as our "source of truth" in HEIR to actually using IRDL/PDL to do the defintions (because that's much easier to extract from then TableGen stuff)

tobiasgrosser · 2025-03-24T16:21:13Z

Yes, please pr this. 🙏

Hello lean-mlir team, I'm very interested in translation validation/compiler verification techniques for MLIR-based compilers, and I found your work truly impressive and valuable—thank you for sharing it! Recently, I used this project to validate a canonicalization pass I contributed to the `ModArith` dialect (which corresponds to arithmetic over Zp) in the [heir project](google/heir#1506). This PR contains the result of that effort. I’d appreciate it if you could take a look and review it. Most of the implementation was straightforward by referring to `SSA/Projects/FullyHomomorphicEncryption`. For more context, I’ve shared a detailed discussion here: google/heir#817 (comment) Thank you!

AlexanderViand-Intel added the research synthesis Reading papers to figure out which ideas can be incorporated label Jul 23, 2024

AlexanderViand-Intel changed the title ~~Investigate if/how "Verifying Peephole Rewriting In SSA Compiler IRs" can be integrated into our CI~~ Investigate if/how "Verifying Peephole Rewriting In SSA Compiler IRs" can be used in HEIR Jul 23, 2024

AlexanderViand-Intel mentioned this issue Oct 9, 2024

mod_arith.reduce: require modulus width to be smaller than underlying type width #998

Merged

ZenithalHourlyRate mentioned this issue Oct 11, 2024

Fix tensor type for polynomial-mul-to-ntt #1037

Merged

Maokami mentioned this issue Mar 3, 2025

Add canonicalization patterns and constant foldings for ModArith binary ops #1506

Merged

AlexanderViand-Intel mentioned this issue Mar 24, 2025

Investigate feasibility of migrating from ODS to IRDL and DRR to PDL. #1616

Open

Maokami mentioned this issue Mar 25, 2025

feat: Add ModArith dialect opencompl/lean-mlir#1082

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Investigate if/how "Verifying Peephole Rewriting In SSA Compiler IRs" can be used in HEIR #817

Investigate if/how "Verifying Peephole Rewriting In SSA Compiler IRs" can be used in HEIR #817

AlexanderViand-Intel commented Jul 23, 2024 •

edited

Loading

AlexanderViand-Intel commented Sep 9, 2024 •

edited

Loading

Maokami commented Mar 19, 2025

j2kun commented Mar 20, 2025

Maokami commented Mar 21, 2025

j2kun commented Mar 24, 2025

AlexanderViand-Intel commented Mar 24, 2025

AlexanderViand-Intel commented Mar 24, 2025 •

edited

Loading

Maokami commented Mar 24, 2025 •

edited

Loading

AlexanderViand-Intel commented Mar 24, 2025

tobiasgrosser commented Mar 24, 2025

Investigate if/how "Verifying Peephole Rewriting In SSA Compiler IRs" can be used in HEIR #817

Investigate if/how "Verifying Peephole Rewriting In SSA Compiler IRs" can be used in HEIR #817

Comments

AlexanderViand-Intel commented Jul 23, 2024 • edited Loading

Footnotes

AlexanderViand-Intel commented Sep 9, 2024 • edited Loading

Maokami commented Mar 19, 2025

j2kun commented Mar 20, 2025

Maokami commented Mar 21, 2025

j2kun commented Mar 24, 2025

AlexanderViand-Intel commented Mar 24, 2025

AlexanderViand-Intel commented Mar 24, 2025 • edited Loading

Maokami commented Mar 24, 2025 • edited Loading

AlexanderViand-Intel commented Mar 24, 2025

tobiasgrosser commented Mar 24, 2025

AlexanderViand-Intel commented Jul 23, 2024 •

edited

Loading

AlexanderViand-Intel commented Sep 9, 2024 •

edited

Loading

AlexanderViand-Intel commented Mar 24, 2025 •

edited

Loading

Maokami commented Mar 24, 2025 •

edited

Loading