Parallelize collapse method for lightning.qubit with OpenMP #962

Open
tomlqc opened this issue Oct 24, 2024 · 10 comments

@tomlqc (Contributor) commented Oct 24, 2024

Important Note

⚠️ This issue is part of an internal assignment and not meant for external contributors

Context

Recently, Mid-Circuit Measurement (MCM) support was added to the lightning.qubit backend. The performance of the collapse method is very important for MCM support. Parallelizing the collapse method with OpenMP would boost the performance of MCM simulations using lightning.qubit.

Requirements

OpenMP support for the collapse method has to be implemented in the lightning.qubit C++ backend. For more information about the collapse method, please refer to the PennyLane-Lightning source code.

  • Parallelize the collapse method with OpenMP in pennylane_lightning/core/src/simulators/lightning_qubit/StateVectorLQubit.hpp (a rough sketch of the kind of change intended follows this list).
  • The new implementation should allow users to choose whether the collapse method is built with OpenMP support or not.
  • Create a pull request in the PennyLane Lightning repository and make sure all the steps outlined in the PR template are completed.
  • Benchmark the new C++ implementation (as a function of the number of qubits and the number of OpenMP threads) and upload the results to the pull request for further discussion.
  • Mark the PR ready for review.
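
For illustration only, here is a rough standalone sketch of the kind of change intended, not the actual StateVectorLQubit::collapse implementation. It assumes a contiguous state vector in which wire 0 maps to the most significant bit of the basis-state index; it zeroes the amplitudes of the discarded branch and then renormalizes. When OpenMP is not enabled at compile time the pragmas are simply ignored and the loops run serially, which is one way to address the optional-OpenMP requirement; how the build toggle is exposed (e.g. a CMake option) is left to the PR.

```cpp
// Rough sketch only (hypothetical helper, not the actual method in
// StateVectorLQubit.hpp). Wire 0 is assumed to map to the most significant
// bit of the basis-state index.
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

template <class PrecisionT>
void collapse_sketch(std::vector<std::complex<PrecisionT>> &state,
                     std::size_t num_qubits, std::size_t wire, bool branch) {
    const std::size_t size = std::size_t{1} << num_qubits;
    const std::size_t stride = std::size_t{1} << (num_qubits - 1 - wire);
    // Sections of `stride` contiguous amplitudes alternate between the
    // branch==0 and branch==1 outcomes; zero the sections of the branch
    // that was NOT obtained.
    const std::size_t discarded = branch ? 0 : 1;

    // The OpenMP collapse(2) clause fuses both loops so there is enough
    // parallelism even when the outer loop is short (small wire index).
    // Without -fopenmp the pragmas are ignored and the code stays serial.
#pragma omp parallel for collapse(2)
    for (std::size_t block = 0; block < size / (2 * stride); block++) {
        for (std::size_t i = 0; i < stride; i++) {
            state[(2 * block + discarded) * stride + i] =
                std::complex<PrecisionT>{};
        }
    }

    // Renormalize the surviving amplitudes.
    PrecisionT norm2{0};
#pragma omp parallel for reduction(+ : norm2)
    for (std::size_t i = 0; i < size; i++) {
        norm2 += std::norm(state[i]);
    }
    const PrecisionT inv_norm = PrecisionT{1} / std::sqrt(norm2);
#pragma omp parallel for
    for (std::size_t i = 0; i < size; i++) {
        state[i] *= inv_norm;
    }
}
```

In a real implementation you would likely also add a threshold (e.g. an if clause on the pragma) so that small state vectors don't pay the threading overhead.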

Don't hesitate to ask for clarification or raise any concerns regarding the issue. We'll be happy to discuss it with you!

@xiaohanzai commented Nov 1, 2024

Hi Thomas, for the benchmarking, should I just take the collapse function out and run it on my own laptop? I guess I don't have to install the whole package, right? What's the possible range of values for Num_Qubits?

@tomlqc (Contributor, Author) commented Nov 1, 2024

Hi Xiaohan, yes, you may just benchmark collapse() separately, and please share with us how you did it. The upper limit for the number of qubits will be set by your laptop's memory; you can start with 10. And don't forget to compare different numbers of threads 🙂
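
For example (just a sketch; the file and function names are illustrative, not from the repo), a standalone driver along these lines would do, assuming you paste in the collapse routine you extracted, or the collapse_sketch from the issue description above:

```cpp
// Hypothetical timing harness for a standalone collapse() implementation.
// Build (with OpenMP):  g++ -O3 -fopenmp bench_collapse.cpp -o bench_collapse
// Run:                  OMP_NUM_THREADS=8 ./bench_collapse 24
#include <chrono>
#include <complex>
#include <cstdlib>
#include <iostream>
#include <random>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Paste the collapse implementation under test here, e.g. the collapse_sketch
// template from the issue description or the routine extracted from
// StateVectorLQubit.hpp.

int main(int argc, char **argv) {
    const std::size_t num_qubits =
        (argc > 1) ? std::strtoul(argv[1], nullptr, 10) : 20;
    const std::size_t dim = std::size_t{1} << num_qubits;

    // Random (unnormalized) state; collapse renormalizes anyway.
    std::mt19937_64 gen(42);
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    std::vector<std::complex<double>> state(dim);
    for (auto &a : state) {
        a = {dist(gen), dist(gen)};
    }

    const auto t0 = std::chrono::steady_clock::now();
    collapse_sketch(state, num_qubits, num_qubits / 2, /*branch=*/false);
    const auto t1 = std::chrono::steady_clock::now();

    int threads = 1;
#ifdef _OPENMP
    threads = omp_get_max_threads();
#endif
    std::cout << "num_qubits=" << num_qubits << " threads=" << threads
              << " time_ms="
              << std::chrono::duration<double, std::milli>(t1 - t0).count()
              << std::endl;
    return 0;
}
```

Averaging over a few repetitions and sweeping OMP_NUM_THREADS from 1 up to the number of physical cores will give the scaling curves we're after.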

@xiaohanzai commented:

Thanks! And I guess wire can be anything that satisfies (1+wire) < getNumQubits()?

@xiaohanzai commented:

Hi Thomas, looks like I can't push my changes. Should I be added as a collaborator first?

@tomlqc (Contributor, Author) commented Nov 6, 2024

Hi @xiaohanzai, you will have to create a fork of the repository that you can push to, and then you can create a PR to merge your branch into PennyLaneAI:master.

@xiaohanzai commented:

Hi Thomas, a few more questions before I submit the pull request...

  1. How many threads do you usually use, and do you expect to see good performance with a lot of threads? I tested with the number of qubits ranging from 20 to 30 on a cluster at UofT, and there isn't much improvement in performance beyond 8 threads. I'm considering cache misses, false sharing, etc., but I'm not sure whether I'm expected to make things work perfectly with large numbers of threads.

  2. Do you expect to see good performance on a laptop? On my Mac the scaling actually seems pretty bad, but on the UofT cluster it is a lot better.

  3. For the testing mentioned in the PR, do I need to modify any of the .py files in tests/lightning_qubits to test the implementation?

@maliasadi (Member) commented:

Hi @xiaohanzai, thank you for your clear and detailed communication!

  1. There's no need for additional effort in optimizing this specifically for HPC machines. The number of threads required to achieve optimal performance generally depends on several factors, particularly the complexity of your example and the number of physical threads available on your machine. We would be happy to review your pull request.
  2. There are differences in the gate kernels between macOS and Linux, so we expect some performance differences between the two platforms. For this project, sharing the results from your Mac laptop will be perfectly fine.
  3. Yes, please add unit tests to check the correctness of your changes in ./tests/test_measurements.py.

@xiaohanzai commented Nov 6, 2024

Hi @maliasadi, thank you so much for your reply! Sorry, I'm still quite confused about adding the tests. Should I add a test in test_measurements.py, or maybe a .cpp file under pennylane_lightning/core/src/simulators/lightning_qubit/tests/? I actually took the collapse function out and ran a scaling test on it individually, without compiling the whole PennyLane package, so I'm not sure whether I should put that file in the repo. Or should I just submit my scaling test code with the PR, rather than adding it as a test module?

Because I took the collapse function out for the scaling test instead of compiling the whole PennyLane package with a test, I'm getting quite confused about what I should do right now...
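
For concreteness, if a C++ test is what's wanted, I could imagine something roughly like the sketch below, though I'm only guessing at the style (the Catch2 header path and the collapse_sketch name are placeholders; I haven't looked closely at how the existing C++ tests are organized):

```cpp
// Hypothetical Catch2-style check; collapse_sketch stands in for the
// routine/method under test, and the header path depends on the Catch2
// version used in the repo.
#include <catch2/catch.hpp>

#include <cmath>
#include <complex>
#include <vector>

TEST_CASE("collapse zeroes the discarded branch and renormalizes",
          "[collapse]") {
    // |++> on 2 qubits: all four amplitudes equal to 1/2.
    std::vector<std::complex<double>> state(4, {0.5, 0.0});

    // Collapse wire 0 onto outcome 0: expect |0>|+> = (1/sqrt2, 1/sqrt2, 0, 0).
    collapse_sketch(state, /*num_qubits=*/2, /*wire=*/0, /*branch=*/false);

    const double inv_sqrt2 = 1.0 / std::sqrt(2.0);
    REQUIRE(std::abs(state[0] - std::complex<double>{inv_sqrt2, 0.0}) < 1e-12);
    REQUIRE(std::abs(state[1] - std::complex<double>{inv_sqrt2, 0.0}) < 1e-12);
    REQUIRE(std::abs(state[2]) < 1e-12);
    REQUIRE(std::abs(state[3]) < 1e-12);
}
```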

@maliasadi (Member) commented:

@xiaohanzai No need to update the Python tests for now! Please go ahead and create the pull request with your changes. Feel free to include any additional benchmark scripts and files in the PR. We'd be happy to review your code and continue the discussion there!

@xiaohanzai commented:

Thanks! I just created a pull request. There doesn't seem to be an option to mark it ready for review, though...
