
Faster random circuits #464


Merged · 6 commits merged into QuantumSavory:master on Feb 19, 2025

Conversation

@J-C-Q (Contributor) commented Jan 20, 2025

Motivation

Motivated by random circuits, generating a random Clifford operation can be a time-critical function. When trying to implement such circuits using QuantumClifford.jl I noticed many allocations from the random_clifford function. So I tried to implement a version of the random_clifford function that reuses memory, so that repeated generation of random gates doesn't allocate that much.

Implementation

This PR implements a RandDestabMemory type which stores all the memory needed by the existing random_destabilizer algorithm. An instance of this type can be passed to the random_destabilizer function to reuse the same memory across calls (a usage sketch is included below). The algorithm itself should be unchanged; however, this PR changes the code in some places to avoid allocations. Some of the functionality is separated into helper functions, but some loops remain in the main function, which arguably hurts readability. To avoid as many allocations as possible, this PR also rewrites a few other functions, like quantum_mallows, to reuse memory. Most notably, the calculation of the inverses of delta' and delta_p' no longer relies on external packages and instead exploits the structure of those matrices, which improves performance.
This PR also uses this new version of random_destabilizer in the random_brickwork_clifford_circuit function.
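
A minimal sketch of the intended usage pattern (the exact signature for passing the memory to random_destabilizer is illustrative here and may differ from the final API):

```julia
using QuantumClifford
using Random

n = 10
mem = QuantumClifford.RandDestabMemory(n)   # scratch memory, allocated once
rng = Random.default_rng()
for _ in 1:1000
    # assumed call form: the preallocated memory is reused on every draw
    d = random_destabilizer(rng, mem)
    # ... use d when building up the random circuit ...
end
```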

TODO

There is still some cleanup to be done, as well as documentation and testing. However, I thought I'd share this idea already to get feedback on whether you are interested in integrating it at all. I'm very open to criticism and ideas for improving my code, and happy to keep working on this.

Benchmark

Attached are some rudimentary benchmarks of the random_brickwork_clifford_circuit function, first without and then with the new version.

[Benchmark screenshots of random_brickwork_clifford_circuit, before and after the change]


codecov bot commented Jan 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.42%. Comparing base (0d13791) to head (3ee7299).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #464      +/-   ##
==========================================
+ Coverage   82.73%   83.42%   +0.68%     
==========================================
  Files          70       72       +2     
  Lines        4656     4801     +145     
==========================================
+ Hits         3852     4005     +153     
+ Misses        804      796       -8     


@J-C-Q (Contributor, Author) commented Jan 21, 2025

I think this is ready for review.

  • I haven't added a changelog entry yet, but can do so if you'd like.
  • Also, I'm not sure whether you want this documented in the random_destabilizer doc string or somewhere else.
  • The tests are very basic, just comparing the result to the existing implementation. Do you want me to add more, or is that enough? The existing tests seem to be passing.

Thank you for the positive reaction to this idea! :)

@J-C-Q J-C-Q marked this pull request as ready for review January 21, 2025 17:05
@Krastanov (Member) left a comment


This is fantastic, thank you!

I have left a few comments, mostly focusing on the custom functions you have implemented and checking whether they are necessary on the latest Julia.

If you can also just add a changelog and bump the version number, I should be able to merge and release this quite quickly.

            @test non_reuse_version == reuse_version
        end
    end
end
@Krastanov (Member):

Could we add another testset that uses something like @allocated or @allocations to keep track of potential regressions in performance? (See the sketch below.)
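
For illustration, a minimal sketch of what such a testset could look like (the RandDestabMemory name follows this PR, but the exact call signature and the allocation bound are assumptions):

```julia
using Test
using QuantumClifford

@testset "random_destabilizer allocation regression" begin
    mem = QuantumClifford.RandDestabMemory(10)
    random_destabilizer(mem)   # warm-up call so compilation is not measured (assumed signature)
    # small, arbitrary bound; tighten once the expected allocation count is known
    @test @allocations(random_destabilizer(mem)) <= 2
end
```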

@testset "Random sampling of operators memory reuse" begin
for n in [1, test_sizes..., 200, 500]
workingmemory = QuantumClifford.RandDestabMemory(n)
for _ in 1:100
@Krastanov (Member):

Now that this is very thoroughly tested, let's drop this down to for _ in 1:2 so the tests don't take too long.

src/randoms.jl Outdated
Comment on lines 218 to 224
# Allocation free mod.(x,n)
function _inplace_mod!(x::Matrix{Int8}, n::Integer)
    @inbounds for i in eachindex(x)
        x[i] = mod(x[i], n)
    end
    return x
end
@Krastanov (Member):

Could you confirm that x .= mod.(x, n) is actually worse than a new function? On newer Julias I get this:

julia> a = rand(Int8, 20,20);

julia> @time _inplace_mod!(a, 5);
  0.000001 seconds

julia> a = rand(Int8, 200,200);

julia> @time _inplace_mod!(a, 5);
  0.000054 seconds

julia> function f(a,n)
       a.=mod.(a,n)
       end;

julia> @time f(a, 5);
  0.017539 seconds (13.95 k allocations: 738.812 KiB, 99.65% compilation time)

julia> @time f(a, 5);
  0.000056 seconds

Both seem equally fast and non-allocating. I would prefer we just use broadcast instead of a new _inplace_mod! if there is indeed no difference in performance.

src/randoms.jl Outdated
Comment on lines 226 to 231
function _inplace_equals!(x::BitMatrix, y::Matrix{Int8}, n::Integer)
    @inbounds for i in eachindex(y)
        x[i] = y[i] == n
    end
    return x
end
@Krastanov (Member):

Same question about using broadcast here:

julia> a = trues(10,10);

julia> b = rand(Int8, 10, 10);

julia> @time a .= b.==5;
  0.047673 seconds (213.96 k allocations: 10.826 MiB, 99.95% compilation time)

julia> @time a .= b.==5;
  0.000004 seconds (1 allocation: 32 bytes)

julia> function _inplace_equals!(x::BitMatrix, y::Matrix{Int8}, n::Integer)
           @inbounds for i in eachindex(y)
               x[i] = y[i] == n
           end
           return x
       end;

julia> @time _inplace_equals!(a,b,5);
  0.000001 seconds

julia> f(a,b,n) = (a.=b.==n);

julia> @time f(a,b,5);
  0.032928 seconds (26.55 k allocations: 1.273 MiB, 99.97% compilation time)

julia> @time f(a,b,5);
  0.000003 seconds

src/randoms.jl Outdated
Comment on lines 245 to 253
function _mul!(C, A, B, n)
    for i in 1:n
        for j in 1:n
            for k in 1:n
                @inbounds C[i, j] += A[i, k] * B[k, j]
            end
        end
    end
end
@Krastanov (Member):

What about LinearAlgebra.mul! -- is it measurably worse? (See the sketch below.)
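
For reference, a sketch of what the suggested replacement amounts to: the five-argument LinearAlgebra.mul! accumulates into C in place (C = A*B + C), which matches the hand-written triple loop above (the sizes and element types here are only illustrative):

```julia
using LinearAlgebra

n = 50
A = rand(0:1, n, n)
B = rand(0:1, n, n)
C = zeros(Int, n, n)

# computes C .+= A*B in place, without allocating a temporary for A*B
mul!(C, A, B, true, true)
```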

src/randoms.jl Outdated
Comment on lines 429 to 446
function _quantum_mallows!(
    rng::AbstractRNG,
    hadamard::BitVector,
    perm::Vector{T},
    arr::Vector{T}) where {T<:Integer} # inplace version

    n = length(perm)
    for idx in 1:n
        m = n - idx + 1
        # sample h_i from given prob distribution
        l = sample_geometric_2(rng, 2 * m)
        weight = 2 * m - l
        hadamard[idx] = (weight < m)
        k = weight < m ? weight : 2 * m - weight - 1
        perm[idx] = _popat!(arr, k + 1)
    end
    return hadamard, perm
end
@Krastanov (Member):

To avoid code repetition (really, more to avoid the need to fix a bug in two separate places in the future), could you delete most of the content of quantum_mallows and have it internally just call your new in-place version, together with whatever objects need to be allocated? Roughly as sketched below.
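
For instance, something along these lines (a rough sketch only; the scratch objects and element types would need to match what quantum_mallows currently uses):

```julia
function quantum_mallows(rng::AbstractRNG, n::Integer)
    hadamard = falses(n)    # BitVector scratch for the Hadamard pattern
    perm = zeros(Int, n)    # permutation output
    arr = collect(1:n)      # indices still available to sample from
    return _quantum_mallows!(rng, hadamard, perm, arr)
end
```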

@J-C-Q (Contributor, Author) commented Feb 4, 2025

Thank you for the thoughtful feedback!

I've made the following changes based on your suggestions:

  • Removed the unnecessary _inplace_mod!, _inplace_equals!, and _mul! functions, as they were not faster than native broadcasting and the LinearAlgebra implementations.
  • Reduced the number of random tests and added a test to track allocations.
  • Updated quantum_mallows to call the in-place version.
  • Added a changelog entry and bumped the version number (hopefully, I did this correctly!).

Thanks again for your support—I really appreciate it!

@J-C-Q (Contributor, Author) commented Feb 4, 2025

Well, apparently there is a big difference in allocations between 1.10 and 1.11. Somehow LinearAlgebra.mul! allocates in 1.10 but doesn't in 1.11 (at least for this use case).

@Fe-r-oz (Collaborator) commented Feb 5, 2025

> Well, apparently there is a big difference in allocations between 1.10 and 1.11. Somehow LinearAlgebra.mul! allocates in 1.10 but doesn't in 1.11 (at least for this use case).

This Julia issue (@allocated false positive on v1.12?) points to false positives from @allocated. The diagnostic PR might be helpful: #467 (comment). There are some errors in the CI documentation build as well, which are also reproduced in the diagnostic PR #467. I think this started happening around the time Julia 1.11.3 was rolled out on CI here.

Edit:

The CI documentation errors all come from failures to run the @example blocks, possibly stemming from QCMakieExt:

Error: failed to run `@example` block in src/canonicalization.md:16-21

```@example
using QuantumClifford, CairoMakie
f=Figure()
stabilizerplot_axis(f[1,1], canonicalize!(random_stabilizer(20,30)))
f
```

Error:

ArgumentError: component type FixedPointNumbers.N0f8 is an 8-bit type representing 256 values from 0.0 to 1.0,
    but the values (0x001b9e77, 0x001b9e77, 0x001b9e77) do not lie within this range.
    See the READMEs for FixedPointNumbers and ColorTypes for more information.
Stacktrace:

The CI doc errors were due to this line:

colormap = Makie.cgrad([:lightgray,Makie.RGB(0x1b9e77),Makie.RGB(0xd95f02),Makie.RGB(0x7570b3)], 4, categorical = true),

The CI documentation errors are fixed in #473 :)

@Krastanov (Member):

As long as it is working fine on 1.11, I am not too worried about 1.10. In this library I feel comfortable relying on improvements in base Julia to get simpler code (admittedly a freedom that bigger foundational libraries cannot afford themselves).

Could you summarize whether there are any outstanding issues on 1.11?

@Krastanov (Member):

The allocation test failures are unrelated to this PR, right?

@J-C-Q (Contributor, Author) commented Feb 11, 2025

Alright, since LinearAlgebra.mul! does not allocate and is faster on 1.11, we should use it.
To test allocations as intended, I changed the new test to only run on 1.11 and upwards (see the version-gating sketch below), and it passes there. However, there are some other allocation tests that do not pass right now, but they are not related to this PR!
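
For reference, gating a testset on the Julia version can be done with an ordinary VERSION check, roughly like this (the testset name is illustrative):

```julia
using Test

if VERSION >= v"1.11"
    @testset "random_destabilizer memory reuse allocations" begin
        # allocation checks that rely on the non-allocating 1.11 behavior
        # of LinearAlgebra.mul! go here
    end
end
```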

@Krastanov (Member):

Oh, one last thing: in your new allocation tests, make sure you have called f1, f2, f3 once before running the allocation checks on them, so that they are already compiled (otherwise, depending on what the compiler does, you might end up measuring compilation instead of just f1, f2, f3). Something like the warm-up pattern sketched below.
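
A generic sketch of that warm-up pattern (f and its argument are placeholders here, not functions from this PR):

```julia
using Test

f(x) = sum(abs2, x)   # placeholder function under test
x = rand(100)

f(x)                            # call once so f is compiled before measuring
@test @allocated(f(x)) == 0     # now only runtime allocations are counted
```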

@Krastanov (Member) commented Feb 11, 2025

Wonderful! (well, not great about the new failures, but you are right, they are unrelated)

I will plan to merge this shortly and look into the new failures separately.

Thank you for the contributions, it is much appreciated!

@Krastanov Krastanov merged commit 0035c32 into QuantumSavory:master Feb 19, 2025
14 of 16 checks passed