Random CI failures #903

woodsp-ibm · 2025-03-17T18:03:44Z

CI had a seemingly random failure in #902, where re-running the failed job passed. Looking at the nightly CI Actions, I see a few failures, such as those below. The 3rd item below is what failed in #902.

It would be good to investigate whether the code can be improved so that the possibility of failure is understood, and to update the unit test and/or the mainline code to avoid these failures.

test.gradients.test_estimator_gradient.TestEstimatorGradientV2.test_gradient_u_2_LinCombEstimatorGradient
---------------------------------------------------------------------------------------------------------

Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):

      File "/opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/ddt.py", line 221, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^

      File "/home/runner/work/qiskit-machine-learning/qiskit-machine-learning/test/gradients/test_estimator_gradient.py", line 540, in test_gradient_u
    self.assertAlmostEqual(value, correct_results[i][j], 1)

      File "/opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/unittest/case.py", line 939, in assertAlmostEqual
    raise self.failureException(msg)

    AssertionError: np.float64(0.05078125) != 0.0 within 1 places (np.float64(0.05078125) difference)

test.algorithms.inference.test_qbayesian.TestQBayesianInference.test_trivial_circuit_V2
---------------------------------------------------------------------------------------

Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):

      File "D:\a\qiskit-machine-learning\qiskit-machine-learning\test\algorithms\inference\test_qbayesian.py", line 243, in test_trivial_circuit_V2
    self.assertTrue(

      File "C:\hostedtoolcache\windows\Python\3.12.9\x64\Lib\unittest\case.py", line 727, in assertTrue
    raise self.failureException(msg)

    AssertionError: np.False_ is not true

test.gradients.test_estimator_gradient.TestEstimatorGradientV2.test_gradient_parameter_coefficient_2_LinCombEstimatorGradient
-----------------------------------------------------------------------------------------------------------------------------

Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):

      File "/opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/ddt.py", line 221, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^

      File "/home/runner/work/qiskit-machine-learning/qiskit-machine-learning/test/gradients/test_estimator_gradient.py", line 614, in test_gradient_parameter_coefficient
    np.testing.assert_allclose(gradients, correct_results[i], atol=1e-1, rtol=1e-1)

      File "/opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/numpy/testing/_private/utils.py", line 1691, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),

      File "/opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^

      File "/opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/numpy/testing/_private/utils.py", line 889, in assert_array_compare
    raise AssertionError(msg)

    AssertionError: 
Not equal to tolerance rtol=0.1, atol=0.1

Mismatched elements: 1 / 4 (25%)
Max absolute difference among violations: 0.19980126
Max relative difference among violations: 0.27495638
 ACTUAL: array([-0.926467, -0.528749, -0.017578, -0.92627 ])
 DESIRED: array([-0.726665, -0.490513, -0.006861, -0.922888])


edoaltamura · 2025-03-17T22:32:39Z

Could this be because GenericBackendV2 has non-zero noise by default, which makes the results stochastic?
https://github.com/Qiskit/qiskit/blob/3da3d5ad37dd0b255ab83582d43ff7ee33fa9edf/qiskit/providers/fake_provider/generic_backend_v2.py#L667
@OkuyanBoga and I saw something like this before, specifically when using Aer primitives, which prompted us to relax the tolerance in some unit tests with numerical comparisons.
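For reference, the two tolerance checks that tripped in the tracebacks above can be reproduced by hand. This is a minimal sketch using only the standard library: `within_places` and `within_allclose` are hypothetical helper names that mirror the documented comparison rules of `assertAlmostEqual` (round the absolute difference to `places` decimals and compare with zero) and `assert_allclose` (`|actual - desired| <= atol + rtol * |desired|`), and the numbers are copied from the logs in this issue.

```python
def within_places(value, expected, places):
    """Mirror unittest's assertAlmostEqual rule: round(|diff|, places) == 0."""
    return round(abs(value - expected), places) == 0


def within_allclose(actual, desired, atol, rtol):
    """Mirror the assert_allclose rule element-wise, without importing numpy."""
    return [abs(a - d) <= atol + rtol * abs(d) for a, d in zip(actual, desired)]


# Values copied from the first and third tracebacks in this issue.
first_check = within_places(0.05078125, 0.0, places=1)

actual = [-0.926467, -0.528749, -0.017578, -0.92627]
desired = [-0.726665, -0.490513, -0.006861, -0.922888]
third_check = within_allclose(actual, desired, atol=1e-1, rtol=1e-1)

print(first_check)  # False
print(third_check)  # [False, True, True, True]
```

With `places=1` the effective threshold is 0.05, so the first failure (0.0508) missed it only narrowly, while the third failure exceeded its allowed 0.173 on a single element out of four.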

RishiNandha · 2025-03-18T12:19:09Z

The failures that you've attached seem to repeat sometimes. The first one, test_gradient_u_2_LinCombEstimatorGradient, seems to have occurred again in #902 just now, in the macOS 3.9 tests.

woodsp-ibm · 2025-03-18T15:40:06Z

> Could this be because GenericBackendV2 has non-zero noise by default

I looked at EstimatorV2 from the Runtime. It has an EstimatorOptions with a seed_estimator, and it also contains a SimulatorOptions that allows the noise_model to be set in local testing mode (e.g. it can be None), along with a seed_simulator. It would seem that these should be used in testing for reproducibility. The test I looked at did not set them, so setting them would be worth a try. Looping the test locally should allow the failure to be observed. With the seeds set, the test should hopefully be reproducible and pass each time, although the expected outcome may need adjusting unless suitable seeds can be found.
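The looping idea can be illustrated without Qiskit at all. The sketch below is only a toy model: `estimate_expectation` and `count_failures` are hypothetical names, the estimator is a plain average of simulated ±1 outcomes rather than a real primitive, and the 1024 shots and 0.05 tolerance are assumptions chosen for illustration. It shows that an unseeded shot-based estimate misses a tight tolerance in some fraction of runs, whereas a fixed seed makes every run give the same answer.

```python
import random


def estimate_expectation(true_value, shots, rng):
    """Hypothetical stand-in for a shot-based estimator: mean of +/-1 outcomes."""
    p_plus = (1 + true_value) / 2
    total = sum(1 if rng.random() < p_plus else -1 for _ in range(shots))
    return total / shots


def count_failures(runs, shots, tol, seed=None):
    """Repeat the 'test' and count how often the estimate misses the tolerance."""
    failures = 0
    for _ in range(runs):
        # seed=None draws fresh entropy each run; a fixed seed repeats the stream.
        rng = random.Random(seed)
        if abs(estimate_expectation(0.0, shots, rng)) >= tol:
            failures += 1
    return failures


RUNS = 500
unseeded = count_failures(RUNS, shots=1024, tol=0.05)
seeded = count_failures(RUNS, shots=1024, tol=0.05, seed=123)

print(f"unseeded failures: {unseeded} of {RUNS}")
print(f"seeded failures:   {seeded} of {RUNS}")
```

In the seeded case the count is either 0 or `RUNS`, never in between, which is why a seed that happens to fail would need either a different seed or an adjusted expected value, as noted above.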

edoaltamura added help wanted Extra attention is needed priority: medium labels Mar 17, 2025

edoaltamura mentioned this issue Mar 18, 2025

Adhoc now supports n>3, and new arguments #902

Merged

edoaltamura added this to the v.0.9.0 milestone Mar 24, 2025
