Skip to content

consistently failing tests on some architectures, flaky on others #748

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sten0 opened this issue Dec 27, 2020 · 26 comments · Fixed by #788 or #877
Closed

consistently failing tests on some architectures, flaky on others #748

sten0 opened this issue Dec 27, 2020 · 26 comments · Fixed by #788 or #877
Assignees
Labels
Milestone

Comments

@sten0
Copy link
Contributor

sten0 commented Dec 27, 2020

On i386 the following fail: test_archive_mount, test_archive_extract, test_archive_diff, test_borg_prune, test_repo_unlink, test_repo_add_success, test_create, test_scheduler_create_backup

On armhf (32bit, used on the Raspberry pi 3, I think): same.

These tests also occasionally fail on Intel/AMD 64bit CI, and I've received release critical bugs about this issue (Debian bugtracker) for arm64 (used on the Raspberri pi ≥ 4, I think). Flakiness on arm64 seems like it will affect Borg on laptops using Apple M1 chip, along with future ARM laptops, and I also wonder if it's useful fuzzing for amd64 (aka: x86-64).

Vorta for Debian is currently based on 0.7.1, plus the following patches (mostly cherry picked from master):
https://salsa.debian.org/python-team/packages/vorta/-/tree/debian/master/debian/patches

I've also increased all timeouts to 10 (where they were previously << 10) or 20 (where they were previously ≥10) seconds. For the moment, this patch appears to have mitigated the rare nondeterministic test failures on amd64, and it remains to be seen whether it mitigates future failures on arm64.

The most interesting error output from tests.reproducible-builds.org is

ERROR    vorta.scheduler:scheduler.py:118 Conditions for backup not met. Aborting.
ERROR    vorta.scheduler:scheduler.py:119 Backup is already in progress.

and my primary concern is this might indicate the existence of some corner-case failures in Vorta's backup scheduling, possibly due to race conditions or missing locking conditions.

Logs include a list of the full stack of all installed dependencies, and the actual commands used to initiate tests.

https://tests.reproducible-builds.org/debian/rb-pkg/unstable/i386/vorta.html
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/armhf/vorta.html

and here are some logs from another one of our three CI networks:
https://ci.debian.net/data/autopkgtest/testing/arm64/v/vorta/8520699/log.gz
https://ci.debian.net/data/autopkgtest/testing/i386/v/vorta/8521042/log.gz

These logs were cited in Debian Bug #976134

It's of course also possible that this is a bug in another Python package, or in Qt, or in Borg. If I've made a mistake anywhere, please let me know, and I'll humbly accept responsibility and fix it :-)

Merry Christmas and Happy New Year!
Nicholas

@m3nu m3nu self-assigned this Dec 28, 2020
@m3nu m3nu added the type:task A TODO label Dec 28, 2020
@m3nu
Copy link
Contributor

m3nu commented Dec 28, 2020

Never tried the tests on none-AMD64, but there are a few tests where the Borg background thread will run longer than the test. Most of those are already fixed, but maybe not all. This wouldn't be relevant outside test environments.

@jk-advedes
Copy link

I can confirm the same behavior ("Conditions for backup not met. Aborting", "Backup is already in progress") on RaspberryPi v2.

  • It happens on scheduled tasks with small data content - not on larger tasks.
  • It doesn't fails when triggered manually.

Installed through FlatPak, Vorta version 0.7.1, Borg version 1.1.13.

Thanks so much for maintaining Vorta!

@m3nu
Copy link
Contributor

m3nu commented Jan 22, 2021

It's likely a race condition where another test's backup keeps running in the background. I need to address this case more fully. Just a bit hard to reproduce and test, as you say.

@sten0
Copy link
Contributor Author

sten0 commented Jan 28, 2021

@jk-advedes, just to confirm, are you reporting a bug in the Vorta desktop application, or in the test suite? It sounds to me like scheduled backups are failing to occur during real-life usage. If so, is more than one backup job scheduled? Is archive pruning also scheduled?

@m3nu, if I use Vorta's default test timeout values I can reliably trigger on multiple archs, albeit at a different rate per arch...which is to say, point me to a branch that has been instrumented and/or has profiling enabled and I'll get to work producing useful logs :-) On thing that surprises me is that test_archive_mount errors in two different ways:

tests/test_archives.py::test_archive_mount Qt exceptions in virtual methods:
________________________________________________________________________________
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/peewee.py", line 3129, in execute_sql
    cursor.execute(sql, params or ())
sqlite3.OperationalError: no such table: eventlogmodel

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/build/vorta-0.7.1/.pybuild/cpython3_3.9_vorta/build/vorta/borg/borg_thread.py", line 236, in run
    log_entry.save()
  File "/usr/lib/python3/dist-packages/peewee.py", line 6546, in save
    rows = self.update(**field_dict).where(self._pk_expr()).execute()
  File "/usr/lib/python3/dist-packages/peewee.py", line 1898, in inner
    return method(self, database, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/peewee.py", line 1969, in execute
    return self._execute(database)
  File "/usr/lib/python3/dist-packages/peewee.py", line 2465, in _execute
    cursor = database.execute(self)
  File "/usr/lib/python3/dist-packages/peewee.py", line 3142, in execute
    return self.execute_sql(sql, params, commit=commit)
  File "/usr/lib/python3/dist-packages/peewee.py", line 3136, in execute_sql
    self.commit()
  File "/usr/lib/python3/dist-packages/peewee.py", line 2902, in __exit__
    reraise(new_type, new_type(exc_value, *exc_args), traceback)
  File "/usr/lib/python3/dist-packages/peewee.py", line 185, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3/dist-packages/peewee.py", line 3129, in execute_sql
    cursor.execute(sql, params or ())
peewee.OperationalError: no such table: eventlogmodel
________________________________________________________________________________
ERROR

https://tests.reproducible-builds.org/debian/rb-pkg/bullseye/amd64/vorta.html

vs

tests/test_archives.py::test_archive_mount Qt exceptions in virtual methods:
________________________________________________________________________________
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/peewee.py", line 3129, in execute_sql
    cursor.execute(sql, params or ())
sqlite3.OperationalError: database is locked

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/build/1st/vorta-0.7.1/.pybuild/cpython3_3.9_vorta/build/vorta/borg/borg_thread.py", line 236, in run
    log_entry.save()
  File "/usr/lib/python3/dist-packages/peewee.py", line 6546, in save
    rows = self.update(**field_dict).where(self._pk_expr()).execute()
  File "/usr/lib/python3/dist-packages/peewee.py", line 1898, in inner
    return method(self, database, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/peewee.py", line 1969, in execute
    return self._execute(database)
  File "/usr/lib/python3/dist-packages/peewee.py", line 2465, in _execute
    cursor = database.execute(self)
  File "/usr/lib/python3/dist-packages/peewee.py", line 3142, in execute
    return self.execute_sql(sql, params, commit=commit)
  File "/usr/lib/python3/dist-packages/peewee.py", line 3136, in execute_sql
    self.commit()
  File "/usr/lib/python3/dist-packages/peewee.py", line 2902, in __exit__
    reraise(new_type, new_type(exc_value, *exc_args), traceback)
  File "/usr/lib/python3/dist-packages/peewee.py", line 185, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3/dist-packages/peewee.py", line 3129, in execute_sql
    cursor.execute(sql, params or ())
peewee.OperationalError: database is locked
________________________________________________________________________________
ERROR

https://tests.reproducible-builds.org/debian/rb-pkg/unstable/armhf/vorta.html

@m3nu
Copy link
Contributor

m3nu commented Jan 28, 2021

Yeah, I know those errors. I need to better isolate the different tests. Currently we empty and reuse the same DB. That's what's causing the error. I'll get to it soon.

@m3nu m3nu added this to the 0.7.3 (Next minor release) milestone Jan 31, 2021
@m3nu
Copy link
Contributor

m3nu commented Feb 11, 2021

Working on this now. More on the same issue: #632

@m3nu m3nu mentioned this issue Feb 11, 2021
@m3nu
Copy link
Contributor

m3nu commented Feb 11, 2021

Related: pytest-dev/pytest-qt#313

@m3nu
Copy link
Contributor

m3nu commented Feb 11, 2021

Did a few things to make tests more reliable and can't trigger one of those DB errors on Github Actions any more.

Let me know how it is for other archs, @sten0

PR: #788

@sten0
Copy link
Contributor Author

sten0 commented Feb 17, 2021

@m3nu, please reopen this issue; it's unfortunately not fixed. Both 0.7.2 and 52233a9 contain the following regression compared to 0.7.1:

=================================== FAILURES ===================================
_____________________________ test_archive_extract _____________________________

qapp = <vorta.application.VortaApp object at 0x7fa772488e50>
qtbot = <pytestqt.qtbot.QtBot object at 0x7fa77038c490>
mocker = <pytest_mock.MockFixture object at 0x7fa770392ee0>
borg_json_output = <function borg_json_output.<locals>._read_json at 0x7fa7703bd160>

    def test_archive_extract(qapp, qtbot, mocker, borg_json_output):
        main = qapp.main_window
        tab = main.archiveTab
        main.tabWidget.setCurrentIndex(3)
    
        tab.populate_from_profile()
        qtbot.waitUntil(lambda: tab.archiveTable.rowCount() == 2)
    
        tab.archiveTable.selectRow(0)
        stdout, stderr = borg_json_output('list_archive')
        popen_result = mocker.MagicMock(stdout=stdout, stderr=stderr, returncode=0)
        mocker.patch.object(vorta.borg.borg_thread, 'Popen', return_value=popen_result)
        qtbot.mouseClick(tab.extractButton, QtCore.Qt.LeftButton)
    
        qtbot.waitUntil(lambda: hasattr(tab, '_window'), **pytest._wait_defaults)
>       qtbot.waitUntil(lambda: tab._window == qapp.activeWindow(), **pytest._wait_defaults)
E       AssertionError: waitUntil timed out in 20000 miliseconds

tests/test_archives.py:128: AssertionError
----------------------------- Captured Qt messages -----------------------------
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
______________________________ test_archive_diff_______________________________

qapp = <vorta.application.VortaApp object at 0x7fa772488e50>
qtbot = <pytestqt.qtbot.QtBot object at 0x7fa7703a9b50>
mocker = <pytest_mock.MockFixture object at 0x7fa770366640>
borg_json_output = <function borg_json_output.<locals>._read_json at 0x7fa7683eb670>

    def test_archive_diff(qapp, qtbot, mocker, borg_json_output):
        main = qapp.main_window
        tab = main.archiveTab
        main.tabWidget.setCurrentIndex(3)
    
        tab.populate_from_profile()
        qtbot.waitUntil(lambda: tab.archiveTable.rowCount() == 2)
    
        stdout, stderr = borg_json_output('diff_archives')
        popen_result = mocker.MagicMock(stdout=stdout, stderr=stderr, returncode=0)
        mocker.patch.object(vorta.borg.borg_thread, 'Popen', return_value=popen_result)
    
        qtbot.mouseClick(tab.diffButton, QtCore.Qt.LeftButton)
        qtbot.waitUntil(lambda: hasattr(tab, '_window'), **pytest._wait_defaults)
>       qtbot.waitUntil(lambda: tab._window == qapp.activeWindow(), **pytest._wait_defaults)
E       AssertionError: waitUntil timed out in 20000 miliseconds

tests/test_archives.py:169: AssertionError
----------------------------- Captured Qt messages -----------------------------
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
    QtWarningMsg: QImage::scaleHeight: Image is a null image
=========================== short test summary info ============================
FAILED tests/test_archives.py::test_archive_extract - AssertionError: waitUnt...
FAILED tests/test_archives.py::test_archive_diff - AssertionError: waitUntil ...
=================== 2 failed, 49 passed, 2 skipped in 52.15s ===================

Full build log, including list of all installed dependencies available here:
vorta_0.7.2+15.g52233a9-1_amd64-2021-02-17T04:40:47Z.build.txt

Debian 11 (bullseye) was frozen 12 Feb 2021, and one day was not enough time react to #788. Debian 11 with currently ship with Vorta 0.7.1 due to this regression, and that release will be supported until somewhere around mid to late 2024, although thankfully most users will upgrade in 2023. The window for applying for an exception to the freeze for one Vorta version upgrade will definitely close a week from now (24 Feb), so if that's something you're interested in, please fix whatever is causing test_archive_extract and test_archive_diff to fail. I can help with bisecting over the weekend, but waiting for me will run the risk of cutting things too close.

@m3nu
Copy link
Contributor

m3nu commented Feb 17, 2021

Maybe qtbot.mouseClick() doesn't work on some platforms? I started calling the functions directly.

@m3nu
Copy link
Contributor

m3nu commented Feb 17, 2021

Removed the mouseClick function for those tests in this branch: https://github.com/m3nu/vorta/tree/issue/570/rename-archive

What's the window manager used while testing anyways? Can I reproduce it somehow? For Github Actions we use "Herbstluft".

@m3nu
Copy link
Contributor

m3nu commented Feb 17, 2021

Both of the failing tests above, open a new window. So I would suspect an issue with the window manager used for testing. E.g. if the new window is opened in the background. Is Debian's CI only using xvfb or also a window manager?

The testing script we use for GH Actions may be useful: https://github.com/borgbase/vorta/blob/master/.github/workflows/test.yml#L40

That's also the reason why mouseClick doesn't work on new windows. They are not visible or something. So those tests only work by calling the function directly.

I'll try to refactor the tests and run them with xvfb only.

@m3nu
Copy link
Contributor

m3nu commented Feb 17, 2021

I've removed the dependency on a window manager and we now use almost the same command as Debian to run Linux tests. Hope this finally helps, @sten0. In this branch: https://github.com/m3nu/vorta/tree/issue/570/rename-archive

@jk-advedes
Copy link

Gentlemen,
I need to apologize - I found this page by Google when debugging my failing backups. My issues have nothing to do with test suite.

@m3nu
Copy link
Contributor

m3nu commented Feb 18, 2021

The latest test fixes (remove requirement for window manager) are now merged into master.

Will keep this open for now.

@m3nu
Copy link
Contributor

m3nu commented Feb 18, 2021

Will do a new release now, since many pending PRs were merged. This is to get some more real-world testing. From my experience, very few users try the master branch. So it's better to make a release and then another one with fixes. (Don't tell anyone 😄)

Also includes many test fixes.

@sten0
Copy link
Contributor Author

sten0 commented Feb 18, 2021 via email

@sten0
Copy link
Contributor Author

sten0 commented Feb 18, 2021 via email

@m3nu
Copy link
Contributor

m3nu commented Feb 18, 2021

Was Thomas Waldmann consulted about the safety of the addition of the lock-breaking feature?

Yes, he was very helpful in making the dialog more scary: #867 😄

Other than that, wow, 0.7.3 looks like a nice improvement all around

Thanks for the nice feedback. It includes a considerable backlog of PRs that were lined up. Mostly by @samuel-w . So credit to him as well.

and definitely the one we should be shipping in Debian 11 (bullseye)!

We won't add large features over the next week. Just bug fixes, if they should get reported.

@sten0
Copy link
Contributor Author

sten0 commented Feb 18, 2021 via email

@sten0
Copy link
Contributor Author

sten0 commented Feb 18, 2021 via email

@sten0
Copy link
Contributor Author

sten0 commented Feb 20, 2021

@m3nu The results on in for 0.7.3: still failing on armhf, still failing on i386, and while tests pass on amd64, they don't always pass cleanly:

...
======================== 52 passed, 2 skipped in 14.47s ========================
Fatal Python error: Segmentation fault

Thread 0x00007f6b95ffb700 (most recent call first):
  File "/build/2/vorta-0.7.3/2nd/.pybuild/cpython3_3.9_vorta/build/vorta/borg/borg_thread.py", line 261 in run

Thread 0x00007f6b95ffb700 (most recent call first):
  File "/build/2/vorta-0.7.3/2nd/.pybuild/cpython3_3.9_vorta/build/vorta/borg/borg_thread.py", line 261 in run

Thread 0x00007f6b95ffb700 (most recent call first):
  File "/build/2/vorta-0.7.3/2nd/.pybuild/cpython3_3.9_vorta/build/vorta/borg/borg_thread.py", line 261 in run

Current thread 0x00007f6c76273740 (most recent call first):
<no Python frame>
E: pybuild pybuild:353: test: plugin distutils failed with: exit code=-11: cd /build/2/vorta-0.7.3/2nd/.pybuild/cpython3_3.9_vorta/build; python3.9 -m pytest tests

https://tests.reproducible-builds.org/debian/logs/unstable/amd64/vorta_0.7.3-1.build2.log.gz
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/vorta.html

  • you can get results for all architectures from this page

I seem to remember you mentioning that you were considering cleaning up the many Qt objects in another issue? Given this, and I suspect, yes, maybe that's actually needed. My hypothesis is that this segfault is related to those. Apologies for the poor debugging info, I haven't yet been able to reproduce this on my local machine, but was able to reproduce it twice on tests.reproducible-builds.org infrastructure. I'll see what I can do to get better debugging output!

@m3nu
Copy link
Contributor

m3nu commented Feb 21, 2021

That's mostly good news. All the previous test failures are resolved. 👍

This just leaves #456 , which I can already reproduce reliably.

The failure on ARM was caused by a disappearing file. This should be caught anyways. So I added an exception and wait time to the previous test: (PR : #877

________________________ ERROR at setup of test_keyring ________________________
SETUP ERROR: Qt exceptions in virtual methods:
________________________________________________________________________________
Traceback (most recent call last):
  File "/build/2/vorta-0.7.3/2nd/.pybuild/cpython3_3.9_vorta/build/vorta/utils.py", line 47, in run
    self.size, self.files_count = get_path_datasize(self.path)
  File "/build/2/vorta-0.7.3/2nd/.pybuild/cpython3_3.9_vorta/build/vorta/utils.py", line 90, in get_path_datasize
    data_size, files_count = get_directory_size(file_info.absoluteFilePath())
  File "/build/2/vorta-0.7.3/2nd/.pybuild/cpython3_3.9_vorta/build/vorta/utils.py", line 65, in get_directory_size
    stat = os.stat(file_path)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-pbuilder2/pytest-0/Vorta32/settings.sqlite-shm'

@m3nu
Copy link
Contributor

m3nu commented Feb 21, 2021

No luck with reproducing the error in my Debian VM in a LXC with autopkgtest. But I believe #877 addresses the segfault, as well as the other test error, @sten0.

@m3nu m3nu closed this as completed in #877 Feb 22, 2021
@m3nu m3nu reopened this Feb 22, 2021
@m3nu
Copy link
Contributor

m3nu commented Feb 22, 2021

This should be fixed in the current master branch and will be released as 0.7.4.

Would appreciate feedback is your reported bug here is addressed.

@m3nu
Copy link
Contributor

m3nu commented Mar 1, 2021

Should be fixed by #898. The last "hangs" I noticed on macOS were due to a bad mutex setup in BorgThread.

@m3nu m3nu closed this as completed Mar 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants