-
Notifications
You must be signed in to change notification settings - Fork 158
consistently failing tests on some architectures, flaky on others #748
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Never tried the tests on none-AMD64, but there are a few tests where the Borg background thread will run longer than the test. Most of those are already fixed, but maybe not all. This wouldn't be relevant outside test environments. |
I can confirm the same behavior ("Conditions for backup not met. Aborting", "Backup is already in progress") on RaspberryPi v2.
Installed through FlatPak, Vorta version 0.7.1, Borg version 1.1.13. Thanks so much for maintaining Vorta! |
It's likely a race condition where another test's backup keeps running in the background. I need to address this case more fully. Just a bit hard to reproduce and test, as you say. |
@jk-advedes, just to confirm, are you reporting a bug in the Vorta desktop application, or in the test suite? It sounds to me like scheduled backups are failing to occur during real-life usage. If so, is more than one backup job scheduled? Is archive pruning also scheduled? @m3nu, if I use Vorta's default test timeout values I can reliably trigger on multiple archs, albeit at a different rate per arch...which is to say, point me to a branch that has been instrumented and/or has profiling enabled and I'll get to work producing useful logs :-) On thing that surprises me is that
https://tests.reproducible-builds.org/debian/rb-pkg/bullseye/amd64/vorta.html vs
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/armhf/vorta.html |
Yeah, I know those errors. I need to better isolate the different tests. Currently we empty and reuse the same DB. That's what's causing the error. I'll get to it soon. |
Working on this now. More on the same issue: #632 |
Related: pytest-dev/pytest-qt#313 |
@m3nu, please reopen this issue; it's unfortunately not fixed. Both 0.7.2 and 52233a9 contain the following regression compared to 0.7.1:
Full build log, including list of all installed dependencies available here: Debian 11 (bullseye) was frozen 12 Feb 2021, and one day was not enough time react to #788. Debian 11 with currently ship with Vorta 0.7.1 due to this regression, and that release will be supported until somewhere around mid to late 2024, although thankfully most users will upgrade in 2023. The window for applying for an exception to the freeze for one Vorta version upgrade will definitely close a week from now (24 Feb), so if that's something you're interested in, please fix whatever is causing test_archive_extract and test_archive_diff to fail. I can help with bisecting over the weekend, but waiting for me will run the risk of cutting things too close. |
Maybe |
Removed the What's the window manager used while testing anyways? Can I reproduce it somehow? For Github Actions we use "Herbstluft". |
Both of the failing tests above, open a new window. So I would suspect an issue with the window manager used for testing. E.g. if the new window is opened in the background. Is Debian's CI only using xvfb or also a window manager? The testing script we use for GH Actions may be useful: https://github.com/borgbase/vorta/blob/master/.github/workflows/test.yml#L40 That's also the reason why I'll try to refactor the tests and run them with xvfb only. |
I've removed the dependency on a window manager and we now use almost the same command as Debian to run Linux tests. Hope this finally helps, @sten0. In this branch: https://github.com/m3nu/vorta/tree/issue/570/rename-archive |
Gentlemen, |
The latest test fixes (remove requirement for window manager) are now merged into Will keep this open for now. |
Will do a new release now, since many pending PRs were merged. This is to get some more real-world testing. From my experience, very few users try the master branch. So it's better to make a release and then another one with fixes. (Don't tell anyone 😄) Also includes many test fixes. |
Hi Manu!
Manu <[email protected]> writes:
Maybe `qtbot.mouseClick()` doesn't work on some platforms? I started calling the functions directly.
Removed the `mouseClick` function for those tests in this branch: https://github.com/m3nu/vorta/tree/issue/570/rename-archive
What's the window manager used while testing anyways? Can I reproduce it somehow? For Github Actions we use "Herbstluft".
Yes, you can absolutely reproduce it, but that would require learning a
bit about Debian packages. As you're probably more interested in CI
than in manual tests, the setup would looke something like:
1. Install Debian (the VM instructions we discussed previously will work
great).
2. sudo apt install debci
3. sudo debci setup
4. autopkgtest -apt-upgrade --no-built-binaries vorta -- lxc --sudo
autopkgtest-unstable-amd64
* This will do a one-off test, but the "debci" can be configured to
do this automatically.
-- this method was advocated on debian-devel by Daniel Leidert the 15 Feb 2021 13:59:58 +0100.
If you want more of a Github-style CI interface to logs I can setup a
runner on salsa.debian.org's gitlab instance. At this time the admins
have asked us to minimise its usage, because we don't yet have to
resources to run CI on every push for every package. For now, that
would mean you'd ask me to test a committish and I'd merge, update the
changelog, and push to a devel branch.
At any time you're of course free to apply for a salsa account, fork my
project, and test whenever you want. Gitlab also has a fancy API, so it
should be possible to get Salsa's CI results on your Github page.
Both of the failing tests above, open a new window. So I would suspect an issue with the window manager used for testing. E.g. if the new window is opened
in the background. Is Debian's CI only using xvfb or also a window manager?
Your deduction is 100% solid. Yes, at this time we use only xvfb, and
it's too late in the release cycle to introduce another variable like
adding a window manager Herbsluft
The testing script we use for GH Actions may be useful: https://github.com/borgbase/vorta/blob/master/.github/workflows/test.yml#L40
Thanks! I will consult that if necessary when I write the Gitlab one :-)
I've removed the dependency on a window manager and we now use almost the same command as Debian to run Linux tests. Hope this finally helps, @sten0. In this branch: https://github.com/m3nu/vorta/tree/issue/570/rename-archive
Wow thank you, that's very accommodating and very much appreciated :-D
|
Will do a new release now, since many pending PRs were merged. This is to get some more real-world testing. From my experience, very few users try the master branch. So it's better to make a release and then another one with fixes. (Don't tell anyone 😄)
Also includes many test fixes.
Thanks! I ran a bunch of self-test test runs, and did a fair bit of
manual testing, was satisfied by the results, and uploaded 0.7.3-1 to
Debian unstable. It ought to migrate in time for the hard freeze.
Was Thomas Waldmann consulted about the safety of the addition of the
lock-breaking feature? I understand that this feature is convenient for
users, but it feels unsafe, and I worry that users who elect to use it
might suffer from an inconsistent repo leading to a faulty subsequent
backup. I hope this is just a worry from ignorance about borg's
resiliency...but the reason I mention it is if users suffer disk
failure, then can't restore a usable backup, that they will then blame
Vorta whose reputation will then suffer.
Other than that, wow, 0.7.3 looks like a nice improvement all around,
and definitely the one we should be shipping in Debian 11 (bullseye)!
The kwallet support also seems like it will be a good way to accelerate
adoption, because KDE/Plasma has a number of good backup applications
that support this--however, none that use Borg, and of course I believe
in Vorta more than them :-p
|
Yes, he was very helpful in making the dialog more scary: #867 😄
Thanks for the nice feedback. It includes a considerable backlog of PRs that were lined up. Mostly by @samuel-w . So credit to him as well.
We won't add large features over the next week. Just bug fixes, if they should get reported. |
Hi!
jk-advedes <[email protected]> writes:
Gentlemen,
I need to apologize - I found this page by Google when debugging my failing backups. My issues have nothing to do with test suite.
No need to apologise! I'm actually really happy to hear from you, and
that you're following up on the new bug :-) No need to apologise,
because you did everything right, followed the bread crumb trail to this
issue, and are in the process of confirming what appears to be a
real-world issue and not just a test suite failure in Continuous
Integration on a less well tested architecture. The sum of all this is
that the technology and community is working :-)
Thank you
|
Manu <[email protected]> writes:
> Was Thomas Waldmann consulted about the safety of the addition of the lock-breaking feature?
Yes, he was very helpful in making the dialog more scary: #867 😄
Excellent! :-D
> Other than that, wow, 0.7.3 looks like a nice improvement all around
Thanks for the nice feedback. It includes a considerable backlog of PRs that were lined up. Mostly by @samuel-w . So credit to him as well.
Definitely :-) I had already cherry picked about eight of his
commits onto 0.7.1, and IIRC he was the one who added the kwallet
support (Thank you @samuel-w! Also, thank you for using appropriate
versioning in your PPA :-))
> and definitely the one we should be shipping in Debian 11 (bullseye)!
We won't add large features over the next week. Just bug fixes, if they should get reported.
Thanks, much appreciated! For future reference, if ever you spot a
security or data-loss related fix, please CC me so I can cherry pick the
commit[s] onto the 0.7.3-based Debian package. As long as they're
targetted fixes for serious or grave issues I should be able to get a
release team (or security team) exception to upload a fixed package
at anytime.
|
@m3nu The results on in for 0.7.3: still failing on armhf, still failing on i386, and while tests pass on amd64, they don't always pass cleanly:
https://tests.reproducible-builds.org/debian/logs/unstable/amd64/vorta_0.7.3-1.build2.log.gz
I seem to remember you mentioning that you were considering cleaning up the many Qt objects in another issue? Given this, and I suspect, yes, maybe that's actually needed. My hypothesis is that this segfault is related to those. Apologies for the poor debugging info, I haven't yet been able to reproduce this on my local machine, but was able to reproduce it twice on tests.reproducible-builds.org infrastructure. I'll see what I can do to get better debugging output! |
That's mostly good news. All the previous test failures are resolved. 👍 This just leaves #456 , which I can already reproduce reliably. The failure on ARM was caused by a disappearing file. This should be caught anyways. So I added an exception and wait time to the previous test: (PR : #877
|
This should be fixed in the current Would appreciate feedback is your reported bug here is addressed. |
Should be fixed by #898. The last "hangs" I noticed on macOS were due to a bad mutex setup in |
On i386 the following fail: test_archive_mount, test_archive_extract, test_archive_diff, test_borg_prune, test_repo_unlink, test_repo_add_success, test_create, test_scheduler_create_backup
On armhf (32bit, used on the Raspberry pi 3, I think): same.
These tests also occasionally fail on Intel/AMD 64bit CI, and I've received release critical bugs about this issue (Debian bugtracker) for arm64 (used on the Raspberri pi ≥ 4, I think). Flakiness on arm64 seems like it will affect Borg on laptops using Apple M1 chip, along with future ARM laptops, and I also wonder if it's useful fuzzing for amd64 (aka: x86-64).
Vorta for Debian is currently based on 0.7.1, plus the following patches (mostly cherry picked from master):
https://salsa.debian.org/python-team/packages/vorta/-/tree/debian/master/debian/patches
I've also increased all timeouts to 10 (where they were previously << 10) or 20 (where they were previously ≥10) seconds. For the moment, this patch appears to have mitigated the rare nondeterministic test failures on amd64, and it remains to be seen whether it mitigates future failures on arm64.
The most interesting error output from
tests.reproducible-builds.org
isand my primary concern is this might indicate the existence of some corner-case failures in Vorta's backup scheduling, possibly due to race conditions or missing locking conditions.
Logs include a list of the full stack of all installed dependencies, and the actual commands used to initiate tests.
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/i386/vorta.html
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/armhf/vorta.html
and here are some logs from another one of our three CI networks:
https://ci.debian.net/data/autopkgtest/testing/arm64/v/vorta/8520699/log.gz
https://ci.debian.net/data/autopkgtest/testing/i386/v/vorta/8521042/log.gz
These logs were cited in Debian Bug #976134
It's of course also possible that this is a bug in another Python package, or in Qt, or in Borg. If I've made a mistake anywhere, please let me know, and I'll humbly accept responsibility and fix it :-)
Merry Christmas and Happy New Year!
Nicholas
The text was updated successfully, but these errors were encountered: