-
Notifications
You must be signed in to change notification settings - Fork 927
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't capture hard errors for old cagg format #7903
base: main
Are you sure you want to change the base?
Conversation
28637ce
to
dc1c1ae
Compare
Note that this is not capturing errors it should capture, it seems. Hence the failures. Looking into it. |
dc1c1ae
to
4dc601c
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #7903 +/- ##
==========================================
+ Coverage 80.06% 81.96% +1.89%
==========================================
Files 190 249 +59
Lines 37181 46162 +8981
Branches 9450 11575 +2125
==========================================
+ Hits 29770 37836 +8066
- Misses 2997 3766 +769
- Partials 4414 4560 +146 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
f365ec7
to
d708c4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anyway we can test this? Looks like error handler is not covered by codecov.
There is unfortunately no easy way to test this. Something like this might be doable, but it is a bigger job, so suitable for a separate PR:
The same applies to #7897 and #7893. Nevertheless, I will take a look and see if I can create something like the above with a moderate effort. |
I think it is worthwhile working on tests, especially since there are several PRs trying to address similar issues so it seems important to get this right. While the changes in the PR seem reasonable, there's no way to currently know we are actually improving something or just making it worse. So, why should we merge some changes that we don't know works for sure? |
@mkindahl Is it possible to test this by storing a partial agg result in a table, then modify that to some garbage value, then run finalize over it? It should fail in the deserialization function I guess... |
Hmmm... now I was looking to it because I'm sure we had those tests and we had: 6c38c60 But at some point later we removed it :-( |
It was removed in 50578e4 because it caused issues with the test suite. |
Yes, but that will only test one part of the code. A "hard" error should still cause an abortion, and this part is not dealt with using this approach. |
318a081
to
a6a481a
Compare
Looking at the test, it seems like we have no tests to check that the repair works, so added it here. |
The repair can only handle very specific issues, so we cannot do random changes. I added a test that truncates the last 16 bytes of the partial to generate the "broken" partial that this code can repair. I also added debug printouts so that we can see that the repair is attempted and actually done. Since these are debug printouts, they print a little more than just the bare minimum, but it at least checks that the repair is attempted and works. |
a6a481a
to
4d62ea6
Compare
If an error occurs inside the deserialization function for continuous aggregates, we are using the old numeric format and this was captured inside `inner_agg_deserialize` and the format was "repaired". However, this captured hard errors as well, which can cause cascading errors if execution continues after the error. This commit fixes that by capturing data exception errors, internal error (which is the default for elog()) and protocol error (c.f. `pq_getmsgint64` used by `numeric_deserialize`), but re-throwing all other errors. It also adds tests to make sure that the repair is done and not generates an error.
4d62ea6
to
673027d
Compare
If an error occurs inside the deserialization function for continuous aggregates, we are using the old numeric format and this was captured inside
inner_agg_deserialize
and the format was "repaired". However, this captured hard errors as well, which can cause cascading errors if execution continues after the error.This commit fixes that by capturing data exception errors, internal error (which is the default for elog()) and protocol error (c.f.
pq_getmsgint64
used bynumeric_deserialize
), but re-throwing all other errors.It also adds tests to check that a repair is attempted on old format, and that it does not throw en error.