Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial integration of Sentry (crash monitoring) #4172

Merged
merged 6 commits into from
Jun 9, 2021

Conversation

jonoomph
Copy link
Member

@jonoomph jonoomph commented Jun 3, 2021

Initial integration of Sentry tracing, to better track stack-traces and bugs in OpenShot. Removing the old hooks and HTTP posts to openshot.org, since they would be redundant. Sentry is still gated behind our 'send_metrics' setting, just like before.

Sentry.io offers far better error reporting than my crude home-made version, and will allow more developers to monitor the exceptions, group them, and convert them into useful information and ultimately bug fixes.

jonoomph added 2 commits June 3, 2021 16:35
…nd bugs in OpenShot. Removing the old hooks and HTTP posts to openshot.org, since they would be redundant. Sentry is still gated behind our 'send_metrics' setting, just like before.
@jonoomph jonoomph changed the title Initial integration of Sentry (crash monitoring) WIP: Initial integration of Sentry (crash monitoring) Jun 3, 2021
@jonoomph
Copy link
Member Author

jonoomph commented Jun 3, 2021

This generally works great, except for one big issue. Our custom logger, which redirects all STDERR to both the console and our log file... causes a bunch of individual traces on Sentry. Each log.error() level message generates a separate trace event... which is of course not correct.

If I remove the STDERR redirection, we lose exceptions from our log file, and we lose certain output from other programs I think (such as blender output)... I need to do more testing around this though.

@ferdnyc
Copy link
Contributor

ferdnyc commented Jun 3, 2021

Hm. Interesting.

...Do we have a Sentry organization already?

image

(If I "click here"...)

image

Copy link
Collaborator

@JacksonRG JacksonRG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Can't wait to see the data start coming in.

@ferdnyc
Copy link
Contributor

ferdnyc commented Jun 3, 2021

@jonoomph

This generally works great, except for one big issue. Our custom logger, which redirects all STDERR to both the console and our log file... causes a bunch of individual traces on Sentry. Each log.error() level message generates a separate trace event... which is of course not correct.

Did you turn on attach_stacktrace? The documentation says it's off by default, and will cause that. Stacktraces are always attached to exceptions, so we'd want to leave that option off I'd think.

@ferdnyc
Copy link
Contributor

ferdnyc commented Jun 3, 2021

(Also, setting a development sample_rate of 1.0 might be excessive. That rate is the sampling of generic error events, and sending 100% of all log.error()s to Sentry seems like overkill even in development.)

@ferdnyc
Copy link
Contributor

ferdnyc commented Jun 3, 2021

Meanwhile, I'll try to figure out why the Python bindings in the libopenshot PPA builds currently seem to think openshot.Mask() is an abstract class. Doesn't happen in my Fedora builds!

(Edit: No clue. Works fine in my Ubuntu Focal VM as well, with the newest libopenshot PPA installed. Doing a full apt upgrade now, in case it's a weird interaction with some other package.)

@ferdnyc
Copy link
Contributor

ferdnyc commented Jun 3, 2021

(Also, setting a development sample_rate of 1.0 might be excessive. That rate is the sampling of generic error events, and sending 100% of all log.error()s to Sentry seems like overkill even in development.)

Then again, maybe not. I mean, we should strive to eliminate all error conditions, and maybe having them tallied in Sentry will motivate us. As long as they're captured in a lightweight way, getting totals on how frequently they occur would be good data to have, even if we mostly ignore it.

(This presumes their filtering and drill-down is robust enough that "mostly ignore it" is a workable option. I haven't looked at the reporting interface yet, still trying to figure out which way I'm supposed to create my account.)

@jonoomph
Copy link
Member Author

jonoomph commented Jun 4, 2021

@ferdnyc Just sent you an invite to Sentry

@jonoomph
Copy link
Member Author

jonoomph commented Jun 4, 2021

@ferdnyc If I disable our Logger redirection of stdout and stderr, sentry will log a single trace event for an error. It includes a stack trace in my examples (at least for unhandled exceptions, such as 1 / 0). But with our Logger redirecting things to stderr, it generates a ton of noise on Sentry.

@jonoomph
Copy link
Member Author

jonoomph commented Jun 4, 2021

Regarding the Sentry filtering and grouping... it's pretty great. And lots of great ignore rules. However, I want to make sure we have as much detail about errors as possible, including stack traces when available, and also minimize noisy useless info. An actual exception should generate a single Sentry trace ideally... and not 10 of them, lol

@ferdnyc
Copy link
Contributor

ferdnyc commented Jun 4, 2021

@jonoomph Ah, gotcha. Probably for the same reason this happens:

A total mess on sys.stdout, courtesy of our redirection

>>> from classes.logger import log, reroute_output
>>> from pprint import pprint
>>> import sys
>>> reroute_output()
>>> pprint(sys.modules)
INFO logger: {
INFO logger: 'PyQt5'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'PyQt5.QtCore'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'PyQt5.sip'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '__future__'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '__main__'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_abc'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_bootlocale'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_codecs'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_collections'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_collections_abc'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_compat_pickle'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_datetime'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_frozen_importlib'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_frozen_importlib_external'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_functools'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_heapq'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_imp'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_io'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_locale'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_operator'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_pickle'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_posixsubprocess'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_queue'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_signal'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_sitebuiltins'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_socket'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_sre'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_stat'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_string'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_struct'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_sysconfigdata__linux_x86_64-linux-gnu'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_thread'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_uuid'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_warnings'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_weakref'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: '_weakrefset'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'abc'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'abrt_exception_handler3'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'array'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'atexit'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'builtins'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'classes'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'classes.info'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'classes.logger'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'codecs'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'collections'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'collections.abc'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'contextlib'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'copy'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'copyreg'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'datetime'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'encodings'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'encodings.aliases'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'encodings.latin_1'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'encodings.utf_8'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'enum'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'errno'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'functools'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'genericpath'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'google'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'google.logging'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'greplin'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'grp'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'heapq'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'importlib'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'importlib._bootstrap'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'importlib._bootstrap_external'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'importlib.abc'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'importlib.machinery'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'importlib.util'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'io'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'itertools'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'keyword'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'language'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'language.openshot_lang'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'lazr'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'linecache'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'logging'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'logging.handlers'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'marshal'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'math'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'moksha'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'mpl_toolkits'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'operator'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'os'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'os.path'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'paste'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'pickle'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'pkgutil'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'platform'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'posix'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'posixpath'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'pprint'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'pwd'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'queue'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 're'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'readline'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'reprlib'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'rlcompleter'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'ruamel'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'select'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'selectors'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'signal'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'sip'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'site'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'socket'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'sphinxcontrib'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'sre_compile'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'sre_constants'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'sre_parse'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'stat'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'straight'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'string'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'struct'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'subprocess'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'sys'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'sysconfig'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'syslog'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'systemd'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'systemd._journal'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'systemd._reader'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'systemd.id128'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'systemd.journal'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'threading'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'time'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'token'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'tokenize'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'traceback'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'types'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'typing'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'typing.io'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'typing.re'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'usercustomize'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'uuid'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'warnings'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'weakref'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'xstatic'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'xstatic.pkg'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'zipimport'
INFO logger: :
INFO logger: 
INFO logger: ,
INFO logger: 'zope'
INFO logger: :
INFO logger: 
INFO logger: }

(I've always hated the output redirection, TBH, for exactly that reason. It also breaks things like profilers that expect to dump their data collection to stdout in a certain format. First thing I always do when running OpenShot interactively is sys.stdout = sys.__stdout__ to un-break normal output.)

I kind of feel like it should be possible to capture stdout into the logger, without also affecting stdout. Like, we should be able to just impose ourselves into the stream, copy whatever's sent there, but then let it go to the output exactly as we found it, only modifying the version we send to the logfile output handler.

It'd probably require some careful stream processing, but it should be possible. It'd mean that things like uncaught exceptions aren't shown as log messages, on the console — but that's a good thing, really. And they'll still be log-decorated in the actual log file.

@ferdnyc
Copy link
Contributor

ferdnyc commented Jun 4, 2021

It took me ALL night, but I believe I have solved the logging/redirection issue. #4175 is a complete overhaul of classes.logger, targeting this PR branch (not develop), as I'm hopeful it'll be most immediately useful here. From my PR summary:

This is a major rewrite of classes.logging, which I'm hopeful solves the long-standing issues caused by the redirection of sys.stdout and sys.stderr for capture purposes.

This PR doesn't do away with the redirection, instead it does it cleanly and in a manner that doesn't break Python features that depend on the standard I/O streams.

In the new logging, the StreamToLogger shims that take over for sys.stdout and sys.stderr return to buffering the messages written to the streams, logging only when the stream is flush()ed. In my testing, this means that multi-line texts like tracebacks will be captured as a single log message. Or in the worst case, as a very small number of partial messages, when other output interrupts the output when it's only partially complete.

All output that's captured on sys.stdout or sys.stderr is immediately passed along to the real stream outputs; the only thing the shim does is buffer a copy first. When a flush() comes through, the buffer is then logged to an OpenShot.stderr child logger of our main OpenShot logging context, and those log messages are suppressed from being re-echoed to the console, to avoid redundant duplication of output.

This means that any output send to the standard output streams, whether by OpenShot, by Python, or by some other code, will be displayed on the terminal unaltered from its original form (no more logging decoration), which should make tools that depend on the format of those messages much happier. (I'm hopeful this includes Sentry, as well.)

…#4175)

StreamToLogger is now backed by a filter on the console stream handler,
which rejects messages that originated ON stdout/stderr.

Additionally, StreamToLogger's capture now only logs after the output
is flushed, meaning multi-line tracebacks and the like will be logged
as a single message.
@jonoomph
Copy link
Member Author

jonoomph commented Jun 5, 2021

@ferdnyc This seems to work great, as far as I can tell! Nice work! 🏆 I am seeing stack-trace output in both our log files and only a single exception sent to Sentry! Woot!

@jonoomph jonoomph changed the title WIP: Initial integration of Sentry (crash monitoring) Initial integration of Sentry (crash monitoring) Jun 7, 2021
@jonoomph
Copy link
Member Author

jonoomph commented Jun 7, 2021

Another interesting idea... is to catch the before_send event (should be possible with Sentry), collect the current trace ID, and display an error icon on the main toolbar in OpenShot (maybe a little bug or exclamation mark). The user has 30 minutes to click the bug, and enter a text description of what happened before the bug... and Sentry has an API we can call, to append the user feedback to the actual trace/exception report. This sounds like a really cool idea though!

I like this idea vs opening a dialog, in those cases where we might have a bunch of errors, or a repeating error... I would hate to block the entire UI continuously with a user-feedback dialog.

@ferdnyc
Copy link
Contributor

ferdnyc commented Jun 8, 2021

I definitely hate the idea of ever forcing a dialog on the user, TBH most of our EXISTING message dialogs should go.

(The user never, EVER needs to be interrupted to be told we couldn't read an icon or a transition image, with the possible exception of it being one that's actually referenced in the project file. But when building the models? Nuh-uh. Totally not their problem.)

Regarding before_send, would the Sentry message delivery then be held for the user's input, in that scenario? I'm just concerned... that sounds like a good way to lose a lot of exception data, when they never submit anything (and manage to crash the program or hard-reboot the OS without giving us a chance to complete the delayed report).

If it's possible in the Sentry API to attach user feedback to existing data after-the-fact, so we could deliver the initial data immediately and then update it if/when they provide details, that feels safer to me. In which case, I don't see the point of the 30-minute timeout... can we just leave the option open at least through the end of the session? (Or the next session, if it's crash-recovery feedback.) If not... well, indefinitely?

(Fedora's Abrt lets users open bugzilla tickets off of any unreported crash log that's still in its cache -- hours, days, weeks, even months old. I'm not saying we have to go that far, but 30 minutes is an awfully short arbitrary window. Especially if you're in the middle of trying to actually do something productive with the program that just crapped itself, and don't really have 5 minutes to spare RIGHT NOW so you can describe how.)

@jonoomph
Copy link
Member Author

jonoomph commented Jun 9, 2021

Yeah, it would immediately send the trace to Sentry... but we have 30 minutes of additional time to collect user input (if we want to) and POST it to their API to append to our trace.

@jonoomph jonoomph merged commit d014ca4 into develop Jun 9, 2021
@jonoomph
Copy link
Member Author

jonoomph commented Jun 9, 2021

I've tested this on all 4 builders/installers, and it seems to work (best I can tell).

@jonoomph jonoomph deleted the integrate-sentry-tracing branch June 9, 2021 04:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants