Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logging overhaul, to solve issues caused by stdout/stderr redirection #4175

Merged
merged 3 commits into from
Jun 5, 2021

Conversation

ferdnyc
Copy link
Contributor

@ferdnyc ferdnyc commented Jun 4, 2021

This is a major rewrite of classes.logger, which I'm hopeful solves the long-standing issues caused by the redirection of sys.stdout and sys.stderr for capture purposes.

This PR doesn't do away with the redirection, instead it does it cleanly and in a manner that doesn't break Python features that depend on the standard I/O streams.

In the new logging, the StreamToLogger shims that take over for sys.stdout and sys.stderr return to buffering the messages written to the streams, logging only when the stream is flush()ed. In my testing, this means that multi-line texts like tracebacks will be captured as a single log message. Or in the worst case, as a very small number of partial messages, when other output interrupts the output when it's only partially complete.

All output that's captured on sys.stdout or sys.stderr is immediately passed along to the real stream outputs; the only thing the shim does is buffer a copy first. When a flush() comes through, the buffer is then logged to an OpenShot.stderr child logger of our main OpenShot logging context, and those log messages are suppressed from being re-echoed to the console, to avoid redundant duplication of output.

This means that any output sent to the standard output streams, whether by OpenShot, by Python, or by some other code, will be displayed on the terminal unaltered from its original form (no more logging decoration), which should make tools that depend on the format of those messages much happier. (I'm hopeful this includes Sentry, as well.)

StreamToLogger is now backed by a filter on the console stream handler,
which rejects messages that originated ON stdout/stderr.

Additionally, StreamToLogger's capture now only logs after the output
is flushed, meaning multi-line tracebacks and the like will be logged
as a single message.
# logging record, if present.
# {"source": "stream"} will be attached to any records generated
# by the redirection, thanks to LoggingAdapter)
filt = StreamFilter(lambda x: x != "stream")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it! Nice

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a bit of an overcomplicated design (as the fact that it requires 5 lines of comment to explain one line of code makes pretty clear)... originally I thought I was going to need two filters, one on the stream handler and one on the file handler, and I wanted to write it in a way that I could use the same class for both filters.

In the end, the issue was the propagation to the root logger, and one I fixed that I didn't need a file handler filter. (Which makes sense, we want ALL log messages to go to the logfile, so what's there to filter?) So really the filter class could just have bool(source != "stream") hardcoded into it, which would probably be clearer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did indeed end up eliminating it. The cleverness just didn't justify the complexity.

sys.stderr = handlers['stderr']
if (getattr(sys, 'frozen', False)
or sys.stdout != sys.__stdout__):
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't recall why this frozen block was needed. Do you recall? If so, we should add a comment here to demystify it, lol

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly have never had any idea. I assumed it had something to do with the AppImage, but that's pure conjecture.

or sys.stdout != sys.__stdout__):
return
sys.stdout = StreamToLogger(sys.stdout, logging.INFO)
sys.stderr = StreamToLogger(sys.stderr, logging.WARNING)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using info.LOG_LEVEL_CONSOLE and info.LOG_LEVEL_FILE here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are the levels the messages are emitted AT, not the filtering levels for the loggers, so I'd say no. (In theory sys.stderr could be info.ERROR, but then we'd have a lot more errors in the logs. 😆 )

handlers['stream'].setLevel(level)

"""Adjust the minimum log level for output to the terminal"""
sh.setLevel(level)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: is there a blank line at the end of this file? Or is the PR just not showing it... 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file ENDS with a newline (my editor makes sure of that automatically), but I'm not sure there's a blank line. I can check.

Copy link
Contributor Author

@ferdnyc ferdnyc Jun 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, turns out there was a surprising amount of nuance to that question.

Like I said, the file ended on a newline, my editor is set to make sure of that. (Actually, thanks to the .editorconfig file at the root of the repo, so is any editor that respects those files.) That means that the file is protected against concatenation damage and other issues, so you could do this:

$ echo "=======" >> logger.py; tail -3 logger.py
    """Adjust the minimum log level for output to the terminal"""
    sh.setLevel(level)
=======

You'd THINK it would also mean that an editor would let you position your cursor on line 136, since the last line of the file (line 135) ends with a newline. But surprisingly enough, the first two editors I checked (vim and gedit) didn't; they eat the final newline, so in those editors the last line of the file indeed appeared to be line 135.

So, I added a second newline to the end of the file, ensuring that editors would show a blank line at the end.

...Naturally, my own editor (Github Atom) doesn't eat the final newline, so in Atom it now appears as though there are two blank lines at the end. 🤣 :

image

@jonoomph
Copy link
Member

jonoomph commented Jun 5, 2021

Thanks @ferdnyc, this looks very nice! I was also spending too many hours messing with this logging class. I would make a bunch of changes... almost solve it, and then get angry and revert it all, lol. I'm very happy you found a pretty slick solution, looking forward to testing it out. Feel free to merge this into the integrate-sentry-tracing branch when you are ready!

@ferdnyc
Copy link
Contributor Author

ferdnyc commented Jun 5, 2021

Thanks @ferdnyc, this looks very nice! I was also spending too many hours messing with this logging class. I would make a bunch of changes... almost solve it, and then get angry and revert it all, lol.

Ayup. The reason it took me ALL night is, this was version... three, I think. The first two went nowhere.

In the end it all came down to one line:

# Don't pass messages on to root logger
log.propagate = False

Until I stopped propagation to the root logger, it would still get the messages and do its own thing, so I had no real control over what was done with them. I didn't figure that out until I gave every level a completely different logging format. Then I finally figured out where the messages I couldn't seem to eliminate were actually coming from.

(Reading this flowchart in the documentation was also immensely helpful, in the end.)

image

I'm very happy you found a pretty slick solution, looking forward to testing it out. Feel free to merge this into the integrate-sentry-tracing branch when you are ready!

Will do, first I'm gonna fire off another CI run (which I believe should be fixed now, a couple of hours ago I merged my libopenshot changes that fixed the Python bindings — if not, it'll be fixed as soon as the PPAs are regenerated), and also see what bug is up Codacy's ass this time.

ferdnyc added 2 commits June 5, 2021 01:42
- Eliminate the unnecessary filter predicate
- Remove the dicts I got rid of
@ferdnyc ferdnyc merged commit 5031f2d into OpenShot:integrate-sentry-tracing Jun 5, 2021
@ferdnyc ferdnyc deleted the logging-redux branch June 5, 2021 10:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants