refactor: Job dict object as Class #234
Merged: in03 merged 38 commits into main from feature/issue-202-Refactor-job-dict-object-as-Class on Nov 16, 2022
Conversation
in03 reviewed on Nov 16, 2022
I reckon we've got it!
in03 approved these changes on Nov 16, 2022
Original issue description
Problem
We've been living dangerously for the past couple of years since our humble beginnings: we're still passing a dictionary around as our job object. This isn't great. We don't get the type validation or IntelliSense that comes with classes. And, ignoring the FP vs OOP debate, we don't get instance methods or shared state between closely related functions. Most of our handlers could join the job class. We're iterating the same jobs over and over again to filter, validate and prompt based on different criteria of the same data, which is very inefficient. We could do most of this in `__init__`.
Dataclasses
Dataclasses do away with writing the `__init__` section by hand, making it easy to work with lots of variables. They also implement common dunder methods like `__repr__` and `__eq__`, so instances can be printed and compared out of the box. They keep data-centric classes tidy, helping keep bugs to a minimum. Implementing dataclasses wins us all of that. Unfortunately, we have some hurdles to make them work.
Mutability
Currently, jobs carry data that can be treated as immutable, mostly in the form of clip properties and project metadata. Some of this metadata we manipulate to reflect user choices and detected paths, notably:
- `proxy` - set to `None` to re-render offline media
- `proxy_media_path` - set to the determined output path

We use the existing keys to save us from extra complications later: these are big dictionaries, and potentially a lot of them to iterate through! Unfortunately this is slow 🐌, since we're constantly handling mutable data and there's no static type safety. Just `KeyError`, `TypeError`, or worst of all, `NoneType has no attribute ___`.
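A quick sketch of the failure modes described above, using invented keys rather than the project's real job dict:

```python
# A dict-shaped job: nothing stops a typo or a forgotten None check.
job = {"proxy": "Offline", "proxy_media_path": None}

# A misspelled key only fails at runtime -- no IDE warning beforehand:
try:
    job["proxy_media_pth"]
except KeyError as e:
    print("KeyError:", e)

# And a None value fails even later, at the point of use:
try:
    job["proxy_media_path"].upper()
except AttributeError as e:
    print("AttributeError:", e)
```

Typed attribute access on a class surfaces both problems earlier: the typo is caught by linters and IDEs, and the optional field is visible in the annotation.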
Parsing
We need to parse certain values before we use them in certain places. That's exactly what the `__init__` section is for, handling the instance variables... but that's a lot of handling! We don't need to handle them all. To avoid that, we can update only the ones we need using the `__post_init__` dunder.

Clip properties form the main 'chunk' of a job. Then there are project properties that we add later.
Proposal
A traditional class, `Job`, with immutable `SourceMetadata` and `ProjectMetadata` dataclasses, and dynamic variables to easily update `media_pool_item`, proxy status and any changing paths.

All fields are set to their target types, like `bool` for `h_flip` and `v_flip`, and a list for horizontal and vertical `resolution`, instead of `str` for everything. This means the worker gets what it expects, pre-parsed.

Instead of passing proxy and path settings individually, we pass the entire settings object; then we can validate it as type `SettingsManager`.

Any handlers that don't prompt the user can become class methods, like getting a collision-free output path.
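The proposed shape could be sketched like this. The class names mirror the proposal (`Job`, `SourceMetadata`, `ProjectMetadata`), but the fields and the method body are invented placeholders, not the real implementation:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceMetadata:
    file_name: str
    v_flip: bool          # target type, not "On"/"Off" strings
    resolution: list      # [width, height], pre-parsed for the worker


@dataclass(frozen=True)
class ProjectMetadata:
    project_name: str
    timeline_name: str


class Job:
    def __init__(self, source: SourceMetadata, project: ProjectMetadata, settings):
        self.source = source      # immutable clip properties
        self.project = project    # immutable project properties
        self.settings = settings  # whole settings object passed in, validated upstream
        self.output_path = None   # dynamic: updated as paths are determined

    def collision_free_path(self, candidate: str) -> str:
        # Handlers that never prompt the user can live on the class.
        # (A real implementation would check the filesystem for clashes.)
        return candidate


job = Job(
    SourceMetadata("clip.mov", False, [1920, 1080]),
    ProjectMetadata("MyProject", "Timeline 1"),
    settings={},
)
print(job.source.resolution)  # [1920, 1080] -- typed, pre-parsed access
```

`frozen=True` enforces the immutability split: clip and project metadata can't be mutated after creation, while the dynamic state lives on `Job` itself.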
Moving forward from here
At some point we'll want to introduce chunking. Since the `resolve` module actually instantiates all the `Job`s with `get_proxy_jobs()`, it returns them to the queuer class. If we want them to be chunked, we need to re-instantiate each job, since every job should be a chunk. We don't want to re-init everything again by making new classes...

We can merge the `project_metadata` dataclass with `source_metadata`. That way we don't have to iterate each `source_metadata` object and add the same `project_metadata` data.

We can make `MediaPoolRefs` a singleton class. That way we can register all the `media_pool_items` from the `Resolve` class and just import `MediaPoolRefs` in the `queuer` module. The reason we can't ship them with `source_metadata` is that they are unhashable, referencing a pointer in memory to a proprietary API, and they will fail transport in Celery. We also don't want to remove them, and we want easy access to them from other modules. A singleton provides that.

There seems to be some discouragement of singletons in Python; many call them an anti-pattern. We may be able to substitute a plain module: all global code and functions. We'd just have to be careful that we are indeed importing the same module instance whenever we do.
If that works out well for us, it may be an idea to re-evaluate `SettingsManager` as a plain module.
Perhaps instead of returning `jobs` when we call `get_proxy_jobs`, we can add `get_proxy_jobs` and other sister functions to the `Resolve` class. Then we can save any variables we need to `self` and call them as methods.

closes #202
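That last idea might be sketched as follows; the method name comes from the proposal, but the body and fields are invented placeholders:

```python
class Resolve:
    def __init__(self):
        self.jobs = []
        self.project_metadata = None

    def get_proxy_jobs(self):
        # Instead of returning jobs to the queuer, stash shared state on
        # self so sister methods (and later, chunking) can reuse it.
        self.project_metadata = {"project_name": "MyProject"}
        self.jobs = [{"clip": "clip_01.mov"}, {"clip": "clip_02.mov"}]
        return self


resolve = Resolve().get_proxy_jobs()
print(len(resolve.jobs))  # 2 -- jobs live on the instance, not a return value
```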