
feat: Queuer-side progress indication #190


Merged
in03 merged 44 commits into main from feature/issue-145-Queuer-side-progress-indication
Aug 8, 2022

Conversation

github-actions[bot]
Contributor

@github-actions github-actions bot commented Jul 8, 2022

It would be really nice if we could have some indication of progress (ideally a progress bar) showing how far through the current jobs we are. Alive progress handles multiple nested progress bars. That could be really cool, although I'm not sure how feasible it is depending on how many jobs we have queued. Otherwise an overall progress bar would be fine. Some single-line logging on which files have finished would be awesome too.
closes #145

in03 added 5 commits July 8, 2022 12:58
- Removed some dead code
- Turned some very detached statements into utils functions
- Centralised logging statements
Update task progress from ffmpeg_process using task id and Redis
connection.
MVP atm. Set up pretty basically. Currently each task gets its own bar.
- Uses Redis pub/sub to communicate encoding progress updates from the worker (see the sketch after this list)
- Basic progress bar routing based on Celery task ID

Next steps:
- Get average progress from all reporting tasks instead of multiple bars
- Get task group active worker count as single-line spinner
- Secondary progress bar to display discrete total of tasks done
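A minimal sketch of the worker-side half of that pattern, assuming a redis-py client and a channel name of the form `task-progress:<task_id>` (both assumptions, not the project's actual identifiers): the encoding task publishes its percent-complete on a channel keyed by the Celery task ID, so the queuer can route each message to the matching bar.

```python
# Hypothetical helper, not the project's actual code: publish one progress
# update for an encoding task over Redis pub/sub, keyed by Celery task ID.
import json

import redis


def publish_progress(redis_url: str, task_id: str, percent: float, output_file: str) -> None:
    connection = redis.Redis.from_url(redis_url)
    payload = {"task_id": task_id, "percent": percent, "output_file": output_file}
    # One channel per task lets the queuer pattern-match "task-progress:*"
    # and route updates to the right progress bar.
    connection.publish(f"task-progress:{task_id}", json.dumps(payload))
```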
@in03 in03 changed the title Queuer-side progress indication feat: Queuer-side progress indication Jul 12, 2022
in03 and others added 19 commits July 12, 2022 16:03
Looks like it's working! Had to refactor a lot. And more refactoring
will be necessary. Looking like a bit of a dog's breakfast.

But now we have a secondary bar showing discrete total progress with
a 'last encoded' data field. Nice.
- Worker spinner displays host machine count, will alter to worker count
- Task progress bar displays output file name instead of task_id now
Rationale for changes:
Job expiry allows jobs that are never received to be cleared, so they
don't stick around until a worker picks them up two days later.
This should also deal with broken tasks that cannot be received for some
reason.
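A hedged sketch of how that expiry could be attached when queuing (the task name, arguments and window below are illustrative, not the project's): Celery's `apply_async` accepts an `expires` value, so a job nobody receives in time is revoked rather than run days later.

```python
# Illustrative only: queue a job that expires if no worker picks it up in time.
from celery import Celery

app = Celery("proxy_encoder", broker="redis://localhost:6379/0")


@app.task(name="encode_proxy")  # hypothetical task name
def encode_proxy(source_path: str) -> str:
    ...  # run ffmpeg here
    return source_path


# Revoke the job if it hasn't been received within 24 hours.
result = encode_proxy.apply_async(args=["/media/clip_001.mov"], expires=60 * 60 * 24)
```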

We're also beginning to make terminology changes that fully commit to Redis,
since we're relying on it for progress reporting and the webapp monitor
and can't use other brokers now. Hence 'job' instead of Celery's 'task'.
We want to abstract Celery's settings a bit since the end user doesn't
need the granular access anymore.
- Renamed 'celery' section to 'broker'

- Dropped the 'result_backend', 'host_address' and 'broker_url' keys
in favour of a simple 'url'

- Dropped 'flower_url' since it's being replaced by the webapp monitor
Also removed the 'mon' subcommand that relies on the removed 'flower_url'.
It's easily enough saved as a desktop shortcut, but will likely be
superseded by the webapp monitor anyway.
Moved 'worker_concurrency', 'worker_prefetch_multiplier' and
'worker_max_tasks_per_child' to the celery.py app.conf.update section (sketched below).

These settings force Celery workers to run tasks one at a time.
That means one task per process at a time. Higher concurrency on
a host requires more processes to be run. This is the workaround
for prefork not being supported on Windows.

As such the worker launcher is set up to work with this pattern only.
There's no point supporting other methods for other OSes, since
they're all compatible with this one.

At this point, changing the concurrency settings would break the
launcher anyway.
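A sketch of what that app.conf.update section might look like with the one-task-per-process settings described above (module layout assumed; the setting names are Celery's own):

```python
# celery.py (sketch): force each worker process to handle one task at a time.
from celery import Celery

app = Celery("proxy_encoder")

app.conf.update(
    worker_concurrency=1,          # one task per worker process at a time
    worker_prefetch_multiplier=1,  # don't reserve extra tasks ahead of time
    worker_max_tasks_per_child=1,  # recycle the process after each task
)
```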
app.config_from_object expects keys to match the Celery documentation, of
course. Key names are a little obtuse for end users not touching Celery.

Since dropping support for other brokers, the Redis result backend
and broker URL are the same. We just remap the terms in a new dict.
It's another thing to maintain, but hopefully won't change much.
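A minimal sketch of that remapping, assuming the user config exposes a single 'url' key under the renamed 'broker' section (the user-facing key names are assumptions; the Celery setting names are the documented ones):

```python
# Sketch: translate the user-facing 'broker' section into Celery's own key names.
from celery import Celery

app = Celery("proxy_encoder")

user_settings = {"broker": {"url": "redis://localhost:6379/0"}}

# With Redis as the only supported broker, broker and result backend share one URL.
celery_settings = {
    "broker_url": user_settings["broker"]["url"],
    "result_backend": user_settings["broker"]["url"],
}
app.conf.update(**celery_settings)
```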
The broker command acts as a gateway to Celery commands.
This is necessary since access to Celery is buried deep within the pipx
virtualenv if following the recommended install procedure.

Initial support only. It doesn't support any options yet, just the command.
We need to figure out how to pass options ('-f', '-Q', etc.) unparsed
by Typer (one possible approach is sketched below).

Also fixed the purge command. It previously only purged the default queue
('celery'). We now pass the version constraint key to purge jobs on our
version.
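One possible way to get those unparsed options through Typer, not what this commit implements: Click's context settings let a command accept unknown options and extra args, which can then be forwarded verbatim to the Celery CLI inside the pipx virtualenv (the command layout here is an assumption).

```python
# Sketch of a passthrough command: everything after the subcommand is handed
# to the Celery CLI untouched.
import subprocess

import typer

app = typer.Typer()


@app.command(context_settings={"allow_extra_args": True, "ignore_unknown_options": True})
def celery(ctx: typer.Context):
    """Forward unparsed options (e.g. -f, -Q) straight to the Celery CLI."""
    subprocess.run(["celery", *ctx.args])
```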
- Refactored TaskTracker as broker and ProgressTracker classes
- Started trying to fix average progress... Not there yet
- Added worker pickup/success/fail logging above progress
- Added logger debug messages above progress
- Removed unused imports
in03 and others added 18 commits July 27, 2022 13:01
- Added overall progress bar
- Check prevents progress going backwards, may need additional work
- Worker status updates are printed on the same line, missing all but "STARTED"
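The backwards-progress check presumably amounts to something like this clamp (a sketch, not the project's code): only advance the displayed value when a report exceeds what is already shown.

```python
# Sketch: never let a stale or out-of-order report move the bar backwards.
def clamp_progress(shown: float, reported: float) -> float:
    return max(shown, reported)


shown = 42.0
shown = clamp_progress(shown, 37.5)  # stale report ignored; shown stays at 42.0
```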
Assertion was causing corrupt proxies to crash Resolve Proxy Encoder.
Not sure why I put it in there. Probably just for testing...
Dummy.
This really was to get this long-awaited feature out the door.
Issues with message buffering in Redis pub/sub were preventing
progress from showing accurately. It would slow right down when lots of
tasks were queued. Pub/sub probably needs some multi-threading support
to work properly. Since the buffering is taking place at a TCP level, we
can't just ask Redis to ignore 'guaranteed delivery' (which is not important
for us). This works for now, but I'm no expert on interacting with databases.

Otherwise, the pub/sub pattern worked really nicely for the
following reasons:

- easy channel pattern matching
- channel-based handlers
- messages are only received once, as opposed to
having to ignore the same keys with get/set
- keys aren't stored in Redis once all subscribers have received them
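For reference, a hedged sketch of the queuer-side subscription that pattern matching and per-channel handlers make possible in redis-py (the channel names and payload shape carry over from the earlier worker-side sketch and are assumptions):

```python
# Sketch: subscribe to all per-task progress channels and handle each message once.
import json

import redis


def handle_progress(message: dict) -> None:
    data = json.loads(message["data"])
    print(f"{data['output_file']}: {data['percent']:.1f}%")


r = redis.Redis.from_url("redis://localhost:6379/0")
pubsub = r.pubsub(ignore_subscribe_messages=True)
pubsub.psubscribe(**{"task-progress:*": handle_progress})

# Poll on a background thread so the progress bars can keep redrawing.
listener = pubsub.run_in_thread(sleep_time=0.01)
# ... later, once results are ready: listener.stop()
```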
Worker names are difficult to get without deep-diving Celery signals.
Instead of getting worker names through progress reports from
the worker task, we can just catch task events and get worker names
from there. If the name is unique, we count it.

Done.
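A sketch of the task-event approach using Celery's documented monitoring API (workers need task events enabled, e.g. `celery worker -E`; the counting logic here is an assumption about how the spinner gets its number):

```python
# Sketch: count unique worker hostnames from task events instead of from
# progress reports.
from celery import Celery

app = Celery("proxy_encoder", broker="redis://localhost:6379/0")
seen_workers = set()


def on_task_started(event: dict) -> None:
    # Each event carries the sending worker's hostname; only unique names count.
    seen_workers.add(event["hostname"])


with app.connection() as connection:
    receiver = app.events.Receiver(connection, handlers={"task-started": on_task_started})
    receiver.capture(limit=None, timeout=None, wakeup=True)
```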
Forgot to remove the argument passed to the ffmpeg run call from tasks.
It was throwing:

TypeError: run() got an unexpected keyword argument 'worker_name'
To avoid reading the same data multiple times, we need to make sure
we can compare new data to stale data. I'm not sure what the best way
to do it is, but we were storing all the progress dicts and comparing
each new dict with the list of old ones.

Surely performance would eventually degrade and storing whole dicts is
not wise. I've replaced storing dicts with storing quick SHA1 sums.
Calculating the sums doesn't seem to affect performance negatively.
Will keep an eye on it. Open to suggestions.
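The hashing swap described above presumably looks something like this (a sketch; the real payload keys may differ): serialise each progress dict deterministically, keep only the SHA1 digest, and skip anything already seen.

```python
# Sketch: deduplicate progress messages by SHA1 of their canonical JSON form.
import hashlib
import json

seen_hashes = set()


def is_new_progress(data: dict) -> bool:
    digest = hashlib.sha1(json.dumps(data, sort_keys=True).encode()).hexdigest()
    if digest in seen_hashes:
        return False
    seen_hashes.add(digest)
    return True
```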
Sometimes when all tasks finish and progress is still going, we can
get a ZeroDivisionError. This happens because the criterion for finishing the
loop is actually Celery's `results.ready`, not the progress bar
reaching its end.
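A small guard along these lines would cover the failure mode (a sketch, assuming average progress is computed from a dict of per-task percentages; the fallback value is an assumption):

```python
# Sketch: average per-task progress without tripping ZeroDivisionError once
# the set of reporting tasks empties out while the render loop is still ticking.
def average_progress(task_percents: dict) -> float:
    if not task_percents:
        return 100.0  # assumption: treat "no tasks left" as fully complete
    return sum(task_percents.values()) / len(task_percents)
```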
@in03
Owner

in03 commented Aug 8, 2022

I think we can consider this mergeable now. It serves its basic purpose, but there are still some niggly things to iron out. They're not easy fixes though, so we'll merge for now and come back to it later. Maybe someone more knowledgeable on progress bars can help.

Known issues

  • Average progress bar jumps on callable task completion
  • Average progress can sometimes go backwards, despite the negative progress check? Weird.

@in03 in03 marked this pull request as ready for review August 8, 2022 04:10
@in03 in03 merged commit 4928568 into main Aug 8, 2022
@in03 in03 deleted the feature/issue-145-Queuer-side-progress-indication branch August 8, 2022 04:11
Development

Successfully merging this pull request may close these issues.

Queuer-side progress indication
1 participant