
feat: Queuer-side progress indication #190


Merged
in03 merged 44 commits into main from feature/issue-145-Queuer-side-progress-indication
Aug 8, 2022

Conversation

github-actions[bot]
Contributor

@github-actions github-actions bot commented Jul 8, 2022

It would be really nice if we could have some indication of progress (ideally a progress bar) showing how far through the current jobs we are. Alive progress handles multiple nested progress bars. That could be really cool, although I'm not sure how feasible it is depending on how many jobs we have queued. Otherwise an overall progress bar would be fine. Some single-line logging on which files have finished would be awesome too.
closes #145

in03 added 5 commits July 8, 2022 12:58
- Removed some dead code
- Turned some very detached statements into utils functions
- Centralised logging statements
Update task progress from ffmpeg_process using task id and Redis
connection.
MVP atm. Set up pretty basically. Currently each task gets its own bar.
- Uses Redis pub/sub to communicate encoding progress updates from the worker (see the sketch after this list)
- Basic progress bar routing based on Celery task ID

Next steps:
- Get average progress from all reporting tasks instead of multiple bars
- Get task group active worker count as single-line spinner
- Secondary progress bar to display discrete total of tasks done
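A minimal sketch of the worker-side half of that pattern, assuming a redis-py client and a channel name of the form `task-progress:<task_id>` (both assumptions, not the project's actual identifiers): the encoding task publishes its percent-complete on a channel keyed by the Celery task ID, so the queuer can route each message to the matching bar.

```python
# Hypothetical helper, not the project's actual code: publish one progress
# update for an encoding task over Redis pub/sub, keyed by Celery task ID.
import json

import redis


def publish_progress(redis_url: str, task_id: str, percent: float, output_file: str) -> None:
    connection = redis.Redis.from_url(redis_url)
    payload = {"task_id": task_id, "percent": percent, "output_file": output_file}
    # One channel per task lets the queuer pattern-match "task-progress:*"
    # and route updates to the right progress bar.
    connection.publish(f"task-progress:{task_id}", json.dumps(payload))
```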
@in03 in03 changed the title Queuer-side progress indication feat: Queuer-side progress indication Jul 12, 2022
in03 and others added 19 commits July 12, 2022 16:03
Looks like it's working! Had to refactor a lot. And more refactoring
will be necessary. Looking like a bit of a dog's breakfast.

But now we have a secondary bar showing discrete total progress with
a 'last encoded' data field. Nice.
- Worker spinner displays host machine count, will alter to worker count
- Task progress bar displays output file name instead of task_id now
Rationale for changes:
Job expiry allows jobs that are never received to be cleared, so they
don't stick around until a worker picks them up two days later.
This should also deal with broken tasks that cannot be received for some
reason.
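A hedged sketch of how that expiry could be attached when queuing (the task name, arguments and window below are illustrative, not the project's): Celery's `apply_async` accepts an `expires` value, so a job nobody receives in time is revoked rather than run days later.

```python
# Illustrative only: queue a job that expires if no worker picks it up in time.
from celery import Celery

app = Celery("proxy_encoder", broker="redis://localhost:6379/0")


@app.task(name="encode_proxy")  # hypothetical task name
def encode_proxy(source_path: str) -> str:
    ...  # run ffmpeg here
    return source_path


# Revoke the job if it hasn't been received within 24 hours.
result = encode_proxy.apply_async(args=["/media/clip_001.mov"], expires=60 * 60 * 24)
```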

We're also beginning to make terminology changes that fully commit to Redis,
since we're relying on it for progress reporting and the webapp monitor
and can't use other brokers now. Hence 'job' instead of Celery's 'task'.
We want to abstract Celery's settings a bit since the end user doesn't
need the granular access anymore.
- Renamed 'celery' section to 'broker'

- Dropped the 'result_backend', 'host_address' and 'broker_url' keys
in favour of a simple 'url'

- Dropped 'flower_url' since it's being replaced by the webapp monitor
Also removed the 'mon' subcommand that relies on the removed 'flower_url'.
It's easily enough saved as a desktop shortcut, but will likely be
superseded by the webapp monitor anyway.
Moved 'worker_concurrency', 'worker_prefetch_multiplier' and
'worker_max_tasks_per_child' to the celery.py app.conf.update section (sketched below).

These settings force Celery workers to run tasks one at a time.
That means one task per process at a time. Higher concurrency on
a host requires more processes to be run. This is the workaround
for prefork not being supported on Windows.

As such the worker launcher is set up to work with this pattern only.
There's no point supporting other methods for other OSes, since
they're all compatible with this one.

At this point, changing the concurrency settings would break the
launcher anyway.
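A sketch of what that app.conf.update section might look like with the one-task-per-process settings described above (module layout assumed; the setting names are Celery's own):

```python
# celery.py (sketch): force each worker process to handle one task at a time.
from celery import Celery

app = Celery("proxy_encoder")

app.conf.update(
    worker_concurrency=1,          # one task per worker process at a time
    worker_prefetch_multiplier=1,  # don't reserve extra tasks ahead of time
    worker_max_tasks_per_child=1,  # recycle the process after each task
)
```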
app.config_from_object expects keys to match the Celery documentation, of
course. Key names are a little obtuse for end users not touching Celery.

Since dropping support for other brokers, the Redis result backend
and broker URL are the same. We just remap the terms in a new dict.
It's another thing to maintain, but hopefully won't change much.
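A minimal sketch of that remapping, assuming the user config exposes a single 'url' key under the renamed 'broker' section (the user-facing key names are assumptions; the Celery setting names are the documented ones):

```python
# Sketch: translate the user-facing 'broker' section into Celery's own key names.
from celery import Celery

app = Celery("proxy_encoder")

user_settings = {"broker": {"url": "redis://localhost:6379/0"}}

# With Redis as the only supported broker, broker and result backend share one URL.
celery_settings = {
    "broker_url": user_settings["broker"]["url"],
    "result_backend": user_settings["broker"]["url"],
}
app.conf.update(**celery_settings)
```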
The broker command acts as a gateway to Celery commands.
This is necessary since access to Celery is buried deep within the pipx
virtualenv if following the recommended install procedure.

Initial support only. It doesn't support any options yet, just the command.
We need to figure out how to pass options ('-f', '-Q', etc.) unparsed
by Typer (one possible approach is sketched below).

Also fixed the purge command. It previously only purged the default queue
('celery'). We now pass the version constraint key to purge jobs on our
version.
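One possible way to get those unparsed options through Typer, not what this commit implements: Click's context settings let a command accept unknown options and extra args, which can then be forwarded verbatim to the Celery CLI inside the pipx virtualenv (the command layout here is an assumption).

```python
# Sketch of a passthrough command: everything after the subcommand is handed
# to the Celery CLI untouched.
import subprocess

import typer

app = typer.Typer()


@app.command(context_settings={"allow_extra_args": True, "ignore_unknown_options": True})
def celery(ctx: typer.Context):
    """Forward unparsed options (e.g. -f, -Q) straight to the Celery CLI."""
    subprocess.run(["celery", *ctx.args])
```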
- Refactored TaskTracker as broker and ProgressTracker classes
- Started trying to fix average progress... Not there yet
- Added worker pickup/success/fail logging above progress
- Added logger debug messages above progress
- Removed unused imports
in03 and others added 18 commits July 27, 2022 13:01
- Added overall progress bar
- Check prevents progress going backwards, may need additional work
- Worker status updates are printed on the same line, missing all but "STARTED"
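The backwards-progress check presumably amounts to something like this clamp (a sketch, not the project's code): only advance the displayed value when a report exceeds what is already shown.

```python
# Sketch: never let a stale or out-of-order report move the bar backwards.
def clamp_progress(shown: float, reported: float) -> float:
    return max(shown, reported)


shown = 42.0
shown = clamp_progress(shown, 37.5)  # stale report ignored; shown stays at 42.0
```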
Assertion was causing corrupt proxies to crash Resolve Proxy Encoder.
Not sure why I put it in there. Probably just for testing...
Dummy.
This really was to get this long-awaited feature out the door.
Issues with message buffering in Redis pub/sub were preventing
progress from showing accurately. It would slow right down when lots of
tasks were queued. Pub/sub probably needs some multi-threading support
to work properly. Since the buffering is taking place at a TCP level, we
can't just ask Redis to ignore 'guaranteed delivery' (which is not important
for us). This works for now, but I'm no expert on interacting with databases.

Otherwise, the pub/sub pattern worked really nicely for the
following reasons:

- easy channel pattern matching
- channel-based handlers
- messages are only received once, as opposed to
having to ignore the same keys with get/set
- keys aren't stored in Redis once all subscribers have received them
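For reference, a hedged sketch of the queuer-side subscription that pattern matching and per-channel handlers make possible in redis-py (the channel names and payload shape carry over from the earlier worker-side sketch and are assumptions):

```python
# Sketch: subscribe to all per-task progress channels and handle each message once.
import json

import redis


def handle_progress(message: dict) -> None:
    data = json.loads(message["data"])
    print(f"{data['output_file']}: {data['percent']:.1f}%")


r = redis.Redis.from_url("redis://localhost:6379/0")
pubsub = r.pubsub(ignore_subscribe_messages=True)
pubsub.psubscribe(**{"task-progress:*": handle_progress})

# Poll on a background thread so the progress bars can keep redrawing.
listener = pubsub.run_in_thread(sleep_time=0.01)
# ... later, once results are ready: listener.stop()
```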
Worker names are difficult to get without deep-diving Celery signals.
Instead of getting worker names through progress reports from
the worker task, we can just catch task events and get worker names
from there. If the name is unique, we count it.

Done.
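A sketch of the task-event approach using Celery's documented monitoring API (workers need task events enabled, e.g. `celery worker -E`; the counting logic here is an assumption about how the spinner gets its number):

```python
# Sketch: count unique worker hostnames from task events instead of from
# progress reports.
from celery import Celery

app = Celery("proxy_encoder", broker="redis://localhost:6379/0")
seen_workers = set()


def on_task_started(event: dict) -> None:
    # Each event carries the sending worker's hostname; only unique names count.
    seen_workers.add(event["hostname"])


with app.connection() as connection:
    receiver = app.events.Receiver(connection, handlers={"task-started": on_task_started})
    receiver.capture(limit=None, timeout=None, wakeup=True)
```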
Forgot to remove the argument passed to the ffmpeg run call from tasks.
It was throwing:

TypeError: run() got an unexpected keyword argument 'worker_name'
To avoid reading the same data multiple times, we need to make sure
we can compare new data to stale data. I'm not sure what the best way
to do it is, but we were storing all the progress dicts and comparing
each new dict with the list of old ones.

Surely performance would eventually degrade and storing whole dicts is
not wise. I've replaced storing dicts with storing quick SHA1 sums.
Calculating the sums doesn't seem to affect performance negatively.
Will keep an eye on it. Open to suggestions.
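The hashing swap described above presumably looks something like this (a sketch; the real payload keys may differ): serialise each progress dict deterministically, keep only the SHA1 digest, and skip anything already seen.

```python
# Sketch: deduplicate progress messages by SHA1 of their canonical JSON form.
import hashlib
import json

seen_hashes = set()


def is_new_progress(data: dict) -> bool:
    digest = hashlib.sha1(json.dumps(data, sort_keys=True).encode()).hexdigest()
    if digest in seen_hashes:
        return False
    seen_hashes.add(digest)
    return True
```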
Sometimes when all tasks finish and progress is still going, we can
get a ZeroDivisionError. This happens because the criterion for finishing the
loop is actually Celery's `results.ready`, not the progress bar
reaching its end.
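A small guard along these lines would cover the failure mode (a sketch, assuming average progress is computed from a dict of per-task percentages; the fallback value is an assumption):

```python
# Sketch: average per-task progress without tripping ZeroDivisionError once
# the set of reporting tasks empties out while the render loop is still ticking.
def average_progress(task_percents: dict) -> float:
    if not task_percents:
        return 100.0  # assumption: treat "no tasks left" as fully complete
    return sum(task_percents.values()) / len(task_percents)
```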
@in03
Owner

in03 commented Aug 8, 2022

I think we can consider this mergeable now. It serves its basic purpose, but there are still some niggly things to iron out. They're not easy fixes though, so we'll merge for now and come back to it later. Maybe someone more knowledgeable on progress bars can help.

Known issues

  • Average progress bar jumps on callable task completion
  • Average progress can sometimes go backwards, despite the negative progress check? Weird.

@in03 in03 marked this pull request as ready for review August 8, 2022 04:10
@in03 in03 merged commit 4928568 into main Aug 8, 2022
@in03 in03 deleted the feature/issue-145-Queuer-side-progress-indication branch August 8, 2022 04:11
Development

Successfully merging this pull request may close these issues.

Queuer-side progress indication
1 participant