-
Notifications
You must be signed in to change notification settings - Fork 3.2k
Fix webhooks worker not restarting #8921
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughA fix has been implemented to improve the reliability of the webhook worker by ensuring it can automatically restart after losing the Redis connection. The change involves modifying the Supervisord configuration for the webhook worker to enable automatic restarts, which prevents potential service disruptions and enhances the worker's resilience. Changes
Poem
Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
31b6adb
to
f16e893
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
changelog.d/20250109_195116_roman_restart_webhooks.md (1)
3-3
: Enhance changelog entry with implementation details.The current entry is good but could be more specific about the solution implemented.
Consider this enhanced description:
-Fixed webhook worker not restarting after losing Redis connection +Fixed webhook worker not restarting after losing Redis connection by enabling automatic restart in supervisord configuration
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
changelog.d/20250109_195116_roman_restart_webhooks.md
(1 hunks)supervisord/worker.webhooks.conf
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: build
- GitHub Check: generate_github_pages
- GitHub Check: testing
- GitHub Check: Linter
- GitHub Check: Analyze (javascript)
🔇 Additional comments (1)
supervisord/worker.webhooks.conf (1)
26-26
: LGTM! Verify consistency with other workers.The
autorestart=true
configuration is correctly placed and will ensure the webhooks worker restarts automatically on exit.Let's verify if this configuration is consistent with other workers:
Additionally, consider adding monitoring to track restart frequency:
[program:rqworker-webhooks] command=%(ENV_HOME)s/wait_for_deps.sh python3 %(ENV_HOME)s/manage.py rqworker -v 3 webhooks --worker-class cvat.rqworker.DefaultWorker environment=VECTOR_EVENT_HANDLER="SynchronousLogstashHandler",CVAT_POSTGRES_APPLICATION_NAME="cvat:worker:webhooks" numprocs=%(ENV_NUMPROCS)s process_name=%(program_name)s-%(process_num)d autorestart=true +stdout_logfile=%(ENV_HOME)s/logs/%(program_name)s-%(process_num)d.log +stdout_logfile_maxbytes=50MB +stdout_logfile_backups=10✅ Verification successful
Configuration is consistent with other workers
The
autorestart=true
setting is already present in other worker configurations, making this change to the webhooks worker consistent with the existing pattern.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check autorestart configuration in other worker configs rg -A 1 "program:.*worker" supervisord/ | rg -A 1 "autorestart"Length of output: 63
Script:
#!/bin/bash # First, find all supervisord config files echo "=== Supervisord config files ===" fd -e conf -e ini . supervisord/ echo -e "\n=== Worker configurations ===" # Search for program sections containing 'worker' and show surrounding context fd -e conf -e ini . supervisord/ -x cat {} | grep -A 5 -B 1 "\[program:.*worker.*\]"Length of output: 3183
Script:
#!/bin/bash # Check complete content of a few worker configs echo "=== worker.annotation.conf ===" cat supervisord/worker.annotation.conf echo -e "\n=== worker.export.conf ===" cat supervisord/worker.export.conf echo -e "\n=== worker.import.conf ===" cat supervisord/worker.import.confLength of output: 3642
Motivation and context
rq exits with a 0 status if its Redis connection times out, which means that supervisord will not restart it by default. Set
autorestart=true
, like for all other workers.How has this been tested?
Checklist
develop
branch[ ] I have updated the documentation accordingly[ ] I have added tests to cover my changes[ ] I have linked related issues (see GitHub docs)[ ] I have increased versions of npm packages if it is necessary(cvat-canvas,
cvat-core,
cvat-data and
cvat-ui)
License
Feel free to contact the maintainers if that's a concern.
Summary by CodeRabbit