Skip to content

Fix Docker Healthcheck #29471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 11, 2025
Merged

Fix Docker Healthcheck #29471

merged 5 commits into from
Mar 11, 2025

Conversation

benbz
Copy link
Member

@benbz benbz commented Mar 11, 2025

Checklist

  • Tests written for new code (and old code if feasible).
  • New or updated public/exported symbols have accurate TSDoc documentation.
  • Linter and other CI checks pass.
  • I have licensed the changes to Element by completing the Contributor License Agreement (CLA)

Fixes up the healthcheck added in #29351.

The wget in container is provided by busybox and only provides the following options:

$ docker run -it --rm --entrypoint sh vectorim/element-web:v1.11.95              
/ $ wget
BusyBox v1.37.0 (2025-01-17 18:12:01 UTC) multi-call binary.

Usage: wget [-cqS] [--spider] [-O FILE] [-o LOGFILE] [--header STR]
	[--post-data STR | --post-file FILE] [-Y on/off]
	[-P DIR] [-U AGENT] [-T SEC] URL...

Retrieve files via HTTP or FTP

	--spider	Only check URL existence: $? is 0 if exists
	--header STR	Add STR (of form 'header: value') to headers
	--post-data STR	Send STR using POST method
	--post-file FILE	Send FILE using POST method
	-c		Continue retrieval of aborted transfer
	-q		Quiet
	-P DIR		Save to DIR (default .)
	-S    		Show server response
	-T SEC		Network read timeout is SEC seconds
	-O FILE		Save to FILE ('-' for stdout)
	-o LOGFILE	Log messages to FILE
	-U STR		Use STR for User-Agent header
	-Y on/off	Use proxy
/ $ 

The test didn't catch this as State.Running is true even if the healthcheck is failing, e.g:

...
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 747376,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2025-03-11T15:35:11.270785924Z",
            "FinishedAt": "0001-01-01T00:00:00Z",
            "Health": {
                "Status": "unhealthy",
                "FailingStreak": 35,
                "Log": [
                    {
                        "Start": "2025-03-11T15:50:18.41101394Z",
                        "End": "2025-03-11T15:50:18.465344516Z",
                        "ExitCode": 1,
                        "Output": "wget: unrecognized option: retry-connrefused\nBusyBox v1.37.0 (2025-01-17 18:12:01 UTC) multi-call binary.\n\nUsage: wget [-cqS] [--spider] [-O FILE] [-o LOGFILE] [--header STR]\n\t[--post-data STR | --post-file FILE] [-Y on/off]\n\t[-P DIR] [-U AGENT] [-T SEC] URL...\n\nRetrieve files via HTTP or FTP\n\n\t--spider\tOnly check URL existence: $? is 0 if exists\n\t--header STR\tAdd STR (of form 'header: value') to headers\n\t--post-data STR\tSend STR using POST method\n\t--post-file FILE\tSend FILE using POST method\n\t-c\t\tContinue retrieval of aborted transfer\n\t-q\t\tQuiet\n\t-P DIR\t\tSave to DIR (default .)\n\t-S    \t\tShow server response\n\t-T SEC\t\tNetwork read timeout is SEC seconds\n\t-O FILE\t\tSave to FILE ('-' for stdout)\n\t-o LOGFILE\tLog messages to FILE\n\t-U STR\t\tUse STR for User-Agent header\n\t-Y on/off\tUse proxy\n"
                    },
...

Fixes #29472

@benbz benbz requested review from a team as code owners March 11, 2025 17:00
@t3chguy t3chguy enabled auto-merge March 11, 2025 17:02
@t3chguy t3chguy added this pull request to the merge queue Mar 11, 2025
@t3chguy t3chguy removed this pull request from the merge queue due to a manual request Mar 11, 2025
@t3chguy t3chguy enabled auto-merge March 11, 2025 17:44
@t3chguy t3chguy added this pull request to the merge queue Mar 11, 2025
Merged via the queue into develop with commit 2052080 Mar 11, 2025
42 checks passed
@t3chguy t3chguy deleted the bbz/fix-docker-healthcheck branch March 11, 2025 18:17
spantaleev added a commit to spantaleev/matrix-docker-ansible-deploy that referenced this pull request Mar 11, 2025
@litetex
Copy link

litetex commented Mar 13, 2025

I would appreciate if this could be hotfixed in a new release because currently loadbalancers may refuse to route traffic to the container as the healthcheck is failing...

Totally didn't happen to me and I'm totally not going to disable the healthcheck for now :D

@Domoel
Copy link

Domoel commented Mar 16, 2025

A hotfix new release would be appreciated indeed! :)

@Aterfax
Copy link

Aterfax commented Mar 24, 2025

Temporary solution if using compose:

    healthcheck:
      test: ["CMD", "sh", "-c", "wget -q -O- http://localhost:80/config.json"]
      start_period: 5s
      interval: 30s
      retries: 3
      timeout: 5s

You may need to amend the port.

Surprised no hotfix is out? (yet?)

spantaleev added a commit to spantaleev/matrix-docker-ansible-deploy that referenced this pull request Mar 31, 2025
Now that element-hq/element-web#29471
is part of v1.11.96, we don't need to carry around this custom workaround.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Broken healthcheck in v1.11.95 container image
5 participants