Add support for Docker container health checks to the collector image #30798
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
cc @mwear, who's in the process of building the next version of this component.
Hey @mwear, any thoughts on the addition of a
It would be great to have this in the official image. For now, this works:
FROM public.ecr.aws/aws-observability/aws-otel-collector:latest as aws-otel
FROM otel/opentelemetry-collector-contrib as otel-collector
COPY --from=aws-otel /healthcheck /healthcheck
HEALTHCHECK --interval=5s --timeout=6s --retries=5 CMD ["/healthcheck"]
Remember to enable the healthcheck extension.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity.
This issue has been closed as inactive because it has been stale for 120 days with no activity.
Is there any update on this? I am looking for this feature as well and trying to avoid using AWS' distro/build my own.
Is this workaround still working? How does it look on the task definition side of things?
Built my own, works. Easier to manage and modify than copying AWS' solution.
healthchecker:
package main
import (
"context"
"fmt"
"io"
"net/http"
"os"
"os/signal"
"syscall"
"time"
)
func main() {
	var err error
	defer func() {
		if err != nil {
			fmt.Println(err.Error())
			os.Exit(1)
		}
	}()
	ctx, sigCxl := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer sigCxl()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://localhost:13133", nil)
	if err != nil {
		err = fmt.Errorf("failed to create request: %w", err)
		return
	}
	http.DefaultClient.Timeout = time.Second
	res, err := http.DefaultClient.Do(req)
	if err != nil {
		err = fmt.Errorf("failed to do request: %w", err)
		return
	}
	defer func() { _ = res.Body.Close() }()
	resBody, err := io.ReadAll(res.Body)
	if err != nil {
		err = fmt.Errorf("failed to read response body: %w", err)
		return
	}
	if res.StatusCode != http.StatusOK {
		err = fmt.Errorf("unexpected server status [%d]: %s", res.StatusCode, string(resBody))
		return
	}
	fmt.Println("HEALTHY")
}
Dockerfile:
FROM golang:1.24.1 AS builder
# Make sure CGO is disabled (0) to ensure static linking
ENV GOOS=linux \
CGO_ENABLED=0
WORKDIR /go/src
COPY . .
RUN go build -o healthchecker ./cmd/healthchecker
FROM otel/opentelemetry-collector-contrib:0.122.1
COPY ./path/to/your/otel-col-config.yml /etc/otelcol-contrib/config.yml
COPY --chmod=755 --from=builder /go/src/healthchecker /bin/healthchecker
# ...
CMD ["--config=/etc/otelcol-contrib/config.yml"]
Then it's just a matter of adding the health check binary to the task definition:
...
healthCheck: {
  command: ["CMD", "/bin/healthchecker"],
  interval: 10,
  timeout: 5,
  retries: 10,
  startPeriod: 10
}
...
Component(s)
extension/healthcheck
Is your feature request related to a problem? Please describe.
We are setting up an OTel Collector as an AWS ECS Fargate service.
The task definitions support configuration for a container health check. The results are surfaced when viewing the containers belonging to a task in an ECS cluster, and the ECS service uses them to manage the lifecycle of the container.
But because the collector is a FROM scratch image, we cannot use the typical configurations (e.g. curl -f http://localhost:13133/ || exit 1), as they rely on having a shell and a few CLIs like curl installed.
Describe the solution you'd like
I would love for a way to configure a Docker container health check that works on ECS and other hosting platforms that utilise this Docker capability.
The convention with Go-based images, from what I've seen, seems to be to include some sort of health check executable or an extra command in the main CLI.
I note that this is resolved in the ADOT distribution with aws-observability/aws-otel-collector#1285.
Describe alternatives you've considered
Additional context
Sorry to ping you Juraci, but I couldn't think of a better component to select when creating this.