bug: Vector stops sending logs after StatefulSet restart due to headless service #1938
Comments
hey @StianOvrevage
just issued release 0.8.15, you can override
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
Chart name and version
chart: victoria-logs-single
version: v0.8.13
Describe the bug
TL;DR: Vector gets stuck and cannot send logs to VictoriaLogs after a restart of the VL StatefulSet, because Vector connects directly to the VL Pod's IP, which changes after the restart, and Vector never reconnects.
We have just deployed VictoriaLogs and Vector using the helm chart and as many default values as possible.
In the cluster (GKE) we also have our own deployment of istio (v1.24).
As defined in the helm chart, we get a StatefulSet and a headless Service (request: add a `-headless` suffix to the Service name, since this stumped us for a few minutes). It does not appear to be possible to make the Service not headless (i.e. to have it obtain a ClusterIP and thus let K8s manage routing).

By default the helm chart produces config for Vector with the endpoint `statefulset-pod-name-0.victorialogs-namespace.svc.cluster.local`. This makes Vector resolve the IP of `statefulset-pod-name-0` directly. However, if `statefulset-pod-name-0` is ever restarted or rescheduled for any reason, its IP will change. This change is not picked up by Vector (or possibly by the istio-proxy sidecar), leaving it stuck and unable to send logs until all Vector pods are restarted.

The logs emitted by Vector look like this:
I'm not sure if this is 100% Vector's fault (its docs say it performs a complete reconnect every time), istio's fault (this may indicate it's not entirely unrelated: istio/istio#54539), or a combination.
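For reference, a rough sketch of what the problematic Vector sink configuration looks like in this setup; the sink type, resource names, namespace, and port are illustrative assumptions, not the chart's exact output:

```yaml
# Illustrative sketch only; sink type, names, namespace and port are assumptions.
# The key point is that the endpoint is the pod-level DNS name behind the
# headless Service, which resolves straight to the pod IP.
sinks:
  victorialogs:
    type: elasticsearch
    inputs: [parse_logs]
    # Resolves to the IP of the ...-0 pod; that IP changes when the pod is
    # rescheduled, and the stuck connection is never re-resolved.
    endpoints:
      - http://statefulset-pod-name-0.victorialogs-namespace.svc.cluster.local:9428/insert/elasticsearch/
    mode: bulk
```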
Proposed fix
Add an option to the helm chart to generate a Service that is not headless, or to create an additional non-headless Service alongside it; either would "fix" the problem.
I'm sure you have your reasons for using a headless Service with regard to HA, clustering, etc. But for the time being, with VictoriaLogs being a "single instance" service, I don't see any significant drawbacks to using a regular non-headless Service.
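To illustrate the workaround, here is a minimal sketch of an additional non-headless (ClusterIP) Service that could sit in front of the VictoriaLogs pod; the name, namespace, selector labels and port are assumptions rather than the chart's actual values:

```yaml
# Hypothetical workaround manifest; name, namespace, selector and port are
# assumptions for illustration. A ClusterIP Service gets a stable virtual IP,
# so kube-proxy keeps routing to the new pod IP after a restart.
apiVersion: v1
kind: Service
metadata:
  name: victorialogs-clusterip
  namespace: logging
spec:
  type: ClusterIP
  selector:
    app: victoria-logs-single   # must match the VictoriaLogs pod labels in your release
  ports:
    - name: http
      port: 9428
      targetPort: 9428
```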
Custom values
Relevant excerpts of our values.yaml. They contain workarounds for the problems above: the default Vector endpoint is overridden to point at a custom VictoriaLogs Service we've deployed.
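A hedged sketch of the shape such an override can take; the key paths (the Vector subchart's `customConfig`), the sink type, and the Service name are assumptions, not our literal values.yaml:

```yaml
# Sketch only; key paths, sink type and names are assumptions. Only the sink
# portion is shown. The idea: point Vector at a stable, non-headless Service
# instead of the pod-level DNS name generated by default.
vector:
  enabled: true
  customConfig:
    sinks:
      victorialogs:
        type: elasticsearch
        inputs: [parse_logs]
        endpoints:
          - http://victorialogs-clusterip.logging.svc.cluster.local:9428/insert/elasticsearch/
        mode: bulk
```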