Existing open issues along with etcd frequently asked questions have been checked and this is not a duplicate.
What happened?
We have a three-member etcd cluster in which pod-0 and pod-2 are running properly. pod-1 restarts continuously, and neither pod-0 nor pod-2 can connect to it.
pod-0 tried to connect to pod-1
bash-4.4$ openssl s_client -connect 192.168.xxx.xxx:2379
139685380193280:error:0200206F:system library:connect:Connection refused:crypto/bio/b_sock2.c:110:
139685380193280:error:2008A067:BIO routines:BIO_connect:connect error:crypto/bio/b_sock2.c:111:
connect:errno=111
bash-4.4$ exit
pod-2 tried to connect to pod-1
bash-4.4$ openssl s_client -connect 192.168.xxx.xxx:2379
140555984509952:error:0200206F:system library:connect:Connection refused:crypto/bio/b_sock2.c:110:
140555984509952:error:2008A067:BIO routines:BIO_connect:connect error:crypto/bio/b_sock2.c:111:
connect:errno=111
bash-4.4$ exit
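Both attempts end in errno 111 (connection refused), which means nothing is listening on that port. That fits an etcd process that exits during startup. For repeating the check without openssl, here is a minimal sketch. It assumes bash with `/dev/tcp` support and the coreutils `timeout` command are available in the pod, and that the peer port is the etcd default 2380; `port_open` and `check_member` are hypothetical helper names, not part of etcd.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
# Assumes bash was built with /dev/tcp support and `timeout` is installed.
port_open() {
  local host="$1" port="$2"
  timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Report the state of the client port (2379) and the peer port
# (2380 is the etcd default; adjust if the deployment overrides it).
check_member() {
  local host="$1" port
  for port in 2379 2380; do
    if port_open "$host" "$port"; then
      echo "$host:$port open"
    else
      echo "$host:$port closed"
    fi
  done
}
```

Running `check_member <pod-1 IP>` from pod-0 and pod-2 would show whether either port accepts connections between restarts.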
What did you expect to happen?
pod-1 should run without restarts, and its logs should contain no snapshot or WAL errors.
How can we recover pod-1, and is this a known issue?
We are trying to understand whether the connectivity failure with the other two pods is causing pod-1 to restart continuously, and what role the corrupted snapshot and WAL files play.
How can we reproduce it (as minimally and precisely as possible)?
We did not follow any particular steps. We observed this issue a few days after the deployment was installed.
Anything else we need to know?
No response
Etcd version (please run commands below)
bash-4.4$ etcd --version
etcd Version: 3.5.12
Git SHA: e7b3bb6
Go Version: go1.20.13
Go OS/Arch: linux/amd64
bash-4.4$ etcdctl version
etcdctl version: 3.5.12
API version: 3.5
Etcd configuration (command line flags or environment variables)
paste your configuration here
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
$ etcdctl member list -w table
# paste output here
$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here
Relevant log output
Errors observed in pod-1 logs
{"extra_data":{"caller":"snap/snapshotter.go:249","path":"00000000000001d0-0000000000052f23.snap.broken"},"message":"found unexpected non-snap file; skipping","metadata":{"container_name":"dced","namespace":"pg-eda2mqv01","pod_name":"etcd-1"},"service_id":"etcd","severity":"warning","timestamp":"2025-03-04T16:06:16.486+01:00","version":"1.2.0"}
{"extra_data":{"caller":"snap/snapshotter.go:249","path":"00000000000001d7-0000000000055635.snap.broken"},"message":"found unexpected non-snap file; skipping","metadata":{"container_name":"dced","namespace":"pg-eda2mqv01","pod_name":"etcd-1"},"service_id":"etcd","severity":"warning","timestamp":"2025-03-04T16:06:16.487+01:00","version":"1.2.0"}
{"extra_data":{"caller":"snap/snapshotter.go:249","path":"00000000000001d1-00000000000542ac.snap.broken"},"message":"found unexpected non-snap file; skipping","metadata":{"container_name":"dced","namespace":"pg-eda2mqv01","pod_name":"etcd-1"},"service_id":"etcd","severity":"warning","timestamp":"2025-03-04T16:06:16.487+01:00","version":"1.2.0"}
{"extra_data":{"caller":"etcdserver/server.go:532"},"message":"No snapshot found. Recovering WAL from scratch!","metadata":{"container_name":"dced","namespace":"pg-eda2mqv01","pod_name":"etcd-1"},"service_id":"etcd","severity":"info","timestamp":"2025-03-04T16:06:16.487+01:00","version":"1.2.0"}
{"extra_data":{"caller":"etcdserver/storage.go:96","error":"wal: file not found","stacktrace":"go.etcd.io/etcd/server/v3/etcdserver.readWAL\n\tgo.etcd.io/etcd/server/v3/etcdserver/storage.go:96\ngo.etcd.io/etcd/server/v3/etcdserver.restartNode\n\tgo.etcd.io/etcd/server/v3/etcdserver/raft.go:528\ngo.etcd.io/etcd/server/v3/etcdserver.NewServer\n\tgo.etcd.io/etcd/server/v3/etcdserver/server.go:536\ngo.etcd.io/etcd/server/v3/embed.StartEtcd\n\tgo.etcd.io/etcd/server/v3/embed/etcd.go:245\ngo.etcd.io/etcd/server/v3/etcdmain.startEtcd\n\tgo.etcd.io/etcd/server/v3/etcdmain/etcd.go:228\ngo.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\tgo.etcd.io/etcd/server/v3/etcdmain/etcd.go:123\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\tgo.etcd.io/etcd/server/v3/etcdmain/main.go:40\nmain.main\n\tgo.etcd.io/etcd/server/v3/main.go:31\nruntime.main\n\truntime/proc.go:250"},"message":"failed to open WAL","metadata":{"container_name":"dced","namespace":"pg-eda2mqv01","pod_name":"etcd-1"},"service_id":"etcd","severity":"critical","timestamp":"2025-03-04T16:06:16.487+01:00","version":"1.2.0"}
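The log lines above point at the contents of pod-1's data directory: every snapshot file seen carries a `.snap.broken` suffix, and the WAL cannot be opened. A minimal sketch for summarizing what is on disk follows. It assumes the standard etcd layout of `<data-dir>/member/snap` and `<data-dir>/member/wal`; `summarize_member_dir` is a hypothetical helper name, and the data directory path depends on the deployment.

```shell
#!/usr/bin/env bash
# Summarize snapshot and WAL files under an etcd data directory.
# Assumes the standard layout <data-dir>/member/snap and <data-dir>/member/wal.
summarize_member_dir() {
  local data_dir="$1"
  local snap_dir="$data_dir/member/snap"
  local wal_dir="$data_dir/member/wal"
  local snaps broken wals

  # Count regular files by suffix; a missing directory counts as zero.
  snaps=$(find "$snap_dir" -maxdepth 1 -type f -name '*.snap' 2>/dev/null | wc -l)
  broken=$(find "$snap_dir" -maxdepth 1 -type f -name '*.snap.broken' 2>/dev/null | wc -l)
  wals=$(find "$wal_dir" -maxdepth 1 -type f -name '*.wal' 2>/dev/null | wc -l)

  # $((...)) strips any padding that wc adds on some platforms.
  echo "snap=$((snaps)) broken=$((broken)) wal=$((wals))"
}
```

Calling `summarize_member_dir <data-dir>` inside pod-1 prints one line of counts. A result with `snap=0` and a nonzero `broken` count would be consistent with the "No snapshot found" message above.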