Fleet Server preventing Elasticsearch node shutdown due to persistent HTTP connections

**Version:**  
8.18.0

**Operating System:**  
Ubuntu 22.04.5

**Discuss Forum URL:**  
https://discuss.elastic.co/t/376934

---

**Summary of the Issue:**  
Fleet Server maintains persistent HTTP connections to its configured Elasticsearch output nodes, which prevents those nodes from shutting down cleanly. When an Elasticsearch node is stopped, it hangs for 2–3 minutes, waiting for Fleet Server to release its connections. Restarting the Fleet Server (i.e., its `elastic-agent` process) allows the node to shut down immediately, confirming Fleet Server as the blocking party.

This behaviour can also be confirmed by monitoring active connections to port 9200 on the coordinating node during shutdown. Using `netstat` or `ss` shows that Fleet Server maintains open connections even as the Elasticsearch node attempts to stop.
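
As a rough aid for that check, the sketch below counts established connections to port 9200 per peer host from the text output of `ss -tn`. It assumes the usual column order (`State Recv-Q Send-Q Local:Port Peer:Port`); the function names `count_peers` and `snapshot` are made up for illustration and are not part of any Elastic tooling.

```python
import subprocess
from collections import Counter


def count_peers(ss_output: str, port: int = 9200) -> Counter:
    """Count established TCP connections to a local port, grouped by peer host."""
    peers = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Assumed columns: State Recv-Q Send-Q Local:Port Peer:Port
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue  # skips the header and non-established sockets
        local, peer = fields[3], fields[4]
        if local.rsplit(":", 1)[-1] != str(port):
            continue  # connection is not to the port of interest
        peers[peer.rsplit(":", 1)[0]] += 1
    return peers


def snapshot(port: int = 9200) -> Counter:
    """Run `ss -tn` on this host and summarise connections to `port`."""
    result = subprocess.run(
        ["ss", "-tn"], capture_output=True, text=True, check=True
    )
    return count_peers(result.stdout, port)
```

Calling `snapshot()` on the Elasticsearch node before and during shutdown should show whether the Fleet Server host still holds connections while the node is stopping.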

This breaks high-availability expectations: since even a single running Fleet Server will prevent node shutdown, all Fleet Servers must be stopped before any of their Elasticsearch output nodes can shut down cleanly.

Note that other stack components (e.g., Logstash) cleanly disconnect on shutdown, suggesting Fleet Server does not correctly respond to connection termination signals from Elasticsearch.

---

**Steps to Reproduce:**

1. Deploy Fleet Server (a single instance is sufficient).
2. Ensure it is connected to Elasticsearch via HTTPS.
3. Attempt to shut down any of its Elasticsearch output nodes.
4. Observe the node hang in the `stopping` state for 2–3 minutes.
5. During shutdown, run `ss` or `netstat` and observe persistent connections to port 9200 from the Fleet Server.
6. Restart the Fleet Server while the ES node is still stopping.
7. Observe the node shut down immediately after Fleet Server exits.
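
To put a number on steps 4 and 7, a small timing helper along these lines could be used. It is only a sketch: `time_until` and `pid_is_gone` are hypothetical names, the PID of the Elasticsearch process has to be looked up separately, and the signal-0 probe assumes a Unix host.

```python
import os
import time
from typing import Callable, Optional


def pid_is_gone(pid: int) -> bool:
    """Return True once no process with this PID exists (Unix only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending a signal
    except ProcessLookupError:
        return True
    except PermissionError:
        return False  # process exists but belongs to another user
    return False


def time_until(
    predicate: Callable[[], bool],
    poll_interval: float = 0.5,
    timeout: float = 300.0,
) -> Optional[float]:
    """Poll `predicate` until it is true; return elapsed seconds or None on timeout."""
    start = time.monotonic()
    while True:
        if predicate():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(poll_interval)
```

For example, starting `time_until(lambda: pid_is_gone(es_pid))` right after issuing the stop command (with `es_pid` being the Elasticsearch process ID) would report how long the node stayed in `stopping`, with and without a Fleet Server restart.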

---

**Expected Behavior:**

- Fleet Server should detect node shutdown and release its connections immediately, allowing Elasticsearch to stop cleanly.

---

**Actual Behavior:**

- The Elasticsearch node remains stuck in the `stopping` state until either Fleet Server is stopped, or 2–3 minutes have elapsed.
- This undermines redundancy in HA setups by requiring all Fleet Servers to be stopped for Elasticsearch node maintenance.

---

**Additional Information:**

- Reproducible in both production and development environments.
- The issue did not occur in some earlier versions, but the exact version in which it first appeared is unknown.
- No relevant info appears in Elasticsearch logs during shutdown attempts.


Fleet Server preventing Elasticsearch node shutdown due to persistent HTTP connections #4905
