[BUG] Stats transport actions based on TransportNodeActions sends large payload of Discovery Nodes to all nodes #14713

Closed
@Pranshu-S

Describe the bug

In the current implementation, every transport action extending TransportNodesAction includes all discovery nodes in the transport request sent to each node in the cluster. This approach leads to performance bottlenecks in large clusters due to redundant data transmission. Specifically:

  1. Increased Network Traffic: The same list of n discovery nodes is serialized into each of the n per-node requests, so about n^2 discovery node entries are written per action (where n is the number of nodes), causing unnecessary network traffic and increased IO.
  2. Write/Read Latency: The excessive data transmission contributes to higher overall latency for both write and read operations.
  3. NIO Buffer Bottleneck: When using plugins like Netty for inter-node communication, the buffer fills with redundant discovery node information. This increases the size of each request and correspondingly reduces the number of requests that can fit in the Netty buffer.

Related component

Other

To Reproduce

If node IDs are passed in a TransportNodesAction request, we resolve them into DiscoveryNodes. The individual requests that go to each node clone this request, so each of them ends up writing the full discoveryNodes object.

Essentially, for a 200-node cluster we write 200 discoveryNode objects for each per-node request, which means we write about 200x200 = 40,000 of them over the entire send path of one action. This grows quadratically with the number of nodes.
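The arithmetic can be checked with a minimal sketch. The class and method names below are hypothetical and only model the count of discovery node entries written per action; they are not OpenSearch code. Doubling the node count quadruples the number of entries, which is quadratic growth.

```java
public final class FanOutCost {
    // Entries written when each of the n per-node requests carries all n resolved nodes.
    // Hypothetical model of the behavior described in this issue.
    static long entriesWithFullList(int nodeCount) {
        return (long) nodeCount * nodeCount;
    }

    public static void main(String[] args) {
        for (int n : new int[] {50, 100, 200, 400}) {
            System.out.println(n + " nodes: " + entriesWithFullList(n)
                + " discovery node entries written per action");
        }
    }
}
```

For 200 nodes this gives 40,000 entries per action, and 160,000 for 400 nodes.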

Expected behavior

The request path should send only the information that is required on the receive path.
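As a rough illustration of the difference, the sketch below serializes a per-node request with and without the resolved node list, using only `java.io` classes. `PerNodeRequestSketch`, `Node`, `serialize`, and `cluster` are hypothetical names invented for this example, and the wire format is an assumption; it does not reflect the actual OpenSearch `StreamOutput` format or the eventual fix.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public final class PerNodeRequestSketch {
    // Hypothetical stand-in for a resolved discovery node; not the OpenSearch class.
    static final class Node {
        final String id;
        final String address;
        Node(String id, String address) { this.id = id; this.address = address; }
    }

    // Serializes one per-node request. includeNodes=true mimics the reported behavior,
    // includeNodes=false mimics sending only what the receiver needs.
    static byte[] serialize(String action, List<Node> resolved, boolean includeNodes) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(action);
            out.writeInt(includeNodes ? resolved.size() : 0);
            if (includeNodes) {
                for (Node node : resolved) {
                    out.writeUTF(node.id);
                    out.writeUTF(node.address);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Builds a fake cluster of n nodes for the comparison.
    static List<Node> cluster(int n) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            nodes.add(new Node("node-" + i, "10.0.0." + (i % 256) + ":9300"));
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<Node> nodes = cluster(200);
        int full = serialize("stats", nodes, true).length;
        int lean = serialize("stats", nodes, false).length;
        System.out.println("per-node request with node list: " + full + " bytes");
        System.out.println("per-node request without node list: " + lean + " bytes");
    }
}
```

In this model the size of the lean request does not depend on the cluster size, while the full request grows linearly with it, and that cost is then paid once per target node.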

Labels: Cluster Manager, bug, v2.16.0