Skip to content

[BUG] Negative Search Stats #16598

Open
Open
@dzane17

Description

@dzane17

Describe the bug

We have identified a bug where search statistics are being incorrectly reported as negative, which is causing failures when accessing the nodes/stats API. This issue is specifically related to the handling of negative values in the SearchStats class. The exception occurs when the API attempts to serialize the statistics, causing a java.lang.IllegalStateException due to the use of VLong encoding, which does not support negative values.

Example Response from nodes/stats API:

{
    "_nodes": {
        "total": 6,
        "successful": 5,
        "failed": 1,
        "failures": [
            {
                "type": "failed_node_exception",
                "reason": "Failed node [gxu1I2AFSo2Co-gL3Cq14A]",
                "node_id": "gxu1I2AFSo2Co-gL3Cq14A",
                "caused_by": {
                    "type": "illegal_state_exception",
                    "reason": "Negative longs unsupported, use writeLong or writeZLong for negative numbers [-3]"
                }
            }
        ]
    },
...

Exception in elasticsearch.log:

[2024-10-16T19:55:29,039][WARN ][o.o.t.OutboundHandler    ] [8da3ab95536131cd3252613bab350bd7] send message failed [channel: Netty4TcpChannel{localAddress=/172.16.29.155:9300, remoteAddress=/172.16.29.122:40652}]
java.lang.IllegalStateException: Negative longs unsupported, use writeLong or writeZLong for negative numbers [-140]
    at org.opensearch.common.io.stream.StreamOutput.writeVLong(StreamOutput.java:307)
    at org.opensearch.index.search.stats.SearchStats$PhaseStatsLongHolder.writeTo(SearchStats.java:88)
    at org.opensearch.index.search.stats.SearchStats$Stats.lambda$writeTo$0(SearchStats.java:393)
    at org.opensearch.common.io.stream.StreamOutput.writeMap(StreamOutput.java:637)
    at org.opensearch.index.search.stats.SearchStats$Stats.writeTo(SearchStats.java:390)
    at org.opensearch.index.search.stats.SearchStats.writeTo(SearchStats.java:588)
    at org.opensearch.common.io.stream.StreamOutput.writeOptionalWriteable(StreamOutput.java:959)
    at org.opensearch.action.admin.indices.stats.CommonStats.writeTo(CommonStats.java:273)
    at org.opensearch.indices.NodeIndicesStats.writeTo(NodeIndicesStats.java:207)
    at org.opensearch.action.admin.cluster.node.stats.NodeStats.writeTo(NodeStats.java:429)
    at org.opensearch.transport.OutboundMessage.writeMessage(OutboundMessage.java:104)
    at org.opensearch.transport.OutboundMessage.serialize(OutboundMessage.java:81)
    at org.opensearch.transport.OutboundHandler$MessageSerializer.get(OutboundHandler.java:234)
    at org.opensearch.transport.OutboundHandler$MessageSerializer.get(OutboundHandler.java:220)

Root Cause

The issue occurs due to the use of StreamOutput.writeVLong() in the SearchStats class, which does not support negative long values. The stack trace shows that when a negative value is encountered, it triggers the exception in elasticsearch.log shown above.

The problematic line in the code is:

out.writeVLong(current);

Source: SearchStats.java#L89

Related component

Search

To Reproduce

This bug does not occur consistently across all domains or search queries. We are still investigating the specific conditions or types of searches that trigger this issue.

Expected behavior

Search stats should not be negative.

Additional Details

Plugins
Please list all plugins currently enabled.

Screenshots
If applicable, add screenshots to help explain your problem.

Host/Environment (please complete the following information):

  • OS: [e.g. iOS]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

Metadata

Metadata

Assignees

Labels

SearchSearch query, autocomplete ...etcbugSomething isn't working

Type

No type

Projects

Status

🆕 New

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions