Description
Describe the bug
The search.max_buckets setting (ref) controls the maximum number of aggregation buckets allowed in a single search response.
For terms aggregations, the bucket count is calculated by counting sub-aggregation buckets first; if their parent bucket is later pruned from the candidate list, the sub-aggregation bucket count is subtracted again. As a result, the running count can temporarily exceed the number of buckets that actually end up in the response, so the limit is not checked against an accurate bucket count. See the reproduction below for an example.
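To make the counting order concrete, here is a minimal Python sketch of the behavior as described above. It is a simplified model, not OpenSearch's actual implementation: all names (`BucketCounter`, `reduce_terms`, `hits_limit`) are hypothetical, it assumes a single shard so that the number of candidates at each level equals `shard_size`, and it assumes the limit trips only when the running count strictly exceeds it.

```python
class TooManyBuckets(Exception):
    """Raised when the running bucket count exceeds the limit."""


class BucketCounter:
    """Hypothetical stand-in for the max_buckets accounting."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0
        self.peak = 0  # highest value the running count ever reached

    def consume(self, n):
        self.count += n
        self.peak = max(self.peak, self.count)
        if self.count > self.limit:
            raise TooManyBuckets(f"{self.count} > {self.limit}")


def reduce_terms(levels, counter):
    """levels: list of (candidates, size) pairs, outermost aggregation first.

    Returns the number of buckets that survive in this subtree.
    """
    if not levels:
        return 0
    candidates, size = levels[0]
    kept = []  # inner bucket counts of the parent buckets kept so far
    for _ in range(candidates):
        # Sub-aggregation buckets are counted first...
        inner = reduce_terms(levels[1:], counter)
        kept.append(inner)
        if len(kept) > size:
            # ...and only subtracted once a parent bucket is pruned.
            counter.consume(-kept.pop())
        else:
            counter.consume(1)
    return len(kept) + sum(kept)


def hits_limit(levels, limit):
    try:
        reduce_terms(levels, BucketCounter(limit))
    except TooManyBuckets:
        return True
    return False


# (shard_size, size) for the station level, then the date level
print(hits_limit([(1, 1), (1, 1)], limit=2))  # False
print(hits_limit([(1, 1), (2, 1)], limit=2))  # False
print(hits_limit([(2, 1), (1, 1)], limit=2))  # True
```

In this model the third case ends with 2 buckets but peaks at 3, because the second station candidate's date bucket is counted before that station is pruned.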
More broadly speaking, I'm not sure whether the search.max_buckets setting is actually useful. I think the setting can have 2 uses:
- Limit the response size of a given search request -- This isn't working correctly, as shown by this issue.
- Stop bad aggregations from taking up too many resources -- Most aggregation types do not enforce the max_buckets setting at the shard level; it's only evaluated during reduce on the coordinator, which happens after many of the resource-intensive portions of the search request have already completed.
Somewhat related:
Related component
Search:Resiliency
To Reproduce
Expected behavior
The following was done with the noaa opensearch-benchmarks workload, but it's not specific to that data.
Set cluster setting:
{
  "persistent": {
    "search.max_buckets": 2
  }
}
This search request does not hit the max buckets limit:
{
  "size": 0,
  "aggs": {
    "station": {
      "terms": {
        "field": "station.id",
        "size": 1,
        "shard_size": 1
      },
      "aggs": {
        "date": {
          "terms": {
            "field": "date",
            "size": 1,
            "shard_size": 1
          }
        }
      }
    }
  }
}
Neither does this one:
{
  "size": 0,
  "aggs": {
    "station": {
      "terms": {
        "field": "station.id",
        "size": 1,
        "shard_size": 1
      },
      "aggs": {
        "date": {
          "terms": {
            "field": "date",
            "size": 1,
            "shard_size": 2
          }
        }
      }
    }
  }
}
However, this one does:
{
  "size": 0,
  "aggs": {
    "station": {
      "terms": {
        "field": "station.id",
        "size": 1,
        "shard_size": 2
      },
      "aggs": {
        "date": {
          "terms": {
            "field": "date",
            "size": 1,
            "shard_size": 1
          }
        }
      }
    }
  }
}
In all 3 of these cases, the response on the coordinator contains only 2 buckets (1 station bucket with 1 date bucket inside it), so I would expect none of them to hit the limit.
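One way to check the bucket count of a response is to walk the `aggregations` object and count every entry in each `buckets` list. The sketch below assumes the usual terms-aggregation response shape; the helper name `count_buckets` is hypothetical, and the keys and doc counts in the sample response are made-up placeholders, not real output from the noaa workload.

```python
def count_buckets(node):
    """Recursively count the entries of every "buckets" list in a response."""
    if isinstance(node, list):
        return sum(count_buckets(item) for item in node)
    if not isinstance(node, dict):
        return 0
    total = 0
    for key, value in node.items():
        if key == "buckets" and isinstance(value, list):
            total += len(value)
        # Descend further to pick up sub-aggregation buckets.
        total += count_buckets(value)
    return total


# Illustrative response shape only; the values are placeholders.
sample_response = {
    "aggregations": {
        "station": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "station-a",
                    "doc_count": 10,
                    "date": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 9,
                        "buckets": [{"key": 1, "doc_count": 1}],
                    },
                }
            ],
        }
    }
}

print(count_buckets(sample_response["aggregations"]))  # 2
```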