Description
OpenSearch version: 2.15
Environment: AWS OpenSearch
Issue Description
I am executing hybrid queries with three sub-queries on a large dataset containing tens to hundreds of thousands of documents. The queries are weighted as follows: [0.9998, 0.0001, 0.0001], with the first query having the highest weight. However, I am seeing unexpected results where a document with a high score from the first query is missing from the top results in the final ranking, while documents with lower scores from the same query are included.
Example:
- Documents: A, B, C, D
- Query 1 Scores (when run independently):
  - Document A: 1200
  - Document B: 1000
  - Document C: 300
  - Document D: 100
However, in the hybrid query, Document B does not appear in the top results, but Document C does, despite the heavily skewed weighting toward the first query (0.9998).
Pipeline Configuration:
{
  "phase_results_processors": [
    {
      "normalization-processor": {
        "combination": {
          "parameters": {
            "weights": [
              0.9998,
              0.0001,
              0.0001
            ]
          },
          "technique": "arithmetic_mean"
        },
        "normalization": {
          "technique": "min_max"
        }
      }
    }
  ]
}
Observations:

Essentially, even if Document C returns the highest possible scores from queries 2 and 3, it cannot score higher than Document B. Given this, it seems impossible for Document B to not appear in the final results, and Document C should not rank higher.
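To make the reasoning above concrete, here is a minimal sketch of the arithmetic I have in mind. It is my own approximation, not the processor's actual code: it assumes min_max maps each score to (score - min) / (max - min), that arithmetic_mean is the weighted sum divided by the sum of the weights, and that normalization runs over exactly the four Query 1 scores from the example. The helper names `min_max` and `weighted_mean` are hypothetical.

```python
def min_max(scores):
    """Min-max normalize a dict of doc -> raw score into [0, 1] (assumed formula)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Degenerate case: all scores are equal, so treat them all as the maximum.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_mean(normalized, weights):
    """Weighted arithmetic mean of per-sub-query normalized scores (assumed formula)."""
    return sum(w * s for w, s in zip(weights, normalized)) / sum(weights)


weights = [0.9998, 0.0001, 0.0001]
query1_raw = {"A": 1200, "B": 1000, "C": 300, "D": 100}
q1 = min_max(query1_raw)

# Worst case for B: it scores 0 in queries 2 and 3.
b_lower = weighted_mean([q1["B"], 0.0, 0.0], weights)
# Best case for C: it scores the maximum (1.0) in queries 2 and 3.
c_upper = weighted_mean([q1["C"], 1.0, 1.0], weights)

print(f"B lower bound: {b_lower:.4f}")
print(f"C upper bound: {c_upper:.4f}")
```

Under these assumptions, B's worst case (about 0.818) is still well above C's best case (about 0.182), which is why the observed ranking surprises me.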
Question:
How is it possible for Document B to be excluded from the top results while Document C is included, given the heavily skewed weights and expected normalization?
Related component
Search:Relevance
Expected behavior
I would expect Document B to appear in the hybrid query search results no matter what, given the weight we've assigned to the first query.