**Describe the bug**

An error is returned when the `model_id` parameter is omitted from a `neural_sparse` query inside a `function_score` query, even though a `default_model_id` has been configured via the index's default search pipeline.
**Related component**

Search:Query Capabilities
**To Reproduce**

Create an index called `my_index`:
```json
PUT /my_index
{
  "settings": {
    "index": {"knn": true},
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "default": {
          "type": "standard"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id": {"type": "keyword"},
      "chunks": {
        "type": "nested",
        "properties": {
          "chunk_id": {"type": "keyword"},
          "chunked_content": {"type": "text"},
          "chunked_content_embedding": {"type": "rank_features"}
        }
      }
    }
  }
}
```
Update the cluster settings:

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "allow_registering_model_via_url": "true",
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99"
      }
    }
  }
}
```
Register and deploy the neural sparse model:

```json
POST /_plugins/_ml/model_groups/_register
{
  "name": "my_model_group",
  "description": "Models for search"
}
```

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "neural-sparse/opensearch-neural-sparse-encoding-v1",
  "version": "1.0.1",
  "model_group_id": <model_group_id>,
  "description": "This is a neural sparse encoding model: It transfers text into sparse vector, and then extract nonzero index and value to entry and weights. It serves in both ingestion and search.",
  "model_format": "TORCH_SCRIPT",
  "function_name": "SPARSE_ENCODING",
  "model_content_size_in_bytes": 492184214,
  "model_content_hash_value": "d1ebaa26615090bdb0195a62b180afd2a8524c68c5d406a11ad787267f515ea8",
  "url": "https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-v1-1.0.1-torch_script.zip",
  "created_time": 1696913667239
}
```
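(The register call returns a `task_id` rather than the model ID directly; if needed, the `model_id` used in the next step can be looked up from the ML Commons task API. `<task_id>` is a placeholder for the ID returned above.)

```json
GET /_plugins/_ml/tasks/<task_id>
```

The response includes a `model_id` field once the registration task completes.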
Create a search pipeline with a `neural_query_enricher` request processor:

```json
PUT /_search/pipeline/my_pipeline
{
  "request_processors": [
    {
      "neural_query_enricher": {
        "default_model_id": <model_id>
      }
    }
  ]
}
```
Update the index settings to use this as the default search pipeline:

```json
PUT /my_index/_settings
{
  "index.search.default_pipeline": "my_pipeline"
}
```
Search the index:

```json
POST /_search
{
  "query": {
    "function_score": {
      "query": {
        "neural_sparse": {
          "chunks.chunked_content_embedding": {
            "query_text": "contract" // NO MODEL ID!
          }
        }
      }
    }
  }
}
```
The following error is returned:

```json
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "query_text and model_id cannot be null"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "query_text and model_id cannot be null"
  },
  "status": 400
}
```
**Expected behavior**

No error is returned, and the `default_model_id` configured on the search pipeline is used to embed the query.
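In other words, the `neural_query_enricher` should effectively rewrite the request into the equivalent query with the model ID filled in (a sketch; `<model_id>` is the ID from the pipeline config). Passing `model_id` explicitly like this also appears to be a workaround, since the error is specifically about it being null:

```json
POST /_search
{
  "query": {
    "function_score": {
      "query": {
        "neural_sparse": {
          "chunks.chunked_content_embedding": {
            "query_text": "contract",
            "model_id": <model_id>
          }
        }
      }
    }
  }
}
```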
**Additional Details**

Plugins:
- ml model plugin
Host/Environment:
- macOS Sonoma 14.3
- Docker 4.28.0
- OpenSearch 2.16.0
**Additional context**

I originally opened this issue in neural-search because I thought it was a problem with the `neural_sparse` query type; however, the root cause appears to be in `function_score`.
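For reference, this is the same query without the `function_score` wrapper; here the enricher supplies the default model ID as expected, which is what points at `function_score` as the culprit:

```json
POST /_search
{
  "query": {
    "neural_sparse": {
      "chunks.chunked_content_embedding": {
        "query_text": "contract"
      }
    }
  }
}
```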