[BUG] Cannot make use of default_model_id in function_score query type #15403

@jdnvn

Description

Describe the bug

An error is returned when the model_id parameter is omitted from a neural_sparse query wrapped in a function_score query, even though a default_model_id has been configured for the index via a neural_query_enricher search pipeline.

Related component

Search:Query Capabilities

To Reproduce

Create an index called my_index

PUT /my_index

{
	"settings": {
		"index": { "knn": true },
		"number_of_shards": 1,
		"number_of_replicas": 1,
		"analysis": {
			"analyzer": {
				"default": {
					"type": "standard"
				}
			}
		}
	},
	"mappings": {
		"properties": {
			"id": { "type": "keyword" },
			"chunks": {
				"type": "nested",
				"properties": {
					"chunk_id": { "type": "keyword" },
					"chunked_content": { "type": "text" },
					"chunked_content_embedding": { "type": "rank_features" }
				}
			}
		}
	}
}

Update cluster settings

PUT /_cluster/settings
{
	"persistent": {
		"plugins": {
			"ml_commons": {
				"allow_registering_model_via_url": "true",
				"only_run_on_ml_node": "false",
				"model_access_control_enabled": "true",
				"native_memory_threshold": "99"
			}
		}
	}
}

Create the neural sparse model

POST /_plugins/_ml/model_groups/_register
{
	"name": "my_model_group",
	"description": "Models for search"
}

POST /_plugins/_ml/models/_register?deploy=true
{
	"name": "neural-sparse/opensearch-neural-sparse-encoding-v1",
	"version": "1.0.1",
	"model_group_id": <model_group_id>,
	"description": "This is a neural sparse encoding model: It transfers text into sparse vector, and then extract nonzero index and value to entry and weights. It serves in both ingestion and search.",
	"model_format": "TORCH_SCRIPT",
	"function_name": "SPARSE_ENCODING",
	"model_content_size_in_bytes": 492184214,
	"model_content_hash_value": "d1ebaa26615090bdb0195a62b180afd2a8524c68c5d406a11ad787267f515ea8",
	"url": "https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-v1-1.0.1-torch_script.zip",
	"created_time": 1696913667239
}
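The registration call above returns a task_id rather than the model id itself. Assuming the standard ML Commons task API, the model id can be read from the completed task (the <task_id> placeholder below comes from the registration response):

GET /_plugins/_ml/tasks/<task_id>

The model_id field in that response is the value to substitute for <model_id> in the pipeline definition below.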

Create the search pipeline with a neural query enricher

PUT /_search/pipeline/my_pipeline
{
	"request_processors": [
		{
			"neural_query_enricher": {
				"default_model_id": <model_id>
			}
		}
	]
}

Update the index settings with the default pipeline

PUT /my_index/_settings
{
  "index.search.default_pipeline" : "my_pipeline"
}
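As a sanity check that the default pipeline is not the problem, the pipeline can also be named explicitly per request with the search_pipeline query parameter, which should apply the same neural_query_enricher:

POST /my_index/_search?search_pipeline=my_pipeline

In my testing the behavior is the same either way; the error below occurs regardless of how the pipeline is attached.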

Search the index

POST /_search

{
	"query": {
		"function_score": {
			"query": {
				"neural_sparse": {
					"chunks.chunked_content_embedding": {
						"query_text": "contract"
					}
				}
			}
		}
	}
}

Note: model_id is intentionally omitted here, since the neural_query_enricher should supply the configured default_model_id.
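For comparison (and as a temporary workaround), the same query with an explicit model_id does not trip the null check; only the default-injection path fails:

POST /_search

{
	"query": {
		"function_score": {
			"query": {
				"neural_sparse": {
					"chunks.chunked_content_embedding": {
						"query_text": "contract",
						"model_id": <model_id>
					}
				}
			}
		}
	}
}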

Receive error:

{
	"error": {
		"root_cause": [
			{
				"type": "illegal_argument_exception",
				"reason": "query_text and model_id cannot be null"
			}
		],
		"type": "illegal_argument_exception",
		"reason": "query_text and model_id cannot be null"
	},
	"status": 400
}

Expected behavior

There is no error and the default_model_id configured on the search pipeline is used to embed the query.
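For reference, the same neural_sparse query without the function_score wrapper does pick up the default_model_id, which is what points at function_score (rather than neural_sparse) as the culprit:

POST /_search

{
	"query": {
		"neural_sparse": {
			"chunks.chunked_content_embedding": {
				"query_text": "contract"
			}
		}
	}
}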

Additional Details

Plugins
ML Commons (ml-commons) plugin

Host/Environment (please complete the following information):

  • macOS Sonoma 14.3
  • Docker 4.28.0
  • OpenSearch Version 2.16.0

Additional context
I opened this issue in neural-search because I initially thought it was a problem with the neural_sparse query type; however, the root cause appears to be in function_score.
