Add stats tracking for semantic field #1362

bzhangam · 2025-06-02T16:56:53Z

Description

Add stats tracking for the semantic field.

Below is an example stats response after creating, indexing into, and querying a semantic field that uses a sparse model.

{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"info": {
		"cluster_version": "3.1.0",
		"processors": {
			"search": {
				"hybrid": {
					"comb_geometric_processors": 0,
					"comb_rrf_processors": 0,
					"norm_l2_processors": 0,
					"norm_minmax_processors": 0,
					"comb_harmonic_processors": 0,
					"comb_arithmetic_processors": 0,
					"norm_zscore_processors": 0,
					"rank_based_normalization_processors": 0,
					"normalization_processors": 0
				}
			},
			"ingest": {
				"text_chunking_delimiter_processors": 0,
				"text_embedding_processors_in_pipelines": 0,
				"text_chunking_fixed_length_processors": 0,
				"text_embedding_skip_existing_processors": 0,
				"text_chunking_processors": 0
			}
		}
	},
	"all_nodes": {
		"query": {
			"hybrid": {
				"hybrid_query_with_pagination_requests": 0,
				"hybrid_query_with_filter_requests": 0,
				"hybrid_query_with_inner_hits_requests": 0,
				"hybrid_query_requests": 0
			},
			"neural": {
				"neural_query_against_semantic_sparse_requests": 1,
				"neural_query_requests": 1,
				"neural_query_against_semantic_dense_requests": 0,
				"neural_query_against_knn_requests": 0
			},
			"neural_sparse": {
				"neural_sparse_query_requests": 0
			}
		},
		"semantic_highlighting": {
			"semantic_highlighting_request_count": 0
		},
		"processors": {
			"search": {
				"hybrid": {
					"comb_harmonic_executions": 0,
					"norm_zscore_executions": 0,
					"comb_rrf_executions": 0,
					"norm_l2_executions": 0,
					"rank_based_normalization_processor_executions": 0,
					"comb_arithmetic_executions": 0,
					"normalization_processor_executions": 0,
					"comb_geometric_executions": 0,
					"norm_minmax_executions": 0
				}
			},
			"ingest": {
				"text_chunking_executions": 0,
				"text_embedding_executions": 0,
				"semantic_field_executions": 1,
				"semantic_field_chunking_executions": 1,
				"text_embedding_skip_existing_executions": 0,
				"text_chunking_fixed_length_executions": 0,
				"text_chunking_delimiter_executions": 0
			}
		}
	},
	"nodes": {
		"j6A0rlYBR7mK1R_k178qdg": {
			"query": {
				"hybrid": {
					"hybrid_query_with_pagination_requests": 0,
					"hybrid_query_with_filter_requests": 0,
					"hybrid_query_with_inner_hits_requests": 0,
					"hybrid_query_requests": 0
				},
				"neural": {
					"neural_query_against_semantic_sparse_requests": 1,
					"neural_query_requests": 1,
					"neural_query_against_semantic_dense_requests": 0,
					"neural_query_against_knn_requests": 0
				},
				"neural_sparse": {
					"neural_sparse_query_requests": 0
				}
			},
			"semantic_highlighting": {
				"semantic_highlighting_request_count": 0
			},
			"processors": {
				"search": {
					"hybrid": {
						"comb_harmonic_executions": 0,
						"norm_zscore_executions": 0,
						"comb_rrf_executions": 0,
						"norm_l2_executions": 0,
						"rank_based_normalization_processor_executions": 0,
						"comb_arithmetic_executions": 0,
						"normalization_processor_executions": 0,
						"comb_geometric_executions": 0,
						"norm_minmax_executions": 0
					}
				},
				"ingest": {
					"text_chunking_executions": 0,
					"text_embedding_executions": 0,
					"semantic_field_executions": 1,
					"semantic_field_chunking_executions": 1,
					"text_embedding_skip_existing_executions": 0,
					"text_chunking_fixed_length_executions": 0,
					"text_chunking_delimiter_executions": 0
				}
			}
		}
	}
}
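For readers comparing the `all_nodes` and `nodes` sections above, here is a minimal sketch of how per-node counters could roll up into cluster-wide totals. It assumes that `all_nodes` is the sum of the per-node values, which the single-node response is consistent with but does not prove. `NodeStatsAggregator` and `sumAcrossNodes` are hypothetical names for illustration, not part of the plugin; only the stat names are taken from the response.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not plugin code: sums flat per-node counters into totals.
// Assumption: "all_nodes" values are the sum of the matching per-node values.
final class NodeStatsAggregator {

    static Map<String, Long> sumAcrossNodes(List<Map<String, Long>> perNodeCounters) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Map<String, Long> node : perNodeCounters) {
            for (Map.Entry<String, Long> entry : node.entrySet()) {
                // merge() adds to the running total or starts it at the node's value
                totals.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Counters copied from the single node in the example response
        Map<String, Long> node = Map.of(
            "semantic_field_executions", 1L,
            "semantic_field_chunking_executions", 1L,
            "neural_query_against_semantic_sparse_requests", 1L
        );
        System.out.println(sumAcrossNodes(List.of(node)));
    }
}
```

With one node, as in the response above, the totals simply equal that node's counters.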

Related Issues

N/A

Check List

New functionality includes testing.
New functionality has been documented.
API changes companion pull request created.
Commits are signed per the DCO using --signoff.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Bo Zhang <[email protected]>

q-andy · 2025-06-02T21:29:16Z

Could you include an example of the new API response in the PR description?

q-andy · 2025-06-04T18:27:47Z

Could you update the neural query integ tests to verify the stats are only incremented once for each query type?

bzhangam · 2025-06-04T18:42:24Z

> Could you update the neural query integ tests to verify the stats are only incremented once for each query type?

Currently we cannot run integ tests for the semantic-field-related use cases. I think we can address this in a separate PR where we add the integ tests for semantic fields.

q-andy · 2025-06-04T19:01:40Z

> > Could you update the neural query integ tests to verify the stats are only incremented once for each query type?
>
> Currently we cannot run integ tests for the semantic-field-related use cases. I think we can address this in a separate PR where we add the integ tests for semantic fields.

For the neural query request counts against sparse/dense, don't those stats work without a semantic field?

bzhangam · 2025-06-04T20:25:05Z

> > > Could you update the neural query integ tests to verify the stats are only incremented once for each query type?
> >
> > Currently we cannot run integ tests for the semantic-field-related use cases. I think we can address this in a separate PR where we add the integ tests for semantic fields.
>
> For the neural query request counts against sparse/dense, don't those stats work without a semantic field?

OK, added.
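The check requested in this thread (each stat incremented exactly once per query) could be sketched as a before/after comparison of counter snapshots. This is a hedged illustration only: `QueryStatsDeltaCheck`, `delta`, and `incrementedExactlyOnce` are hypothetical names, the snapshots are plain maps rather than the plugin's stats objects, and the actual integ tests may be structured differently. The stat names are taken from the example response in the PR description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical test helper, not plugin code: compares two counter snapshots.
final class QueryStatsDeltaCheck {

    // Returns only the counters whose value changed between the two snapshots.
    static Map<String, Long> delta(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> changed = new HashMap<>();
        for (Map.Entry<String, Long> entry : after.entrySet()) {
            long diff = entry.getValue() - before.getOrDefault(entry.getKey(), 0L);
            if (diff != 0) {
                changed.put(entry.getKey(), diff);
            }
        }
        return changed;
    }

    // True when exactly the named stats changed and each one grew by exactly 1.
    static boolean incrementedExactlyOnce(Map<String, Long> before, Map<String, Long> after, String... statNames) {
        Map<String, Long> changed = delta(before, after);
        if (changed.size() != statNames.length) {
            return false;
        }
        for (String name : statNames) {
            if (changed.getOrDefault(name, 0L) != 1L) {
                return false;
            }
        }
        return true;
    }
}
```

For a single neural query against a sparse semantic field, one would expect `neural_query_requests` and `neural_query_against_semantic_sparse_requests` to be the only names passed in; a double increment (for example, from a query being rewritten more than once) would make the check fail.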

q-andy · 2025-06-04T20:50:02Z

src/test/java/org/opensearch/neuralsearch/query/NeuralQueryIT.java

    public void testQueryWithBoostAndImageQueryAndRadialQuery() {
+        // Enable stats for the test
+        updateClusterSettings(NEURAL_STATS_ENABLED.getKey(), true);


Could we have this as a separate testQueryWithBoostAndImageQueryAndRadialQuery_statsEnabled() test, following the pattern of the other stats ITs? That way we can validate both the stats-disabled case and the stats-enabled happy case. Same for the neural sparse test.

bzhangam · 2025-06-04

I don't think it's necessary to test the stats-disabled case everywhere. One test covering it should be enough, since the disable logic is not related to this PR.

will-hwang · 2025-06-06T22:13:26Z

src/main/java/org/opensearch/neuralsearch/processor/semantic/SemanticFieldProcessor.java

@@ -173,7 +176,10 @@ private void process(
    ) {
        setModelInfo(ingestDocument, semanticFieldInfoList);

-        chunk(ingestDocument, semanticFieldInfoList);
+        boolean shouldRecordChunking = chunk(ingestDocument, semanticFieldInfoList);


[nit] The chunk method simply returns whether chunking is enabled or disabled. In my opinion, the variable name should reflect only that.

    boolean chunked = chunk(ingestDocument, semanticFieldInfoList);
    if (chunked) {
        EventStatsManager.increment(EventStatName.SEMANTIC_FIELD_PROCESSOR_CHUNKING_EXECUTIONS);
    }
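To make the pattern in this thread concrete, here is a self-contained sketch of a `chunk` step that reports whether chunking ran, with the caller recording the chunking stat only in that case. Everything here is a simplified stand-in: `ChunkingStatSketch`, its two `LongAdder` counters, and the `process(boolean)` signature are hypothetical and replace `EventStatsManager`, `EventStatName`, and the real processor arguments. Where the real processor increments its per-execution counter is also an assumption.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for the processor and its stats, for illustration only.
final class ChunkingStatSketch {

    // Stand-ins for semantic_field_executions and semantic_field_chunking_executions
    static final LongAdder FIELD_EXECUTIONS = new LongAdder();
    static final LongAdder CHUNKING_EXECUTIONS = new LongAdder();

    // Reports whether chunking was performed; here that is just the enabled flag.
    static boolean chunk(boolean chunkingEnabled) {
        if (!chunkingEnabled) {
            return false;
        }
        // Real chunking of the document text would happen here.
        return true;
    }

    static void process(boolean chunkingEnabled) {
        // Assumption: every processor execution is counted once
        FIELD_EXECUTIONS.increment();
        boolean chunked = chunk(chunkingEnabled);
        if (chunked) {
            // Only count chunking when it actually ran
            CHUNKING_EXECUTIONS.increment();
        }
    }

    public static void main(String[] args) {
        process(true);
        process(false);
        System.out.println("field executions: " + FIELD_EXECUTIONS.sum());
        System.out.println("chunking executions: " + CHUNKING_EXECUTIONS.sum());
    }
}
```

Naming the local variable `chunked` keeps the call site honest about what the return value means: it describes what happened, not what should be recorded.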

Signed-off-by: Bo Zhang <[email protected]>

codecov · 2025-06-09T16:48:39Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (a6669e4) to head (301d6b5).
Report is 1 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main   #1362       +/-   ##
============================================
- Coverage     82.47%       0   -82.48%     
============================================
  Files           149       0      -149     
  Lines          7531       0     -7531     
  Branches       1211       0     -1211     
============================================
- Hits           6211       0     -6211     
+ Misses          859       0      -859     
+ Partials        461       0      -461

bzhangam added 6 commits May 13, 2025 15:37

Implement the query logic for the semantic field.

5deb96a

Signed-off-by: Bo Zhang <[email protected]>

Merge branch 'opensearch-project:main' into main

83d08c0

Merge branch 'opensearch-project:main' into main

7ad9d11

Enhance semantic field to allow to enable/disable chunking.

66eea0c

Signed-off-by: Bo Zhang <[email protected]>

Merge branch 'opensearch-project:main' into main

13d3c76

Merge branch 'opensearch-project:main' into main

5eade10

bzhangam requested review from heemin32, navneet1v, VijayanB, vamshin, jmazanec15, naveentatikonda, junqiu-lei, martin-gaievski, sean-zheng-amazon, model-collapse, zane-neo, vibrantvarun, zhichao-aws, yuye-aws and minalsha as code owners June 2, 2025 16:56

bzhangam force-pushed the main branch 2 times, most recently from fa40c43 to 777a5b9 Compare June 2, 2025 16:59

junqiu-lei reviewed Jun 2, 2025

src/main/java/org/opensearch/neuralsearch/processor/semantic/SemanticFieldProcessor.java (resolved)

junqiu-lei reviewed Jun 2, 2025

src/main/java/org/opensearch/neuralsearch/processor/semantic/SemanticFieldProcessor.java (outdated, resolved)

q-andy reviewed Jun 2, 2025

src/main/java/org/opensearch/neuralsearch/stats/events/EventStatName.java (outdated, resolved)

bzhangam force-pushed the main branch 2 times, most recently from f491352 to 5eade10 Compare June 4, 2025 17:41

Merge branch 'opensearch-project:main' into main

981411f

bzhangam force-pushed the main branch from e462617 to 14744c2 Compare June 4, 2025 17:45

heemin32 reviewed Jun 4, 2025

src/main/java/org/opensearch/neuralsearch/processor/semantic/SemanticFieldProcessor.java (outdated, resolved)

src/main/java/org/opensearch/neuralsearch/query/NeuralQueryBuilder.java (outdated, resolved)

q-andy reviewed Jun 4, 2025

src/main/java/org/opensearch/neuralsearch/stats/events/EventStatName.java (outdated, resolved)

bzhangam force-pushed the main branch from 14744c2 to 45b6130 Compare June 4, 2025 18:28

heemin32 reviewed Jun 4, 2025

src/main/java/org/opensearch/neuralsearch/processor/semantic/SemanticFieldProcessor.java (outdated, resolved)

bzhangam force-pushed the main branch from 45b6130 to ed4f9df Compare June 4, 2025 18:37

heemin32 reviewed Jun 4, 2025

src/main/java/org/opensearch/neuralsearch/stats/events/EventStatName.java (outdated, resolved)

bzhangam force-pushed the main branch 3 times, most recently from 12579dd to abbb5e2 Compare June 4, 2025 20:24

q-andy reviewed Jun 4, 2025

will-hwang reviewed Jun 6, 2025

bzhangam force-pushed the main branch from 3efc194 to 981411f Compare June 9, 2025 14:47

bzhangam added 2 commits June 9, 2025 07:48

Merge branch 'opensearch-project:main' into main

8068294

Add stats tracking for semantic field

301d6b5

Signed-off-by: Bo Zhang <[email protected]>

bzhangam force-pushed the main branch from b62d654 to 301d6b5 Compare June 9, 2025 15:19

junqiu-lei approved these changes Jun 9, 2025

heemin32 approved these changes Jun 9, 2025

heemin32 merged commit fe29ec5 into opensearch-project:main Jun 9, 2025
51 of 53 checks passed

This was referenced Jun 10, 2025

[FEATURE] Update neural-search stats API spec with new stats added in 3.1 opensearch-project/opensearch-api-specification#890

Open

[DOC] Update neural-search stats API docs with new stats added in 3.1 opensearch-project/documentation-website#9943

Closed
