Commit 2b7dbe0

Merge branch 'main' into rw-separation

2 parents fa0e620 + 43f3450

100 files changed: +4104 -277 lines changed

.github/vale/styles/Vocab/OpenSearch/Products/accept.txt (+3)

@@ -5,10 +5,12 @@ Amazon
 Amazon OpenSearch Serverless
 Amazon OpenSearch Service
 Amazon Bedrock
+Amazon Kinesis
 Amazon SageMaker
 AWS Secrets Manager
 Ansible
 Anthropic Claude
+Apache Kafka
 Auditbeat
 AWS Cloud
 Cohere Command
@@ -50,6 +52,7 @@ JSON Web Token
 Keycloak
 Kerberos
 Kibana
+Kinesis
 Kubernetes
 Lambda
 Langflow

.github/vale/styles/Vocab/OpenSearch/Words/accept.txt (+2 -1)

@@ -96,6 +96,7 @@ p\d{2}
 [Pp]repper
 [Pp]reprocess
 [Pp]retrain
+[Pp]rotobufs?
 [Pp]seudocode
 [Qq]uantiles?
 [Qq]uantiz(e|ation|ing|er)
@@ -154,4 +155,4 @@ tebibyte
 [Uu]pvote(s|d)?
 [Ww]alkthrough
 [Ww]ebpage
-xy
+xy

CONTRIBUTING.md (+1 -1)

@@ -102,7 +102,7 @@ Follow these steps to set up your local copy of the repository:

 ##### Building by using containerization

-Assuming you have `docker-compose` installed, run the following command:
+Assuming you have Docker installed, run the following command:

 ```
 docker compose -f docker-compose.dev.yml up

TERMS.md (-4)

@@ -704,10 +704,6 @@ A piece of an index that consumes CPU and memory. Operates as a full Lucene inde

 Don't use. Both *simple* and *simply* are not neutral in tone and might sound condescending to some users. If you mean *only*, use *only* instead.

-**since**
-
-Use only to describe time events. Don't use in place of *because*.
-
 **slave**

 Do not use. Use *replica*, *secondary*, or *standby* instead.

_about/breaking-changes.md (+16 -1)

@@ -163,4 +163,19 @@ The legacy notebooks feature has been removed from `dashboards-observability`. K
 - Only notebooks stored in the `.kibana` index (introduced in version 2.17) are supported.
 - You must migrate your notebooks to the new storage system before upgrading to version 3.0.

-For more information, see issue [#2350](https://github.com/opensearch-project/dashboards-observability/issues/2350).
+For more information, see issue [#2350](https://github.com/opensearch-project/dashboards-observability/issues/2350).
+
+### Searchable snapshots node role
+
+Nodes that use searchable snapshots must have the `warm` node role. Key changes include the following:
+
+- The `search` role no longer supports searchable snapshots.
+- Nodes that handle searchable snapshot shards must be assigned the warm role.
+- You must update node role configurations before upgrading to version 3.0 if your cluster uses searchable snapshots.
+
+For more information, see pull request [#17573](https://github.com/opensearch-project/OpenSearch/pull/17573).
+
+### ML Commons plugin
+
+- The `CatIndexTool` is removed in favor of the `ListIndexTool`.
+
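The warm-role requirement in the hunk above can be sketched as configuration. This is an assumption-laden illustration: the node name is hypothetical, and the exact role list for your deployment should be verified against the version 3.0 documentation:

```yaml
# Hypothetical opensearch.yml fragment: assigns the warm role that
# nodes hosting searchable snapshot shards require in OpenSearch 3.0.
node.name: warm-node-1
node.roles: [ warm ]
```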

_about/version-history.md (+1)

@@ -9,6 +9,7 @@ permalink: /version-history/

 OpenSearch version | Release highlights | Release date
 :--- | :--- | :---
+[2.19.2](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.19.2.md) | Improves query insights with better index handling, a new verbose API parameter, and a default index template. Fixes bugs across Query Insights, Observability, Flow Framework, and Dashboards. Includes multiple CVE fixes, test enhancements, and a new PGP key for artifact verification. For a full list of release highlights, see the Release Notes. | 29 April 2025
 [2.19.1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.19.1.md) | Adds execution hint for cardinality aggregator. Includes bug fixes for ML Commons, Query Insights Dashboards, and Remote Metadata SDK. Contains maintenance updates for several components. For a full list of release highlights, see the Release Notes. | 27 February 2025
 [2.19.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.19.0.md) | Adds workload management, additional query insights, and template queries. Introduces a query insights page to OpenSearch Dashboards. Includes improvements and bug fixes to snapshots, search statistics, star-tree search, and index management. For a full list of release highlights, see the Release Notes. | 11 February 2025
 [2.18.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.18.0.md) | Adds a redesigned home page, updated Discover interface, and collaborative workspaces to OpenSearch Dashboards. Includes improvements to ML inference processor and query grouping. Introduces reranking by field and paginated CAT APIs. Includes experimental OpenSearch Dashboards Assistant capabilities. For a full list of release highlights, see the Release Notes. | 05 November 2024

_aggregations/metric/geobounds.md (+1 -1)

@@ -9,7 +9,7 @@ redirect_from:

 # Geobounds aggregation

-The `geo_bounds` aggregation is a multi-value aggregation that calculates the [geographic bounding box](https://docs.ogc.org/is/12-063r5/12-063r5.html#30) encompassing a set of [`geo_point`](https://opensearch.org/docs/latest/field-types/supported-field-types/geo-point/) or [`geo_shape`](https://opensearch.org/docs/latest/field-types/supported-field-types/geo-shape/) objects. The bounding box is returned as the upper-left and lower-right vertices of the rectangle given as a decimal-encoded latitude-longitude (lat-lon) pair.
+The `geo_bounds` aggregation is a multi-value aggregation that calculates the [geographic bounding box](https://docs.ogc.org/is/12-063r5/12-063r5.html#30) encompassing a set of [`geo_point`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/geo-point/) or [`geo_shape`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/geo-shape/) objects. The bounding box is returned as the upper-left and lower-right vertices of the rectangle given as a decimal-encoded latitude-longitude (lat-lon) pair.

 ## Parameters
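The aggregation described in this hunk can be sketched as a request. The `national_parks` index and `location` field are hypothetical placeholders; only the `geo_bounds` syntax itself is taken from the documentation:

```json
GET /national_parks/_search
{
  "size": 0,
  "aggs": {
    "park_bounds": {
      "geo_bounds": {
        "field": "location"
      }
    }
  }
}
```

The response would contain a `bounds` object with `top_left` and `bottom_right` lat-lon pairs, as the paragraph above describes.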
_aggregations/metric/percentile-ranks.md (+27 -1)

@@ -43,4 +43,30 @@ GET opensearch_dashboards_sample_data_ecommerce/_search
     }
   }
 }
-```
+```
+
+This response indicates that the value `10` is at the `5.5`th percentile and the value `15` is at the `8.3`rd percentile.
+
+As with the `percentiles` aggregation, you can control the level of approximation by setting the optional `tdigest.compression` field. A larger value increases the precision of the approximation but uses more heap space. The default value is 100.
+
+For example, use the following request to set `compression` to `200`:
+
+```json
+GET opensearch_dashboards_sample_data_ecommerce/_search
+{
+  "size": 0,
+  "aggs": {
+    "percentile_rank_taxful_total_price": {
+      "percentile_ranks": {
+        "field": "taxful_total_price",
+        "values": [
+          10,
+          15
+        ],
+        "tdigest": {
+          "compression": 200
+        }
+      }
+    }
+  }
+}

_aggregations/metric/percentile.md (+21)

@@ -51,3 +51,24 @@ GET opensearch_dashboards_sample_data_ecommerce/_search
   }
 }
 ```
+
+You can control the level of approximation using the optional `tdigest.compression` field. A larger value indicates that the data structure that approximates percentiles is more accurate but uses more heap space. The default value is 100.
+
+For example, use the following request to set `compression` to `200`:
+
+```json
+GET opensearch_dashboards_sample_data_ecommerce/_search
+{
+  "size": 0,
+  "aggs": {
+    "percentile_taxful_total_price": {
+      "percentiles": {
+        "field": "taxful_total_price",
+        "tdigest": {
+          "compression": 200
+        }
+      }
+    }
+  }
+}
+```

_analyzers/search-analyzers.md (+102 -15)

@@ -22,14 +22,12 @@ To determine which analyzer to use for a query string at query time, OpenSearch
 In most cases, specifying a search analyzer that is different from the index analyzer is not necessary and could negatively impact search result relevance or lead to unexpected search results.
 {: .warning}

-For information about verifying which analyzer is associated with which field, see [Verifying analyzer settings]({{site.url}}{{site.baseurl}}/analyzers/index/#verifying-analyzer-settings).
+## Specifying a search analyzer at query time

-## Specifying a search analyzer for a query string
-
-Specify the name of the analyzer you want to use at query time in the `analyzer` field:
+You can override the default analyzer behavior by explicitly setting the analyzer in the query. The following query uses the `english` analyzer to stem the input terms:

 ```json
-GET shakespeare/_search
+GET /shakespeare/_search
 {
   "query": {
     "match": {
@@ -43,16 +41,16 @@ GET shakespeare/_search
 ```
 {% include copy-curl.html %}

-For more information about supported analyzers, see [Analyzers]({{site.url}}{{site.baseurl}}/analyzers/supported-analyzers/index/).
+## Specifying a search analyzer in the mappings

-## Specifying a search analyzer for a field
+When defining mappings, you can provide both the `analyzer` (used at index time) and `search_analyzer` (used at query time) for any [`text`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/text/) field.

-When creating index mappings, you can provide the `search_analyzer` parameter for each [text]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/text/) field. When providing the `search_analyzer`, you must also provide the `analyzer` parameter, which specifies the [index analyzer]({{site.url}}{{site.baseurl}}/analyzers/index-analyzers/) to be used at indexing time.
+### Example: Different analyzers for indexing and search

-For example, the following request specifies the `simple` analyzer as the index analyzer and the `whitespace` analyzer as the search analyzer for the `text_entry` field:
+The following configuration allows different tokenization strategies for indexing and querying:

 ```json
-PUT testindex
+PUT /testindex
 {
   "mappings": {
     "properties": {
@@ -67,14 +65,100 @@ PUT testindex
 ```
 {% include copy-curl.html %}

-## Specifying the default search analyzer for an index
+### Example: Using the edge n-gram analyzer for indexing and the standard analyzer for search

-If you want to analyze all query strings at search time with the same analyzer, you can specify the search analyzer in the `analysis.analyzer.default_search` setting. When providing the `analysis.analyzer.default_search`, you must also provide the `analysis.analyzer.default` parameter, which specifies the [index analyzer]({{site.url}}{{site.baseurl}}/analyzers/index-analyzers/) to be used at indexing time.
+The following configuration enables [autocomplete]({{site.url}}{{site.baseurl}}/search-plugins/searching-data/autocomplete/)-like behavior, where you can type the beginning of a word and still receive relevant matches:

-For example, the following request specifies the `simple` analyzer as the index analyzer and the `whitespace` analyzer as the search analyzer for the `testindex` index:
+```json
+PUT /articles
+{
+  "settings": {
+    "analysis": {
+      "analyzer": {
+        "edge_ngram_analyzer": {
+          "tokenizer": "edge_ngram_tokenizer",
+          "filter": ["lowercase"]
+        }
+      },
+      "tokenizer": {
+        "edge_ngram_tokenizer": {
+          "type": "edge_ngram",
+          "min_gram": 2,
+          "max_gram": 10,
+          "token_chars": ["letter", "digit"]
+        }
+      }
+    }
+  },
+  "mappings": {
+    "properties": {
+      "title": {
+        "type": "text",
+        "analyzer": "edge_ngram_analyzer",
+        "search_analyzer": "standard"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The `edge_ngram_analyzer` is applied at index time, breaking input strings into partial prefixes (n-grams), which allows the index to store fragments like "se", "sea", "sear", and so on.
+Use the following request to index a document:

 ```json
-PUT testindex
+PUT /articles/_doc/1
+{
+  "title": "Search Analyzer in Action"
+}
+```
+{% include copy-curl.html %}
+
+Use the following request to search for the partial word `sear` in the `title` field:
+
+```json
+POST /articles/_search
+{
+  "query": {
+    "match": {
+      "title": "sear"
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The response demonstrates that the query containing "sear" matches the document "Search Analyzer in Action" because the n-gram tokens generated at index time include that prefix. This mirrors the [autocomplete functionality]({{site.url}}{{site.baseurl}}/search-plugins/searching-data/autocomplete/), in which typing a prefix can retrieve full matches:
+
+```json
+{
+  ...
+  "hits": {
+    "total": {
+      "value": 1,
+      "relation": "eq"
+    },
+    "max_score": 0.2876821,
+    "hits": [
+      {
+        "_index": "articles",
+        "_id": "1",
+        "_score": 0.2876821,
+        "_source": {
+          "title": "Search Analyzer in Action"
+        }
+      }
+    ]
+  }
+}
+```
+
+## Setting a default search analyzer for an index
+
+Specify `analysis.analyzer.default_search` to define a search analyzer for all fields unless overridden:
+
+```json
+PUT /testindex
 {
   "settings": {
     "analysis": {
@@ -89,6 +173,9 @@ PUT testindex
       }
     }
   }
-
 ```
 {% include copy-curl.html %}
+
+This configuration ensures consistent behavior across multiple fields, especially when using custom analyzers.
+
+For more information about supported analyzers, see [Analyzers]({{site.url}}{{site.baseurl}}/analyzers/supported-analyzers/index/).

_api-reference/document-apis/bulk.md (-3)

@@ -44,9 +44,6 @@ require_alias | Boolean | Set to `true` to require that all actions target an in
 routing | String | Routes the request to the specified shard.
 timeout | Time | How long to wait for the request to return. Default is `1m`.
 wait_for_active_shards | String | Specifies the number of active shards that must be available before OpenSearch processes the bulk request. Default is `1` (only the primary shard). Set to `all` or a positive integer. Values greater than 1 require replicas. For example, if you specify a value of 3, the index must have 2 replicas distributed across 2 additional nodes in order for the request to succeed.
-{% comment %}_source | List | asdf
-_source_excludes | List | asdf
-_source_includes | List | asdf{% endcomment %}


 ## Request body
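The `wait_for_active_shards` parameter described in the table above can be illustrated with a minimal bulk request. The `movies` index and document are hypothetical; only the parameter and NDJSON body shape come from the documentation:

```json
POST /_bulk?wait_for_active_shards=all
{ "index": { "_index": "movies", "_id": "1" } }
{ "title": "Hypothetical example" }
```

With `all`, OpenSearch waits until the primary and every replica of each targeted shard are active before processing the request.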
