Commit 2b7dbe0

Merge branch 'main' into rw-separation

2 parents fa0e620 + 43f3450

100 files changed: +4104 -277 lines changed

.github/vale/styles/Vocab/OpenSearch/Products/accept.txt (+3)

@@ -5,10 +5,12 @@ Amazon
 Amazon OpenSearch Serverless
 Amazon OpenSearch Service
 Amazon Bedrock
+Amazon Kinesis
 Amazon SageMaker
 AWS Secrets Manager
 Ansible
 Anthropic Claude
+Apache Kafka
 Auditbeat
 AWS Cloud
 Cohere Command
@@ -50,6 +52,7 @@ JSON Web Token
 Keycloak
 Kerberos
 Kibana
+Kinesis
 Kubernetes
 Lambda
 Langflow

.github/vale/styles/Vocab/OpenSearch/Words/accept.txt (+2 -1)

@@ -96,6 +96,7 @@ p\d{2}
 [Pp]repper
 [Pp]reprocess
 [Pp]retrain
+[Pp]rotobufs?
 [Pp]seudocode
 [Qq]uantiles?
 [Qq]uantiz(e|ation|ing|er)
@@ -154,4 +155,4 @@ tebibyte
 [Uu]pvote(s|d)?
 [Ww]alkthrough
 [Ww]ebpage
-xy
+xy

CONTRIBUTING.md (+1 -1)

@@ -102,7 +102,7 @@ Follow these steps to set up your local copy of the repository:

 ##### Building by using containerization

-Assuming you have `docker-compose` installed, run the following command:
+Assuming you have Docker installed, run the following command:

 ```
 docker compose -f docker-compose.dev.yml up

TERMS.md (-4)

@@ -704,10 +704,6 @@ A piece of an index that consumes CPU and memory. Operates as a full Lucene inde

 Don't use. Both *simple* and *simply* are not neutral in tone and might sound condescending to some users. If you mean *only*, use *only* instead.

-**since**
-
-Use only to describe time events. Don't use in place of *because*.
-
 **slave**

 Do not use. Use *replica*, *secondary*, or *standby* instead.

_about/breaking-changes.md (+16 -1)

@@ -163,4 +163,19 @@ The legacy notebooks feature has been removed from `dashboards-observability`. K
 - Only notebooks stored in the `.kibana` index (introduced in version 2.17) are supported.
 - You must migrate your notebooks to the new storage system before upgrading to version 3.0.

-For more information, see issue [#2350](https://github.com/opensearch-project/dashboards-observability/issues/2350).
+For more information, see issue [#2350](https://github.com/opensearch-project/dashboards-observability/issues/2350).
+
+### Searchable snapshots node role
+
+Nodes that use searchable snapshots must have the `warm` node role. Key changes include the following:
+
+- The `search` role no longer supports searchable snapshots.
+- Nodes that handle searchable snapshot shards must be assigned the warm role.
+- You must update node role configurations before upgrading to version 3.0 if your cluster uses searchable snapshots.
+
+For more information, see pull request [#17573](https://github.com/opensearch-project/OpenSearch/pull/17573).
+
+### ML Commons plugin
+
+- The `CatIndexTool` is removed in favor of the `ListIndexTool`.
+
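The warm-role requirement in the hunk above can be sketched as configuration. This is an assumption-laden illustration: the node name is hypothetical, and the exact role list for your deployment should be verified against the version 3.0 documentation:

```yaml
# Hypothetical opensearch.yml fragment: assigns the warm role that
# nodes hosting searchable snapshot shards require in OpenSearch 3.0.
node.name: warm-node-1
node.roles: [ warm ]
```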

_about/version-history.md (+1)

@@ -9,6 +9,7 @@ permalink: /version-history/

 OpenSearch version | Release highlights | Release date
 :--- | :--- | :---
+[2.19.2](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.19.2.md) | Improves query insights with better index handling, a new verbose API parameter, and a default index template. Fixes bugs across Query Insights, Observability, Flow Framework, and Dashboards. Includes multiple CVE fixes, test enhancements, and a new PGP key for artifact verification. For a full list of release highlights, see the Release Notes. | 29 April 2025
 [2.19.1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.19.1.md) | Adds execution hint for cardinality aggregator. Includes bug fixes for ML Commons, Query Insights Dashboards, and Remote Metadata SDK. Contains maintenance updates for several components. For a full list of release highlights, see the Release Notes. | 27 February 2025
 [2.19.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.19.0.md) | Adds workload management, additional query insights, and template queries. Introduces a query insights page to OpenSearch Dashboards. Includes improvements and bug fixes to snapshots, search statistics, star-tree search, and index management. For a full list of release highlights, see the Release Notes. | 11 February 2025
 [2.18.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.18.0.md) | Adds a redesigned home page, updated Discover interface, and collaborative workspaces to OpenSearch Dashboards. Includes improvements to ML inference processor and query grouping. Introduces reranking by field and paginated CAT APIs. Includes experimental OpenSearch Dashboards Assistant capabilities. For a full list of release highlights, see the Release Notes. | 05 November 2024

_aggregations/metric/geobounds.md (+1 -1)

@@ -9,7 +9,7 @@ redirect_from:

 # Geobounds aggregation

-The `geo_bounds` aggregation is a multi-value aggregation that calculates the [geographic bounding box](https://docs.ogc.org/is/12-063r5/12-063r5.html#30) encompassing a set of [`geo_point`](https://opensearch.org/docs/latest/field-types/supported-field-types/geo-point/) or [`geo_shape`](https://opensearch.org/docs/latest/field-types/supported-field-types/geo-shape/) objects. The bounding box is returned as the upper-left and lower-right vertices of the rectangle given as a decimal-encoded latitude-longitude (lat-lon) pair.
+The `geo_bounds` aggregation is a multi-value aggregation that calculates the [geographic bounding box](https://docs.ogc.org/is/12-063r5/12-063r5.html#30) encompassing a set of [`geo_point`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/geo-point/) or [`geo_shape`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/geo-shape/) objects. The bounding box is returned as the upper-left and lower-right vertices of the rectangle given as a decimal-encoded latitude-longitude (lat-lon) pair.

 ## Parameters
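The aggregation described in this hunk can be sketched as a request. The `national_parks` index and `location` field are hypothetical placeholders; only the `geo_bounds` syntax itself is taken from the documentation:

```json
GET /national_parks/_search
{
  "size": 0,
  "aggs": {
    "park_bounds": {
      "geo_bounds": {
        "field": "location"
      }
    }
  }
}
```

The response would contain a `bounds` object with `top_left` and `bottom_right` lat-lon pairs, as the paragraph above describes.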
_aggregations/metric/percentile-ranks.md (+27 -1)

@@ -43,4 +43,30 @@ GET opensearch_dashboards_sample_data_ecommerce/_search
     }
   }
 }
-```
+```
+
+This response indicates that the value `10` is at the `5.5`th percentile and the value `15` is at the `8.3`rd percentile.
+
+As with the `percentiles` aggregation, you can control the level of approximation by setting the optional `tdigest.compression` field. A larger value increases the precision of the approximation but uses more heap space. The default value is 100.
+
+For example, use the following request to set `compression` to `200`:
+
+```json
+GET opensearch_dashboards_sample_data_ecommerce/_search
+{
+  "size": 0,
+  "aggs": {
+    "percentile_rank_taxful_total_price": {
+      "percentile_ranks": {
+        "field": "taxful_total_price",
+        "values": [
+          10,
+          15
+        ],
+        "tdigest": {
+          "compression": 200
+        }
+      }
+    }
+  }
+}

_aggregations/metric/percentile.md (+21)

@@ -51,3 +51,24 @@ GET opensearch_dashboards_sample_data_ecommerce/_search
   }
 }
 ```
+
+You can control the level of approximation using the optional `tdigest.compression` field. A larger value indicates that the data structure that approximates percentiles is more accurate but uses more heap space. The default value is 100.
+
+For example, use the following request to set `compression` to `200`:
+
+```json
+GET opensearch_dashboards_sample_data_ecommerce/_search
+{
+  "size": 0,
+  "aggs": {
+    "percentile_taxful_total_price": {
+      "percentiles": {
+        "field": "taxful_total_price",
+        "tdigest": {
+          "compression": 200
+        }
+      }
+    }
+  }
+}
+```

_analyzers/search-analyzers.md (+102 -15)

@@ -22,14 +22,12 @@ To determine which analyzer to use for a query string at query time, OpenSearch
 In most cases, specifying a search analyzer that is different from the index analyzer is not necessary and could negatively impact search result relevance or lead to unexpected search results.
 {: .warning}

-For information about verifying which analyzer is associated with which field, see [Verifying analyzer settings]({{site.url}}{{site.baseurl}}/analyzers/index/#verifying-analyzer-settings).
+## Specifying a search analyzer at query time

-## Specifying a search analyzer for a query string
-
-Specify the name of the analyzer you want to use at query time in the `analyzer` field:
+You can override the default analyzer behavior by explicitly setting the analyzer in the query. The following query uses the `english` analyzer to stem the input terms:

 ```json
-GET shakespeare/_search
+GET /shakespeare/_search
 {
   "query": {
     "match": {
@@ -43,16 +41,16 @@ GET shakespeare/_search
 ```
 {% include copy-curl.html %}

-For more information about supported analyzers, see [Analyzers]({{site.url}}{{site.baseurl}}/analyzers/supported-analyzers/index/).
+## Specifying a search analyzer in the mappings

-## Specifying a search analyzer for a field
+When defining mappings, you can provide both the `analyzer` (used at index time) and `search_analyzer` (used at query time) for any [`text`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/text/) field.

-When creating index mappings, you can provide the `search_analyzer` parameter for each [text]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/text/) field. When providing the `search_analyzer`, you must also provide the `analyzer` parameter, which specifies the [index analyzer]({{site.url}}{{site.baseurl}}/analyzers/index-analyzers/) to be used at indexing time.
+### Example: Different analyzers for indexing and search

-For example, the following request specifies the `simple` analyzer as the index analyzer and the `whitespace` analyzer as the search analyzer for the `text_entry` field:
+The following configuration allows different tokenization strategies for indexing and querying:

 ```json
-PUT testindex
+PUT /testindex
 {
   "mappings": {
     "properties": {
@@ -67,14 +65,100 @@ PUT testindex
 ```
 {% include copy-curl.html %}

-## Specifying the default search analyzer for an index
+### Example: Using the edge n-gram analyzer for indexing and the standard analyzer for search

-If you want to analyze all query strings at search time with the same analyzer, you can specify the search analyzer in the `analysis.analyzer.default_search` setting. When providing the `analysis.analyzer.default_search`, you must also provide the `analysis.analyzer.default` parameter, which specifies the [index analyzer]({{site.url}}{{site.baseurl}}/analyzers/index-analyzers/) to be used at indexing time.
+The following configuration enables [autocomplete]({{site.url}}{{site.baseurl}}/search-plugins/searching-data/autocomplete/)-like behavior, where you can type the beginning of a word and still receive relevant matches:

-For example, the following request specifies the `simple` analyzer as the index analyzer and the `whitespace` analyzer as the search analyzer for the `testindex` index:
+```json
+PUT /articles
+{
+  "settings": {
+    "analysis": {
+      "analyzer": {
+        "edge_ngram_analyzer": {
+          "tokenizer": "edge_ngram_tokenizer",
+          "filter": ["lowercase"]
+        }
+      },
+      "tokenizer": {
+        "edge_ngram_tokenizer": {
+          "type": "edge_ngram",
+          "min_gram": 2,
+          "max_gram": 10,
+          "token_chars": ["letter", "digit"]
+        }
+      }
+    }
+  },
+  "mappings": {
+    "properties": {
+      "title": {
+        "type": "text",
+        "analyzer": "edge_ngram_analyzer",
+        "search_analyzer": "standard"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The `edge_ngram_analyzer` is applied at index time, breaking input strings into partial prefixes (n-grams), which allows the index to store fragments like "se", "sea", "sear", and so on.
+Use the following request to index a document:

 ```json
-PUT testindex
+PUT /articles/_doc/1
+{
+  "title": "Search Analyzer in Action"
+}
+```
+{% include copy-curl.html %}
+
+Use the following request to search for the partial word `sear` in the `title` field:
+
+```json
+POST /articles/_search
+{
+  "query": {
+    "match": {
+      "title": "sear"
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The response demonstrates that the query containing "sear" matches the document "Search Analyzer in Action" because the n-gram tokens generated at index time include that prefix. This mirrors the [autocomplete functionality]({{site.url}}{{site.baseurl}}/search-plugins/searching-data/autocomplete/), in which typing a prefix can retrieve full matches:
+
+```json
+{
+  ...
+  "hits": {
+    "total": {
+      "value": 1,
+      "relation": "eq"
+    },
+    "max_score": 0.2876821,
+    "hits": [
+      {
+        "_index": "articles",
+        "_id": "1",
+        "_score": 0.2876821,
+        "_source": {
+          "title": "Search Analyzer in Action"
+        }
+      }
+    ]
+  }
+}
+```
+
+## Setting a default search analyzer for an index
+
+Specify `analysis.analyzer.default_search` to define a search analyzer for all fields unless overridden:
+
+```json
+PUT /testindex
 {
   "settings": {
     "analysis": {
@@ -89,6 +173,9 @@ PUT testindex
       }
     }
   }
-
 ```
 {% include copy-curl.html %}
+
+This configuration ensures consistent behavior across multiple fields, especially when using custom analyzers.
+
+For more information about supported analyzers, see [Analyzers]({{site.url}}{{site.baseurl}}/analyzers/supported-analyzers/index/).

_api-reference/document-apis/bulk.md (-3)

@@ -44,9 +44,6 @@ require_alias | Boolean | Set to `true` to require that all actions target an in
 routing | String | Routes the request to the specified shard.
 timeout | Time | How long to wait for the request to return. Default is `1m`.
 wait_for_active_shards | String | Specifies the number of active shards that must be available before OpenSearch processes the bulk request. Default is `1` (only the primary shard). Set to `all` or a positive integer. Values greater than 1 require replicas. For example, if you specify a value of 3, the index must have 2 replicas distributed across 2 additional nodes in order for the request to succeed.
-{% comment %}_source | List | asdf
-_source_excludes | List | asdf
-_source_includes | List | asdf{% endcomment %}


 ## Request body
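The `wait_for_active_shards` parameter described in the table above can be illustrated with a minimal bulk request. The `movies` index and document are hypothetical; only the parameter and NDJSON body shape come from the documentation:

```json
POST /_bulk?wait_for_active_shards=all
{ "index": { "_index": "movies", "_id": "1" } }
{ "title": "Hypothetical example" }
```

With `all`, OpenSearch waits until the primary and every replica of each targeted shard are active before processing the request.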
