_vector-search/api/knn.md
@@ -57,6 +57,37 @@ Field | Description
Some statistics contain *graph* in the name. In these cases, *graph* is synonymous with *native library index*. The term *graph* dates from when the plugin supported only the HNSW algorithm, which consists of hierarchical graphs.
{: .note}

#### Remote index build stats
Introduced 3.0
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/k-NN/issues/2391).
{: .warning}

If [remote index build]({{site.url}}{{site.baseurl}}/vector-search/remote-index-build/) is enabled, the response also includes the following statistics.

| Field | Description |
|:---|:---|
|`repository_stats.read_success_count`| The number of successful read operations from the repository. |
|`repository_stats.read_failure_count`| The number of failed read operations from the repository. |
|`repository_stats.successful_read_time_in_millis`| The total time, in milliseconds, spent on successful read operations. |
|`repository_stats.write_success_count`| The number of successful write operations to the repository. |
|`repository_stats.write_failure_count`| The number of failed write operations to the repository. |
|`repository_stats.successful_write_time_in_millis`| The total time, in milliseconds, spent on successful write operations. |
|`client_stats.build_request_success_count`| The number of successful build request operations. |
|`client_stats.build_request_failure_count`| The number of failed build request operations. |
|`client_stats.status_request_failure_count`| The number of failed status request operations. |
|`client_stats.status_request_success_count`| The number of successful status request operations. |
|`client_stats.index_build_success_count`| The number of successful index build operations. |
|`client_stats.index_build_failure_count`| The number of failed index build operations. |
|`client_stats.waiting_time_in_ms`| The total time, in milliseconds, that the client has spent awaiting completion of remote builds. |
|`build_stats.remote_index_build_flush_time_in_millis`| The total time, in milliseconds, spent on remote flush operations. |
|`build_stats.remote_index_build_merge_time_in_millis`| The total time, in milliseconds, spent on remote merge operations. |
|`build_stats.remote_index_build_current_merge_operations`| The current number of remote merge operations in progress. |
|`build_stats.remote_index_build_current_flush_operations`| The current number of remote flush operations in progress. |
|`build_stats.remote_index_build_current_merge_size`| The current size of remote merge operations. |
|`build_stats.remote_index_build_current_flush_size`| The current size of remote flush operations. |
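
The counters in the preceding table are cumulative, so derived values such as an average latency or a failure rate must be computed by the caller. The following Python sketch shows one possible way to do this for the `repository_stats` fields. The `summarize_repository_stats` helper and the sample values are hypothetical, and the sketch assumes that the `repository_stats` object has already been extracted from a stats response into a plain dictionary:

```python
def summarize_repository_stats(repository_stats):
    """Derive average latencies and failure rates from cumulative counters.

    `repository_stats` is assumed to be a dict containing the
    `repository_stats.*` fields (hypothetical input shape).
    """
    summary = {}
    for op in ("read", "write"):
        successes = repository_stats.get(f"{op}_success_count", 0)
        failures = repository_stats.get(f"{op}_failure_count", 0)
        total_millis = repository_stats.get(f"successful_{op}_time_in_millis", 0)
        attempts = successes + failures
        # The time counters cover successful operations only, so the
        # average is taken over successes, not over all attempts.
        summary[f"avg_{op}_millis"] = total_millis / successes if successes else None
        summary[f"{op}_failure_rate"] = failures / attempts if attempts else None
    return summary


# Hypothetical sample values, not real measurements.
sample = {
    "read_success_count": 8,
    "read_failure_count": 2,
    "successful_read_time_in_millis": 400,
    "write_success_count": 5,
    "write_failure_count": 0,
    "successful_write_time_in_millis": 1000,
}
print(summarize_repository_stats(sample))
```

Returning `None` when a counter is zero avoids a division by zero on a node that has not yet performed any remote builds.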

#### Example request

The following examples demonstrate how to retrieve statistics related to the k-NN plugin.
This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/k-NN/issues/2391).
{: .warning}

Starting with version 3.0, OpenSearch supports building vector indexes using a GPU-accelerated remote index build service. Using GPUs can substantially reduce index build times and costs. For benchmarking results, see [this blog post](https://opensearch.org/blog/GPU-Accelerated-Vector-Search-OpenSearch-New-Frontier/).

## Prerequisites

Before configuring the remote index build settings, ensure that you fulfill the following prerequisites. For more information about updating dynamic settings, see [Dynamic settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/#dynamic-settings).

### Step 1: Enable the remote index build service

Enable the remote index build service for both the cluster and the chosen index by configuring the following settings.

Setting | Static/Dynamic | Default | Description
:--- | :--- | :--- | :---
`knn.feature.remote_index_build.enabled` | Dynamic | `false` | Enables remote vector index building for the cluster.
`index.knn.remote_index_build.enabled` | Dynamic | `false` | Enables remote index building for the index. Currently, the remote index build service supports [Faiss]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-methods-engines/#faiss-engine) indexes with the `hnsw` method and the default 32-bit floating-point (`FP32`) vectors.
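
Because both settings are dynamic, they can be applied through the cluster settings and index settings update APIs. The following Python sketch only assembles the two request bodies and prints them; it does not contact a cluster. The `enable_remote_build_requests` helper and the `my-vector-index` index name are hypothetical, and the use of `persistent` (rather than `transient`) cluster settings is an assumption:

```python
import json


def enable_remote_build_requests(index_name):
    """Return (method, path, body) tuples for enabling remote index build."""
    cluster_request = (
        "PUT",
        "/_cluster/settings",
        # Assumption: apply the cluster-level flag as a persistent setting.
        {"persistent": {"knn.feature.remote_index_build.enabled": True}},
    )
    index_request = (
        "PUT",
        f"/{index_name}/_settings",
        {"index.knn.remote_index_build.enabled": True},
    )
    return [cluster_request, index_request]


# "my-vector-index" is a placeholder index name.
for method, path, body in enable_remote_build_requests("my-vector-index"):
    print(method, path)
    print(json.dumps(body, indent=2))
```

The printed bodies can be sent with any HTTP client or pasted into a console session.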

### Step 2: Create and register the remote vector repository

The remote vector repository acts as an intermediate object store between the OpenSearch cluster and the remote build service. The cluster uploads vectors and document IDs to the repository. The remote build service retrieves the data, builds the index externally, and uploads the completed result back to the repository.

To create and register the repository, follow the steps in [Register repository]({{site.url}}{{site.baseurl}}/tuning-your-cluster/availability-and-recovery/snapshots/snapshot-restore/#register-repository). Then set the `knn.remote_index_build.vector_repo` dynamic setting to the name of the registered repository.

The remote build service currently supports only Amazon Simple Storage Service (Amazon S3) repositories.
{: .note}
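
The two actions in this step, registering an Amazon S3 repository and pointing the k-NN plugin at it, can be sketched as a pair of request bodies. The following Python snippet only builds and prints the bodies. The `vector_repository_requests` helper, the repository name, the bucket name, and the base path are all hypothetical placeholders, and storing the setting as `persistent` is an assumption:

```python
import json


def vector_repository_requests(repo_name, bucket, base_path):
    """Return (method, path, body) tuples for setting up the vector repository."""
    register_request = (
        "PUT",
        f"/_snapshot/{repo_name}",
        # Standard S3 repository registration body.
        {"type": "s3", "settings": {"bucket": bucket, "base_path": base_path}},
    )
    point_knn_at_repo = (
        "PUT",
        "/_cluster/settings",
        # Assumption: store the repository name as a persistent setting.
        {"persistent": {"knn.remote_index_build.vector_repo": repo_name}},
    )
    return [register_request, point_knn_at_repo]


# All three argument values are placeholders.
requests_to_send = vector_repository_requests(
    "my-vector-repo", "my-vector-bucket", "vectors"
)
for method, path, body in requests_to_send:
    print(method, path)
    print(json.dumps(body, indent=2))
```

The repository must be registered before its name is written to `knn.remote_index_build.vector_repo`, which is why the helper returns the requests in that order.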

### Step 3: Set up a remote vector index builder

Configure the remote endpoint in the k-NN settings by setting `knn.remote_index_build.client.endpoint` to a running [remote vector index builder](https://github.com/opensearch-project/remote-vector-index-builder) instance. For instructions on setting up the remote service, see [the user guide](https://github.com/opensearch-project/remote-vector-index-builder/blob/main/USER_GUIDE.md).

## Configuring remote index build settings

The remote index build service supports several additional, optional settings. For information about configuring them, see [Remote index build settings]({{site.url}}{{site.baseurl}}/vector-search/settings/#remote-index-build-settings).

## Using the remote index build service

Once the remote index build service is configured, any index on which it is enabled will use the remote vector index builder for builds whose size meets the configured `index.knn.remote_index_build.size_threshold`.
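
To make the threshold rule concrete, the following Python sketch compares a build size against a byte-size setting such as the default `50mb`. It is an illustration only, not the plugin's implementation. The `parse_byte_size` and `uses_remote_build` helpers are hypothetical, and treating a build that is exactly at the threshold as eligible is an assumption:

```python
import re

# Binary multipliers, matching OpenSearch byte-size units (1kb = 1024 bytes).
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def parse_byte_size(value):
    """Convert a string such as '50mb' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(b|kb|mb|gb|tb)\s*", value.lower())
    if match is None:
        raise ValueError(f"unrecognized byte size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def uses_remote_build(build_size_bytes, size_threshold="50mb"):
    """Return True if a build of this size would be sent to the remote builder."""
    # Assumption: a build exactly at the threshold qualifies.
    return build_size_bytes >= parse_byte_size(size_threshold)


print(uses_remote_build(10 * 1024**2))  # a 10 MB build, below the default threshold
print(uses_remote_build(64 * 1024**2))  # a 64 MB build, above the default threshold
```

Builds below the threshold stay on the cluster, which avoids paying the upload and polling overhead for small segments.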
An index created in OpenSearch version 2.11 or earlier will still use the previous `ef_construction` and `ef_search` values (`512`).
{: .note}

## Remote index build settings
Introduced 3.0
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/k-NN/issues/2391).
{: .warning}

The following settings control [remote vector index building]({{site.url}}{{site.baseurl}}/vector-search/remote-index-build/).

The `poll_interval`, `timeout`, and `size_threshold` settings are advanced settings. Their default values were chosen based on extensive benchmarking.
{: .important}

### Cluster settings

The following remote index build settings apply at the cluster level.

Setting | Static/Dynamic | Default | Description
:--- | :--- | :--- | :---
`knn.feature.remote_index_build.enabled` | Dynamic | `false` | Enables remote vector index building for the cluster.
`knn.remote_index_build.vector_repo` | Dynamic | None | The repository to which the remote index builder should write.
`knn.remote_index_build.client.endpoint` | Dynamic | None | The endpoint URL of the remote build service.
`knn.remote_index_build.client.poll_interval` | Dynamic | `5s` | How frequently the client should poll the remote build service for job status.
`knn.remote_index_build.client.timeout` | Dynamic | `60m` | The maximum amount of time to wait for remote build completion before falling back to a CPU-based build.
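
The interaction between `poll_interval` and `timeout` can be illustrated with a small polling loop. The following Python sketch is not the plugin's implementation; it only models the behavior described in the preceding table, using the default values of 5 seconds and 60 minutes. The `wait_for_remote_build` function, the `is_complete` callable, the `FakeClock` class, and the `"remote"` and `"cpu_fallback"` return values are all hypothetical:

```python
import time


def wait_for_remote_build(is_complete, poll_interval_s=5.0, timeout_s=3600.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll until the remote build completes or the timeout elapses.

    `is_complete` is a hypothetical callable that returns True once the
    remote build service reports that the job has finished.
    """
    deadline = clock() + timeout_s
    while not is_complete():
        if clock() >= deadline:
            # Timed out: fall back to a CPU-based build.
            return "cpu_fallback"
        sleep(poll_interval_s)
    return "remote"


class FakeClock:
    """Simulated clock so that the example runs instantly."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


# Simulated run: the build completes on the third status check.
checks = iter([False, False, True])
fake = FakeClock()
result = wait_for_remote_build(lambda: next(checks), sleep=fake.sleep, clock=fake.time)
print(result, fake.now)
```

A shorter poll interval detects completion sooner at the cost of more status requests, and a shorter timeout triggers the CPU fallback earlier.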

### Index settings

The following remote index build settings apply at the index level.

Setting | Static/Dynamic | Default | Description
:--- | :--- | :--- | :---
`index.knn.remote_index_build.enabled` | Dynamic | `false` | Enables remote index building for the index. Currently, the remote index build service supports [Faiss]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-methods-engines/#faiss-engine) indexes with the `hnsw` method and the default 32-bit floating-point (`FP32`) vectors.
`index.knn.remote_index_build.size_threshold` | Dynamic | `50mb` | The minimum size required to enable remote vector builds.

### Remote build authentication

The remote build service username and password are secure settings that must be set in the [OpenSearch keystore]({{site.url}}{{site.baseurl}}/security/configuration/opensearch-keystore/) as follows:
You can reload the secure settings without restarting the node by using the [Nodes Reload Secure]({{site.url}}{{site.baseurl}}/api-reference/nodes-apis/nodes-reload-secure/) API.