Description
Is your feature request related to a problem? Please describe
Inspiration
Following the effort in #6844 and the benchmarking result (#10684 (comment)) (~20%), we can go a step further and add support for a gRPC-based API that uses protobuf for serialization and deserialization. To validate our assumption that protobuf, which should be more efficient and compact than JSON, yields a performance gain, we built a PoC for client <> server protobuf on the Search API with specific query types, and we see promising results in opensearch-project/opensearch-clients#69.
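To make the compactness claim concrete, here is a minimal sketch that hand-encodes a tiny message in the protobuf wire format and compares its size with compact JSON. The message shape (`index`, `size`, `query_text`) and the field numbers are assumptions for illustration only, not the actual search proto, and a toy message like this is not a benchmark.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Wire type 2 (length-delimited): tag, byte length, UTF-8 payload."""
    payload = text.encode("utf-8")
    tag = encode_varint((field_number << 3) | 2)
    return tag + encode_varint(len(payload)) + payload


def encode_int_field(field_number: int, value: int) -> bytes:
    """Wire type 0 (varint): tag followed by the value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


# Hypothetical request; field numbers 1..3 are illustrative only.
request = {"index": "logs", "size": 10, "query_text": "error"}

as_json = json.dumps(request, separators=(",", ":")).encode("utf-8")
as_proto = (
    encode_string_field(1, request["index"])
    + encode_int_field(2, request["size"])
    + encode_string_field(3, request["query_text"])
)

print(len(as_json), len(as_proto))
```

Most of the difference in this example comes from protobuf sending small numeric field tags instead of field names; real search requests and responses will show a different ratio.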
Proposal
The ongoing node-to-node communication effort focuses on the transport layer, implementing StreamInput and StreamOutput with protobuf serializers/deserializers. We can expand that effort and add client <> server protobuf support in parallel to achieve a more significant performance gain.
The proto definitions for the search API, and the parts that overlap with the transport layer, should follow opensearch-api-specification, which is widely adopted by clients.
For the server-side change there are two options:
- Option 1: Introduce a new content type and give end users the option to send and receive protobuf binary payloads.
  Pros: faster development cycle to begin with, as it is potentially an extension of the existing searchRequest/Response, builder XContent.
  Cons: potentially requires significant code refactoring, which adds complexity during development.
- Option 2: Implement a new streaming-style search API (gRPC) using protobuf and expose a new gRPC endpoint for the search API.
  Pros:
  a) gRPC natively supports client-side, server-side, and bidirectional streaming, allowing for real-time communication. It runs over HTTP/2, which is more efficient than the HTTP/1.1 typically used by REST.
  b) gRPC tooling generates client and server code in multiple programming languages from the proto files. This reduces boilerplate code and ensures consistency across languages and platforms.
  c) Less code refactoring.
  Cons:
  a) The development cycle might not be as fast as with option 1.
  b) Although bringing up a new gRPC service and hooking it into the internal transport layer might not be too complicated, there are unknowns in the overall integration with the existing ecosystem, e.g. related plugins (security, knn, sql, monitoring, etc.).
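The difference between a single response and a server-streaming response can be sketched as follows. This is a plain Python generator standing in for a gRPC server-streaming RPC; no gRPC library is used, and the names `Hit`, `SearchResponseChunk`, `search_unary`, and `search_streaming` are hypothetical, not part of any OpenSearch API.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Hit:
    doc_id: str
    score: float


@dataclass
class SearchResponseChunk:
    hits: List[Hit]
    is_last: bool


def search_unary(all_hits: List[Hit]) -> List[Hit]:
    """Request/response style: the caller sees nothing until the full result is built."""
    return list(all_hits)


def search_streaming(all_hits: List[Hit], batch_size: int) -> Iterator[SearchResponseChunk]:
    """Server-streaming style: yield fixed-size batches one at a time.

    An empty result yields no chunks at all in this sketch.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(all_hits), batch_size):
        batch = all_hits[start:start + batch_size]
        is_last = start + batch_size >= len(all_hits)
        yield SearchResponseChunk(hits=batch, is_last=is_last)


hits = [Hit(doc_id=f"doc-{i}", score=1.0 / (i + 1)) for i in range(5)]
chunks = list(search_streaming(hits, batch_size=2))

for chunk in chunks:
    print([h.doc_id for h in chunk.hits], chunk.is_last)
```

With streaming, a client can start processing the first batch while later batches are still being produced, which is the property option 2 relies on.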
Clients (Java, Go, Python, etc.) would get support for optionally using the new protobuf-based server API with minimal changes (i.e. no need to rewrite an application that already uses the client).
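One way the opt-in could look from the application's point of view is sketched below: the serializer is chosen when the client is constructed, and the calling code stays the same. Everything here is an assumption for illustration: the `SearchClient` class, the `use_protobuf` flag, the `send` callable, the `application/x-protobuf` media type, and the two-field message layout are hypothetical and do not describe an existing client.

```python
import json
from typing import Callable, Dict

# Assumption: the media type name is illustrative; the proposal does not fix one.
PROTOBUF_MEDIA_TYPE = "application/x-protobuf"
JSON_MEDIA_TYPE = "application/json"


def _string_field(field_number: int, text: str) -> bytes:
    """Minimal protobuf string field; single-byte tag and length only."""
    payload = text.encode("utf-8")
    if field_number > 15 or len(payload) > 127:
        raise ValueError("sketch only handles small field numbers and short strings")
    return bytes([(field_number << 3) | 2, len(payload)]) + payload


def to_json(index: str, query_text: str) -> bytes:
    return json.dumps({"index": index, "query_text": query_text}).encode("utf-8")


def to_protobuf(index: str, query_text: str) -> bytes:
    # Hypothetical layout: field 1 = index, field 2 = query_text.
    return _string_field(1, index) + _string_field(2, query_text)


SERIALIZERS: Dict[str, Callable[[str, str], bytes]] = {
    JSON_MEDIA_TYPE: to_json,
    PROTOBUF_MEDIA_TYPE: to_protobuf,
}


class SearchClient:
    """Hypothetical client; `send` stands in for the real HTTP or gRPC transport."""

    def __init__(self, send: Callable[[Dict[str, str], bytes], int], use_protobuf: bool = False):
        self._send = send
        self._media_type = PROTOBUF_MEDIA_TYPE if use_protobuf else JSON_MEDIA_TYPE

    def search(self, index: str, query_text: str) -> int:
        body = SERIALIZERS[self._media_type](index, query_text)
        return self._send({"Content-Type": self._media_type}, body)


sent = []


def fake_send(headers: Dict[str, str], body: bytes) -> int:
    """Records what would go on the wire and returns the body size."""
    sent.append((headers["Content-Type"], body))
    return len(body)


def application_code(client: SearchClient) -> int:
    # Unchanged regardless of which wire format the client was configured with.
    return client.search("logs", "error")


application_code(SearchClient(fake_send))
application_code(SearchClient(fake_send, use_protobuf=True))
```

Only the constructor call differs between the two runs; `application_code` is identical, which is the "minimal changes" property described above.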
Next Steps
- Generate proto from opensearch-api-specification (refer: https://github.com/nytimes/openapi2proto)
- Bootstrap / create a gRPC SearchService (SearchGRPCService) and hook it into the internal layer (ClusterService, ActionListener, etc.)
- gRPC handlers for SearchAction: add grpc/action/search and register in ActionModule
- There are 40+ QueryBuilder types; we need to target the knn-related ones (? CorrelationQuery)
- ?? integrate with transport layer protobuf implementation (node-to-node)
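The first step above, deriving proto definitions from the API specification, can be illustrated with a much-simplified sketch. It does not reflect how openapi2proto works internally; the type mapping, the positional field numbering, and the example schema are assumptions made for illustration.

```python
from typing import Any, Dict

# Assumed mapping from (OpenAPI type, format) to proto3 scalar types.
SCALAR_TYPES = {
    ("string", None): "string",
    ("boolean", None): "bool",
    ("integer", None): "int32",
    ("integer", "int32"): "int32",
    ("integer", "int64"): "int64",
    ("number", None): "double",
    ("number", "float"): "float",
    ("number", "double"): "double",
}


def proto_type(schema: Dict[str, Any]) -> str:
    """Translate one property schema into a proto3 field type."""
    if schema.get("type") == "array":
        items = schema["items"]
        if items.get("type") == "array":
            raise ValueError("nested arrays need a wrapper message in proto3")
        return "repeated " + proto_type(items)
    key = (schema.get("type"), schema.get("format"))
    if key not in SCALAR_TYPES:
        raise ValueError(f"unsupported schema: {schema!r}")
    return SCALAR_TYPES[key]


def schema_to_proto(message_name: str, schema: Dict[str, Any]) -> str:
    """Render an object schema as a proto3 message.

    Field numbers are assigned by property order, which is an assumption of
    this sketch; real definitions need numbers that stay stable over time.
    """
    lines = [f"message {message_name} {{"]
    for number, (field, field_schema) in enumerate(schema["properties"].items(), start=1):
        lines.append(f"  {proto_type(field_schema)} {field} = {number};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical fragment of a search request schema.
search_request_schema = {
    "type": "object",
    "properties": {
        "index": {"type": "array", "items": {"type": "string"}},
        "size": {"type": "integer", "format": "int32"},
        "track_total_hits": {"type": "boolean"},
    },
}

generated = schema_to_proto("SearchRequest", search_request_schema)
print(generated)
```

Nested objects, `oneOf` unions, and references between schemas are left out here; those are where most of the real generation effort is likely to go.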
Timeline
2.17 release: (09/03/2024 ~ 09/17/2024)
[Experimental Feature]
- protobuf definitions
- simple matchAll query for an E2E PoC.
- feature will be marked as experimental.
Related
Transport layer Protobuf support: #6844