Feat: Add Elasticsearch Document Reader #390

Merged
4 commits merged into alibaba:main on Feb 7, 2025

Conversation

brianxiadong
Collaborator

Add Elasticsearch Document Reader

This PR adds a new document reader implementation for Elasticsearch, which allows Spring AI to retrieve and process documents from Elasticsearch indices.

Features

  • Supports both single-node and cluster-mode Elasticsearch
  • Supports HTTPS and basic authentication
  • Customizable query field and result size
  • Supports document retrieval by ID and by query
  • Full test coverage against a real Elasticsearch instance

Implementation Details

Core Components

  1. ElasticsearchConfig

    • Configuration class for Elasticsearch connection and query settings
    • Supports both single-node and cluster configurations
    • Holds HTTPS and authentication settings
  2. ElasticsearchDocumentReader

    • Implements the DocumentReader interface
    • Supports the following document retrieval methods:
      • get(): Get all documents
      • getById(String id): Get a document by ID
      • readWithQuery(String query): Search documents by query
    • Handles SSL/TLS connections securely
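
To illustrate what a customizable query field and result size could translate to on the wire, here is a minimal sketch that builds an Elasticsearch `match` query body using only the JDK. This PR description does not say which query type readWithQuery uses, so the `match` query, the class name MatchQuerySketch, and both helper methods are assumptions for illustration and not the module's actual code.

```java
// Hypothetical sketch: builds the JSON body of an Elasticsearch search request.
// The real reader may build its queries differently (e.g. via the Java client DSL).
final class MatchQuerySketch {

    // Escapes the characters that must not appear raw inside a JSON string.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds {"size":N,"query":{"match":{"<field>":"<text>"}}}.
    static String matchQueryBody(String field, String text, int size) {
        return "{\"size\":" + size
                + ",\"query\":{\"match\":{\""
                + escapeJson(field) + "\":\"" + escapeJson(text)
                + "\"}}}";
    }
}
```

A body like this would be sent to the index's `_search` endpoint. Escaping the user-supplied text keeps quotes in a query from breaking the JSON.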

Key Features

  • Cluster Support: Can connect to multiple Elasticsearch nodes for high availability
  • Secure Connection: Supports HTTPS with SSL context and certificate handling
  • Flexible Query: Allows customization of the search field and result size
  • Error Handling: Exception handling with descriptive error messages
  • Documentation: Bilingual (English/Chinese) documentation with detailed examples
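
Cluster support usually comes down to handing the client a list of node addresses instead of a single host. The sketch below parses a comma-separated list of host:port entries into node objects. The PR description does not show how cluster nodes are configured, so the input format, the class name ClusterNodesSketch, and the parseNodes helper are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: turns "es1:9200,es2:9201" into a list of nodes.
final class ClusterNodesSketch {

    // Simple value holder for one node address.
    static final class Node {
        final String host;
        final int port;

        Node(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    // Entries without an explicit port fall back to defaultPort.
    static List<Node> parseNodes(String commaSeparated, int defaultPort) {
        List<Node> nodes = new ArrayList<>();
        for (String raw : commaSeparated.split(",")) {
            String entry = raw.trim();
            if (entry.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = entry.lastIndexOf(':');
            if (colon < 0) {
                nodes.add(new Node(entry, defaultPort));
            } else {
                String host = entry.substring(0, colon);
                int port = Integer.parseInt(entry.substring(colon + 1));
                nodes.add(new Node(host, port));
            }
        }
        return nodes;
    }
}
```

Each parsed node could then be mapped to whatever host type the underlying Elasticsearch client expects, so that requests can fail over when one node is unavailable.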

Testing

  • Comprehensive test cases covering all features
  • Tests run against a real Elasticsearch instance
  • Tests cover both single-node and cluster modes
  • Tests cover HTTPS and authentication
  • Tests cover document CRUD operations

Usage Example

// Configure Elasticsearch
ElasticsearchConfig config = new ElasticsearchConfig();
config.setHost("localhost");          // Host address
config.setPort(9200);                 // Port number
config.setIndex("your-index");        // Index name
config.setScheme("https");            // Connection scheme
config.setUsername("elastic");        // Username
config.setPassword("your-password");  // Password

// Create reader
ElasticsearchDocumentReader reader = new ElasticsearchDocumentReader(config);

// Get documents
List<Document> documents = reader.get();
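
For readers unfamiliar with what the scheme, host, port, username, and password settings amount to at the HTTP level, the following JDK-only sketch derives a base URL and an HTTP Basic `Authorization` header from them. The class BasicAuthSketch and its methods are hypothetical. The actual module presumably delegates this to the Elasticsearch Java client, which this PR description does not show.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch of what the connection settings mean on the wire.
final class BasicAuthSketch {

    // Combines scheme, host, and port into the base URL of a node.
    static String baseUrl(String scheme, String host, int port) {
        return scheme + "://" + host + ":" + port;
    }

    // HTTP Basic authentication: "Basic " + base64("username:password").
    static String basicAuthHeader(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }
}
```

Because Basic credentials are only encoded and not encrypted, pairing them with the "https" scheme as in the example above is what keeps them confidential in transit.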

Dependencies

  • Spring AI Core
  • Elasticsearch Java Client 8.x
  • Jackson for JSON processing

Documentation

  • Detailed README in both English and Chinese
  • Configuration properties documentation
  • Usage examples for different scenarios
  • Security considerations and best practices

Future Improvements

  1. Add support for custom SSL certificates
  2. Add connection pooling configuration
  3. Add support for Elasticsearch template queries
  4. Add support for the scroll API for large result sets
  5. Add support for more query types

Related Issues

  • None

Breaking Changes

  • None (new module)

Collaborator

@yuluo-yx yuluo-yx left a comment

LGTM

@chickenlj chickenlj merged commit 44a0d5c into alibaba:main Feb 7, 2025
4 checks passed
3 participants