[Pull-based Ingestion] Support versioning in pull-based ingestion #17918

Merged
merged 1 commit into from
Apr 15, 2025

@varunbharadwaj varunbharadwaj commented Apr 14, 2025

Description

This PR adds external versioning support to pull-based ingestion so that users can handle out-of-order updates when the streaming source does not guarantee ordering. Pull-based ingestion supports external versioning only: users must maintain versions externally and set the version in the _version field of each message.

  1. Messages with a lower version than the document version in the index are dropped. Partial upserts are not supported in pull-based ingestion, so it is safe to drop a message on a version conflict.
  2. Pull-based ingestion provides guarantees for deletions similar to the push-based approach. Deletes require a greater version to be provided, except once the index.gc_deletes duration has elapsed since the deletion.
  3. If _version is not provided in the message, version validation is disabled for that message.

The user is responsible for sending a version in every message when external versioning is used. If a message carries no version, the document is indexed regardless of the document version found in the index.
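
The rules above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the code in IngestionEngine: the class ExternalVersionSketch, its process method, and the Op and Result enums are hypothetical names. It also assumes that a message whose version equals the stored version is dropped (the description only mentions lower versions), and that an unversioned message simply bumps the stored version by one.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of the versioning rules; not the IngestionEngine code. */
class ExternalVersionSketch {
    enum Op { INDEX, DELETE }
    enum Result { APPLIED, DROPPED }

    /** Last known state of a document id: its version, tombstone flag, and write time. */
    private static final class Entry {
        final long version;
        final boolean deleted;
        final long timestampMillis;

        Entry(long version, boolean deleted, long timestampMillis) {
            this.version = version;
            this.deleted = deleted;
            this.timestampMillis = timestampMillis;
        }
    }

    private final Map<String, Entry> docs = new HashMap<>();
    private final long gcDeletesMillis; // stands in for index.gc_deletes

    ExternalVersionSketch(Duration gcDeletes) {
        this.gcDeletesMillis = gcDeletes.toMillis();
    }

    /** version == null models a message without a _version field. */
    Result process(String id, Op op, Long version, long nowMillis) {
        Entry current = docs.get(id);

        // Once gc_deletes has elapsed, the tombstone and its version are forgotten.
        if (current != null && current.deleted
                && nowMillis - current.timestampMillis >= gcDeletesMillis) {
            docs.remove(id);
            current = null;
        }

        // Versioned message: drop it unless it is newer than what is stored.
        // Assumption: an equal version is treated as a conflict as well.
        if (version != null && current != null && version <= current.version) {
            return Result.DROPPED;
        }

        // Unversioned message: no validation, always applied.
        // Assumption: the stored version is bumped by one in that case.
        long newVersion = version != null
                ? version
                : (current == null ? 1L : current.version + 1);
        docs.put(id, new Entry(newVersion, op == Op.DELETE, nowMillis));
        return Result.APPLIED;
    }

    public static void main(String[] args) {
        ExternalVersionSketch engine = new ExternalVersionSketch(Duration.ofSeconds(60));
        System.out.println(engine.process("doc-1", Op.INDEX, 5L, 0L));    // first write
        System.out.println(engine.process("doc-1", Op.INDEX, 3L, 10L));   // stale update
        System.out.println(engine.process("doc-1", Op.INDEX, null, 20L)); // no _version
    }
}
```

In this model a stale update arriving after a newer one is dropped rather than overwriting the document, which is the out-of-order case the PR targets. Mixing versioned and unversioned messages for the same document bypasses that protection, which is why a version should be sent in every message.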

Related Issues

Resolves #17913

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing labels Apr 14, 2025
@varunbharadwaj varunbharadwaj changed the title [Pull-based Ingestion] Support external versioning in pull-based ingestion [Pull-based Ingestion] Support versioning in pull-based ingestion Apr 14, 2025

❕ Gradle check result for 24a37c7: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

codecov bot commented Apr 14, 2025

Codecov Report

Attention: Patch coverage is 19.27711% with 67 lines in your changes missing coverage. Please review.

Project coverage is 72.48%. Comparing base (471acef) to head (6f9aa2a).
Report is 6 commits behind head on main.

Files with missing lines                                    Patch %   Lines
...a/org/opensearch/index/engine/IngestionEngine.java       17.56%    56 Missing and 5 partials ⚠️
...ndices/pollingingest/MessageProcessorRunnable.java       25.00%    5 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #17918      +/-   ##
============================================
- Coverage     72.52%   72.48%   -0.05%     
+ Complexity    67031    67002      -29     
============================================
  Files          5470     5470              
  Lines        309707   309780      +73     
  Branches      45052    45063      +11     
============================================
- Hits         224617   224536      -81     
- Misses        66774    66869      +95     
- Partials      18316    18375      +59     

@yupeng9 yupeng9 left a comment

overall looks good

❌ Gradle check result for 6f9aa2a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for 6f9aa2a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❕ Gradle check result for 6f9aa2a: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@mch2 mch2 merged commit 1be3e46 into opensearch-project:main Apr 15, 2025
83 of 92 checks passed
Development

Successfully merging this pull request may close these issues.

[Feature Request] Support versioning in pull-based ingestion
3 participants