Skip to content

[improve][broker] Optimize subscription seek (cursor reset) by timestamp #22792

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jan 9, 2025

Conversation

dao-jun
Copy link
Member

@dao-jun dao-jun commented May 28, 2024

Fixes #22129

Motivation

Pulsar uses binary search to find the message by timestamp, it will reduce the number of iterations to find the message, and make it more efficient and faster.

Even though the current implementation is correct, and using binary search to speed-up, but it's still not efficient enough.
The current implementation is to scan all the ledgers to find the message by timestamp.
This is a performance bottleneck, especially for large topics with many messages.
Say, if there is a topic which has 1m entries, through the binary search, it will take 20 iterations to find the message.

In some extreme cases, it may lead to a timeout, and the client will not be able to seeking by timestamp.

The motivation of this PR is to optimize the finding message by timestamp, to make it more efficient and faster.

Modifications

Before search entires, calculate the start, end position by LedgerInfo#timestamp and only search entries in the range to avoid search the entire ledgers.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@dao-jun dao-jun added area/broker ready-to-test category/performance Performance issues fix or improvements labels May 28, 2024
@dao-jun dao-jun added this to the 3.4.0 milestone May 28, 2024
@dao-jun dao-jun requested a review from codelipenghui May 28, 2024 15:17
@dao-jun dao-jun self-assigned this May 28, 2024
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label May 28, 2024
@codecov-commenter
Copy link

codecov-commenter commented May 28, 2024

Codecov Report

Attention: Patch coverage is 77.77778% with 8 lines in your changes missing coverage. Please review.

Project coverage is 75.66%. Comparing base (bbc6224) to head (512fd0d).
Report is 835 commits behind head on master.

Files with missing lines Patch % Lines
...che/bookkeeper/mledger/impl/ManagedCursorImpl.java 75.00% 3 Missing and 2 partials ⚠️
...er/service/persistent/PersistentMessageFinder.java 81.25% 1 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #22792      +/-   ##
============================================
+ Coverage     73.57%   75.66%   +2.09%     
+ Complexity    32624     5895   -26729     
============================================
  Files          1877     1961      +84     
  Lines        139502   160677   +21175     
  Branches      15299    19557    +4258     
============================================
+ Hits         102638   121583   +18945     
- Misses        28908    30314    +1406     
- Partials       7956     8780     +824     
Flag Coverage Δ
inttests 29.57% <0.00%> (+4.99%) ⬆️
systests 25.62% <0.00%> (+1.30%) ⬆️
unittests 75.37% <77.77%> (+2.53%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...a/org/apache/bookkeeper/mledger/ManagedCursor.java 31.57% <ø> (-3.72%) ⬇️
...er/service/persistent/PersistentMessageFinder.java 70.49% <81.25%> (+4.58%) ⬆️
...che/bookkeeper/mledger/impl/ManagedCursorImpl.java 82.03% <75.00%> (+2.74%) ⬆️

... and 703 files with indirect coverage changes

@KannarFr
Copy link
Contributor

Can I help merge this PR?

@dao-jun
Copy link
Member Author

dao-jun commented Jul 18, 2024

Can I help merge this PR?

This PR needs more review

@lhotari lhotari modified the milestones: 4.0.0, 4.1.0 Oct 11, 2024
@lhotari
Copy link
Member

lhotari commented Oct 14, 2024

@dao-jun please rebase

@lhotari lhotari added the triage/lhotari/important lhotari's triaging label for important issues or PRs label Oct 14, 2024
# Conflicts:
#	managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java
#	managed-ledger/src/test/java/org/apache/bookkeeper/mledger/impl/ManagedCursorTest.java
@dao-jun
Copy link
Member Author

dao-jun commented Oct 17, 2024

@lhotari Rebased, please help review when you have time

@lhotari lhotari closed this Oct 17, 2024
@lhotari lhotari reopened this Oct 17, 2024
@lhotari
Copy link
Member

lhotari commented Oct 17, 2024

Closed and reopened to get the change to unblock CI.

@FlorentinDUBOIS
Copy link

Hello folks, do you have any news about this pull request?
We hit this issue on production and we (Clever Cloud) could help to test if necessary.

@dao-jun dao-jun closed this Nov 7, 2024
@dao-jun dao-jun reopened this Nov 7, 2024
@dao-jun
Copy link
Member Author

dao-jun commented Nov 7, 2024

close reopen to trigger the CI checks

@lhotari lhotari changed the title [improve][broker] Optimize seeking by timestamp [improve][broker] Optimize subscription seek (cursor reset) by timestamp Jan 9, 2025
@lhotari lhotari merged commit 2eb4eab into apache:master Jan 9, 2025
54 of 55 checks passed
lhotari added a commit that referenced this pull request Jan 9, 2025
…amp (#22792)

Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 2eb4eab)
@KannarFr
Copy link
Contributor

KannarFr commented Jan 9, 2025

Thanks a lot for the work done here @dao-jun @lhotari!

@dao-jun dao-jun deleted the dev/optimize_seeking branch January 9, 2025 13:21
poorbarcode pushed a commit to poorbarcode/pulsar that referenced this pull request Jan 23, 2025
@lhotari
Copy link
Member

lhotari commented Jan 29, 2025

@dao-jun There's a bug report that might be related to this change. Do you have a chance to check #23910? Thanks

@dao-jun
Copy link
Member Author

dao-jun commented Jan 30, 2025

@dao-jun There's a bug report that might be related to this change. Do you have a chance to check #23910? Thanks

Will do it after a few days~

hanmz pushed a commit to hanmz/pulsar that referenced this pull request Feb 12, 2025
BewareMyPower added a commit to BewareMyPower/pulsar that referenced this pull request Feb 26, 2025
… LTS

- Remove `managedLedgerOffloadReadThreads` from apache#24025
- Remove `managedLedgerMaxReadsInFlightPermitsAcquireTimeoutMillis` and
  `managedLedgerMaxReadsInFlightPermitsAcquireQueueSize` from apache#23901
- Remove `managedLedgerCursorResetLedgerCloseTimestampMaxClockSkewMillis` from apache#22792
The configs above only increase the complexity and are hard to configure.

Add more comments to `managedLedgerPersistIndividualAckAsLongArray` from apache#23759.
This config was added to keep the compatibility from 3.x or earlier so
it has value to retain. 3.x users must configure it with false when
upgrading to 4.0.
poorbarcode pushed a commit that referenced this pull request May 7, 2025
poorbarcode pushed a commit that referenced this pull request May 7, 2025
@poorbarcode
Copy link
Contributor

manas-ctds pushed a commit to datastax/pulsar that referenced this pull request May 7, 2025
…amp (apache#22792)

Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 2eb4eab)
(cherry picked from commit dd4e289)
manas-ctds pushed a commit to datastax/pulsar that referenced this pull request May 7, 2025
…amp (apache#22792)

Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 2eb4eab)
(cherry picked from commit dd4e289)
dao-jun added a commit to ascentstream/pulsar that referenced this pull request May 8, 2025
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request May 12, 2025
…amp (apache#22792)

Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 2eb4eab)
(cherry picked from commit dd4e289)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimise seeking by timestamp
7 participants