[improve][admin] PIP-416: Add a new topic method to implement trigger offload by size threshold #24420

JunFu0814 · 2025-06-17T06:18:46Z

Main Issue: #24276

Motivation

For PIP #24276, add a new admin API to trigger offload with a size threshold.

Modifications

Add new admin APIs to trigger offload with a size threshold.
Optimize formatting.
Enhance testing.

Verifying this change

Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

Added integration tests for end-to-end deployment with large payloads (10MB)
Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

Documentation

doc
doc-required
doc-not-needed
doc-complete

Matching PR in forked repository

PR in forked repository:

github-actions · 2025-06-17T06:19:17Z

@JunFu0814 Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

nodece · 2025-06-18T02:55:43Z

pulsar-client-admin/src/main/java/org/apache/pulsar/client/admin/internal/TopicsImpl.java

+    public CompletableFuture<Void> triggerOffloadAsync(String topic, long sizeThreshold) {
+        CompletableFuture<Void> future = new CompletableFuture<>();
+        try {
+            PersistentTopicInternalStats stats = getInternalStats(topic);


Please use getInternalStatsAsync instead of getInternalStats to avoid thread blocking.

nice suggestion

@nodece please review again bc9d735
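
As a rough illustration of the non-blocking style requested above, the sketch below chains two futures with `thenCompose` instead of waiting on the first result. `fetchSizeAsync` and `offloadAsync` are hypothetical stand-ins for the stats lookup and the offload trigger, not the real admin client calls, and the fixed size of 4096 bytes is made up for the example.

```java
import java.util.concurrent.CompletableFuture;

class AsyncChainSketch {
    /** Hypothetical stand-in for an async stats lookup; returns a topic size in bytes. */
    static CompletableFuture<Long> fetchSizeAsync(String topic) {
        return CompletableFuture.supplyAsync(() -> 4096L);
    }

    /** Hypothetical stand-in for an async offload trigger. */
    static CompletableFuture<Void> offloadAsync(String topic) {
        return CompletableFuture.runAsync(() -> { });
    }

    /** Completes with true if an offload was triggered, false otherwise. */
    static CompletableFuture<Boolean> triggerIfAbove(String topic, long sizeThreshold) {
        // thenCompose runs the second step when the first completes,
        // so the calling thread never blocks on the stats lookup.
        return fetchSizeAsync(topic).thenCompose(size -> {
            if (size <= sizeThreshold) {
                return CompletableFuture.completedFuture(false);
            }
            return offloadAsync(topic).thenApply(ignored -> true);
        });
    }

    public static void main(String[] args) {
        // join() is used only here, at the edge of the program, to print the result.
        System.out.println(triggerIfAbove("persistent://public/default/t", 1024L).join());
    }
}
```

The only blocking call is the `join()` in `main`; a method that returns a `CompletableFuture` would hand the composed future back to its caller instead.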

pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java

codelipenghui · 2025-06-19T02:07:39Z

pulsar-client-admin/src/main/java/org/apache/pulsar/client/admin/internal/TopicsImpl.java

+    private MessageId findFirstLedgerWithinThreshold(List<PersistentTopicInternalStats.LedgerInfo> ledgers,
+                                                     long sizeThreshold) {
+        long suffixSize = 0L;
+
+        ledgers = Lists.reverse(ledgers);
+        long previousLedger = ledgers.get(0).ledgerId;
+        for (PersistentTopicInternalStats.LedgerInfo l : ledgers) {
+            suffixSize += l.size;
+            if (suffixSize > sizeThreshold) {
+                return new MessageIdImpl(previousLedger, 0L, -1);
+            }
+            previousLedger = l.ledgerId;
+        }
+        return null;
+    }


It's better to move to the broker side which can provide consistent behavior from the admin CLI and the admin REST API.

Hi @codelipenghui, moving this to the broker side is the better approach, since it ensures consistent behavior across the CLI and clients in any language, but it will also bring more work. In the current PIP, I will first make sure the CLI and the Java client use the same findFirstLedgerWithinThreshold logic. Please help review ff002be.
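
To make the behavior of the reviewed helper easier to reason about, here is a minimal standalone sketch of the same newest-to-oldest walk. The `Ledger` class is a hypothetical stand-in for `PersistentTopicInternalStats.LedgerInfo`, the method returns an `OptionalLong` ledger id rather than a `MessageIdImpl`, and the empty-list guard is an addition of this sketch (the snippet above calls `ledgers.get(0)` unconditionally).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.OptionalLong;

class SuffixThresholdSketch {
    /** Hypothetical stand-in for PersistentTopicInternalStats.LedgerInfo. */
    static final class Ledger {
        final long ledgerId;
        final long size;

        Ledger(long ledgerId, long size) {
            this.ledgerId = ledgerId;
            this.size = size;
        }
    }

    /**
     * Walks ledgers from newest to oldest, summing sizes. Once the running sum
     * exceeds the threshold, returns the id of the ledger visited just before
     * (or the newest ledger itself if it alone exceeds the threshold).
     */
    static OptionalLong findFirstLedgerWithinThreshold(List<Ledger> oldestFirst, long sizeThreshold) {
        if (oldestFirst.isEmpty()) {
            return OptionalLong.empty(); // guard added in this sketch
        }
        List<Ledger> newestFirst = new ArrayList<>(oldestFirst);
        Collections.reverse(newestFirst);

        long suffixSize = 0L;
        long previousLedger = newestFirst.get(0).ledgerId;
        for (Ledger l : newestFirst) {
            suffixSize += l.size;
            if (suffixSize > sizeThreshold) {
                return OptionalLong.of(previousLedger);
            }
            previousLedger = l.ledgerId;
        }
        return OptionalLong.empty(); // total size never exceeded the threshold
    }

    public static void main(String[] args) {
        List<Ledger> ledgers = List.of(new Ledger(1, 100), new Ledger(2, 100), new Ledger(3, 100));
        System.out.println(findFirstLedgerWithinThreshold(ledgers, 150));
        System.out.println(findFirstLedgerWithinThreshold(ledgers, 1000));
    }
}
```

With three 100-byte ledgers and a threshold of 150, the sum first exceeds the threshold at ledger 2, so the walk reports ledger 3 as the boundary; with a threshold of 1000 nothing is returned.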

> but of course this will also bring more workload

While both approaches require iterating over ledgers, the scope of that iteration differs:

Client-side: You'd have to iterate over all ledgers to determine what needs offloading.

Broker-side: The broker only needs to iterate over the specific ledgers that are being offloaded, directly accessing data already in memory.

Even if the client pre-calculates message IDs and sends them to the broker, the broker still needs to iterate over the ledger map and decide which ledgers should be offloaded. The key is that the broker can do this much more efficiently, leveraging its existing in-memory structures without the constant creation and disposal of new objects.

BTW, the broker side already has most of the implementation

pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java

Lines 2619 to 2696 in 73a4ae4

    private void maybeOffload(long offloadThresholdInBytes, long offloadThresholdInSeconds,
                              CompletableFuture<Position> finalPromise) {
        if (getOffloadPoliciesIfAppendable().isEmpty()) {
            String msg = String.format("[%s] Nothing to offload due to offloader or offloadPolicies is NULL", name);
            finalPromise.completeExceptionally(new IllegalArgumentException(msg));
            return;
        }

        if (offloadThresholdInBytes < 0 && offloadThresholdInSeconds < 0) {
            String msg = String.format("[%s] Nothing to offload due to [managedLedgerOffloadThresholdInBytes] and "
                    + "[managedLedgerOffloadThresholdInSeconds] less than 0.", name);
            finalPromise.completeExceptionally(new IllegalArgumentException(msg));
            return;
        }

        if (!offloadMutex.tryLock()) {
            scheduledExecutor.schedule(() -> maybeOffloadInBackground(finalPromise),
                    100, TimeUnit.MILLISECONDS);
            return;
        }

        CompletableFuture<Position> unlockingPromise = new CompletableFuture<>();
        unlockingPromise.whenComplete((res, ex) -> {
            offloadMutex.unlock();
            if (ex != null) {
                finalPromise.completeExceptionally(ex);
            } else {
                finalPromise.complete(res);
            }
        });

        long sizeSummed = 0;
        long toOffloadSize = 0;
        long alreadyOffloadedSize = 0;
        ConcurrentLinkedDeque<LedgerInfo> toOffload = new ConcurrentLinkedDeque<>();
        final long offloadTimeThresholdMillis = TimeUnit.SECONDS.toMillis(offloadThresholdInSeconds);

        for (Map.Entry<Long, LedgerInfo> e : ledgers.descendingMap().entrySet()) {
            final LedgerInfo info = e.getValue();
            // Skip current active ledger, an active ledger can't be offloaded.
            // Can't `info.getLedgerId() == currentLedger.getId()` here, trigger offloading is before create ledger.
            if (info.getTimestamp() == 0L) {
                continue;
            }

            final long size = info.getSize();
            final long timestamp = info.getTimestamp();
            final long now = System.currentTimeMillis();
            sizeSummed += size;

            final boolean alreadyOffloaded = info.hasOffloadContext() && info.getOffloadContext().getComplete();
            if (alreadyOffloaded) {
                alreadyOffloadedSize += size;
            } else {
                if ((offloadThresholdInBytes >= 0 && sizeSummed > offloadThresholdInBytes)
                        || (offloadTimeThresholdMillis >= 0 && now - timestamp >= offloadTimeThresholdMillis)) {
                    toOffloadSize += size;
                    toOffload.addFirst(info);
                }
            }
        }

        if (toOffload.size() > 0) {
            log.info("[{}] Going to automatically offload ledgers {}"
                            + ", total size = {}, already offloaded = {}, to offload = {}",
                    name, toOffload.stream().map(LedgerInfo::getLedgerId).collect(Collectors.toList()),
                    sizeSummed, alreadyOffloadedSize, toOffloadSize);
            offloadLoop(unlockingPromise, toOffload, PositionFactory.LATEST, Optional.empty());
        } else {
            // offloadLoop will complete immediately with an empty list to offload
            log.debug("[{}] Nothing to offload, total size = {}, already offloaded = {}, "
                            + "threshold = [managedLedgerOffloadThresholdInBytes:{}, "
                            + "managedLedgerOffloadThresholdInSeconds:{}]",
                    name, sizeSummed, alreadyOffloadedSize, offloadThresholdInBytes,
                    TimeUnit.MILLISECONDS.toSeconds(offloadTimeThresholdMillis));
            unlockingPromise.complete(PositionFactory.LATEST);
        }
    }

There is no need for duplicated code such as findFirstLedgerWithinThreshold.
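
For readers comparing the two approaches, the size-based part of the selection loop in `maybeOffload` can be reduced to the standalone sketch below. It is a simplification under stated assumptions: `Ledger` is a hypothetical stand-in for the managed-ledger `LedgerInfo`, the time threshold, locking, and logging are left out, and the method returns ledger ids instead of starting an offload.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class BrokerSideSelectionSketch {
    /** Hypothetical, simplified stand-in for the managed-ledger LedgerInfo. */
    static final class Ledger {
        final long id;
        final long size;
        final long timestamp;   // 0 means the ledger is still active (not closed)
        final boolean offloaded;

        Ledger(long id, long size, long timestamp, boolean offloaded) {
            this.id = id;
            this.size = size;
            this.timestamp = timestamp;
            this.offloaded = offloaded;
        }
    }

    /** Returns the ids of ledgers to offload, ordered oldest first. */
    static List<Long> selectForOffload(List<Ledger> oldestFirst, long thresholdBytes) {
        Deque<Long> toOffload = new ArrayDeque<>();
        long sizeSummed = 0;
        // Walk from the newest ledger to the oldest, like ledgers.descendingMap().
        for (int i = oldestFirst.size() - 1; i >= 0; i--) {
            Ledger l = oldestFirst.get(i);
            if (l.timestamp == 0L) {
                continue; // active ledger: skipped and not counted
            }
            sizeSummed += l.size; // offloaded ledgers still count toward the sum
            if (!l.offloaded && thresholdBytes >= 0 && sizeSummed > thresholdBytes) {
                toOffload.addFirst(l.id); // addFirst keeps the result oldest-first
            }
        }
        return new ArrayList<>(toOffload);
    }

    public static void main(String[] args) {
        List<Ledger> ledgers = List.of(
                new Ledger(1, 100, 1L, false),
                new Ledger(2, 100, 2L, true),
                new Ledger(3, 100, 3L, false),
                new Ledger(4, 50, 0L, false));
        System.out.println(selectForOffload(ledgers, 150));
    }
}
```

Unlike the client-side helper, this walk yields every ledger past the threshold in a single pass and skips ledgers that are active or already offloaded, which is the behavior the review comment suggests reusing.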

@codelipenghui Thank you for your suggestion. I will first study the logic here in depth.


magicfujun added 3 commits April 23, 2025 16:44

[feat][admin] add triggerOffload with sizeThreshold api

34e83b5

[feat][admin] update triggerOffload with sizeThreshold api

a4fef99

Merge branch 'master' into feat-trigger-offload-with-size-threshold

9c0fd30

github-actions bot added the doc-label-missing label Jun 17, 2025

JunFu0814 changed the title ~~[feat][client] clinet apitrigger offload with size threshold~~ [feat][admin] new admin api for trigger offload with size threshold Jun 17, 2025

github-actions bot added doc-not-needed Your PR changes do not impact docs and removed doc-label-missing labels Jun 17, 2025

codelipenghui assigned JunFu0814 Jun 17, 2025

codelipenghui added this to the 4.1.0 milestone Jun 17, 2025

codelipenghui added type/feature The PR added a new feature or issue requested a new feature area/broker area/tieredstorage labels Jun 17, 2025

JunFu0814 removed their assignment Jun 18, 2025

nodece reviewed Jun 18, 2025


nodece changed the title ~~[feat][admin] new admin api for trigger offload with size threshold~~ [improve][admin] PIP-416: Add a new topic method to implement trigger offload by size threshold Jun 18, 2025

nodece assigned lhotari, codelipenghui and dao-jun and unassigned lhotari, codelipenghui and dao-jun Jun 18, 2025

nodece requested review from lhotari, codelipenghui and dao-jun June 18, 2025 02:57

nodece assigned JunFu0814 Jun 18, 2025

codelipenghui requested changes Jun 19, 2025


magicfujun added 3 commits June 19, 2025 17:21

[feat][admin] Optimize API parameter explanation

cac116f

[feat][admin] use getInternalStatsAsync instead of getInternalStats

bc9d735

[feat][admin] Unify findFirstLedgerWithinThreshold logic in cli and a…

ff002be

…dmin

magicfujun added 2 commits June 20, 2025 10:15

[feat][admin] Fix CI test issues

1cf7969

[feat][admin] Fix CI test issues

f7c72ba

JunFu0814 requested a review from nodece June 20, 2025 06:45

dao-jun approved these changes Jun 24, 2025
