[fix][proxy] Fix proxy OOM by replacing TopicName with a simple conversion method #24465

Open
wants to merge 5 commits into
base: master
from

Conversation

BewareMyPower
Contributor

@BewareMyPower BewareMyPower commented Jun 25, 2025

Fixes #24445

Motivation

See #24458 for details. However

After reviewing the usages of TopicName in the pulsar-proxy module, I found that the only use case is converting a user-provided topic to the full topic name. Using TopicName adds a lot of unnecessary overhead, such as parsing a NamespaceName (see the cost from the benchmark here) and maintaining the TopicName object in a global cache that requires thread-safe access via ConcurrentHashMap.

Modifications

Introduce a new method TopicName#toFullTopicName that covers the only use of TopicName in the pulsar-proxy module. Instead of using a global cache, just call this method directly for the conversion.

It does not introduce any new configuration. From the benchmark, it's only about 2.2x slower than TopicName#get when there are only 200k keys, and with 500k keys it's faster than TopicName#get.

However, it saves memory, and the performance should be acceptable since it can be executed about 8.5 million times per second.
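
To illustrate the kind of conversion described above, here is a minimal sketch, not the code in this PR. The class name `FullTopicNameSketch` is hypothetical, the defaults (`persistent`, `public`, `default`) follow Pulsar's documented short-name convention, and the legacy cluster-qualified form is left out for brevity. It builds the full name with plain string operations, so no TopicName or NamespaceName object is created and nothing is cached.

```java
import java.util.Objects;

/**
 * Hypothetical sketch (not the PR's implementation): expands a user-provided
 * topic to "domain://tenant/namespace/topic" using only string operations.
 */
final class FullTopicNameSketch {
    // Assumed defaults, following Pulsar's short-name convention.
    private static final String DEFAULT_DOMAIN = "persistent";
    private static final String DEFAULT_TENANT = "public";
    private static final String DEFAULT_NAMESPACE = "default";

    private FullTopicNameSketch() {
    }

    static String toFullTopicName(String topic) {
        Objects.requireNonNull(topic, "topic");
        int schemeEnd = topic.indexOf("://");
        if (schemeEnd >= 0) {
            // Already qualified: only check the domain and return the input as is.
            String domain = topic.substring(0, schemeEnd);
            if (!domain.equals("persistent") && !domain.equals("non-persistent")) {
                throw new IllegalArgumentException("Unknown domain in topic: " + topic);
            }
            return topic;
        }
        // Limit -1 keeps trailing empty parts so inputs like "a/b/" are rejected.
        String[] parts = topic.split("/", -1);
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Invalid short topic name: " + topic);
            }
        }
        if (parts.length == 1) {
            // "my-topic" -> "persistent://public/default/my-topic"
            return DEFAULT_DOMAIN + "://" + DEFAULT_TENANT + "/" + DEFAULT_NAMESPACE + "/" + topic;
        }
        if (parts.length == 3) {
            // "tenant/namespace/topic" -> "persistent://tenant/namespace/topic"
            return DEFAULT_DOMAIN + "://" + topic;
        }
        // The legacy cluster-qualified form is intentionally not handled in this sketch.
        throw new IllegalArgumentException("Invalid short topic name: " + topic);
    }

    public static void main(String[] args) {
        System.out.println(toFullTopicName("my-topic"));
        System.out.println(toFullTopicName("my-tenant/my-ns/my-topic"));
    }
}
```

Because the method is a pure function of its input, it needs no synchronization and leaves nothing behind on the heap after the call, which is the property the proxy fix relies on.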

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: BewareMyPower#46

Comment on lines 90 to 91
// Maps the topic name from the request to the full topic name
private final Map<String, String> topicNameCache = new HashMap<>();
Contributor

When multiple connections access the same topic, will it cause more heap memory occupation?

Contributor Author

Yes. However, once a connection is disconnected, the cache will be garbage collected. In extreme cases, if many consumers subscribe

How about removing the cache? This conversion is actually very fast. It's only about 2.2x slower than reading from a TopicName cache.

TopicNameBenchmark.testConvert        thrpt       284.114          ops/s
TopicNameBenchmark.testReadFromCache  thrpt       644.718          ops/s

and it can execute 30000*284 operations in 1 second.

Contributor Author

i.e. each operation takes only about 0.12 microseconds. The "space-for-time tradeoff" might not be worth it here.
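
The arithmetic behind these figures can be checked with a short sketch. It assumes, based on the "30000*284" remark above, that one benchmark operation performs 30000 conversions; the class and method names are hypothetical.

```java
import java.util.Locale;

/** Hypothetical helper that reproduces the per-conversion cost estimate. */
final class ConversionCost {
    // Assumption: each benchmark operation performs 30000 conversions.
    static final long CONVERSIONS_PER_OP = 30_000;
    // testConvert throughput reported in the thread, in ops/s.
    static final double OPS_PER_SECOND = 284.114;

    private ConversionCost() {
    }

    static double conversionsPerSecond() {
        return CONVERSIONS_PER_OP * OPS_PER_SECOND;
    }

    static double microsPerConversion() {
        return 1_000_000.0 / conversionsPerSecond();
    }

    public static void main(String[] args) {
        System.out.printf(Locale.ROOT, "%.2f million conversions per second%n",
                conversionsPerSecond() / 1e6);
        System.out.printf(Locale.ROOT, "%.3f microseconds per conversion%n",
                microsPerConversion());
    }
}
```

Under that assumption the result is roughly 8.5 million conversions per second, or about 0.12 microseconds each.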

Contributor Author

I removed the cache and updated the PR description

Contributor Author

@BewareMyPower BewareMyPower Jun 26, 2025

Well, if there are 500k keys, TopicName#get is slower.

TopicNameBenchmark.testConvert        thrpt       5.857          ops/s
TopicNameBenchmark.testReadFromCache  thrpt       5.326          ops/s

Contributor Author

200k keys:

TopicNameBenchmark.testConvert        thrpt       14.706          ops/s
TopicNameBenchmark.testReadFromCache  thrpt       16.639          ops/s
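
To compare the three runs quoted in this thread side by side, a small sketch can compute the throughput of testConvert relative to testReadFromCache. The class name and run labels are hypothetical; the numbers are copied from the results above, and the key count of the first run is not stated in the thread.

```java
import java.util.Locale;

/** Hypothetical helper comparing testConvert against testReadFromCache. */
final class BenchmarkRatios {
    private BenchmarkRatios() {
    }

    /** Returns convert throughput divided by cache-read throughput (above 1 means convert is faster). */
    static double relativeThroughput(double convertOps, double cacheOps) {
        return convertOps / cacheOps;
    }

    public static void main(String[] args) {
        String[] labels = {"initial run", "200k keys", "500k keys"};
        double[] convert = {284.114, 14.706, 5.857};
        double[] cache = {644.718, 16.639, 5.326};
        for (int i = 0; i < labels.length; i++) {
            System.out.printf(Locale.ROOT, "%-12s convert/cache = %.2f%n",
                    labels[i], relativeThroughput(convert[i], cache[i]));
        }
    }
}
```

The ratio rises from about 0.44 in the first run to about 0.88 with 200k keys and about 1.10 with 500k keys, which matches the observation that reading from the TopicName cache loses its advantage as the number of keys grows.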

@BewareMyPower BewareMyPower changed the title [fix][proxy] Fix proxy OOM by using a topic cache per channel [fix][proxy] Fix proxy OOM by replacing TopicName with a simple conversion method Jun 26, 2025
Member

@nodece nodece left a comment

LGTM, I think we can move toFullTopicName to the TopicName class.

@BewareMyPower
Contributor Author

I think we can move toFullTopicName to the TopicName class.

Yes. But let's keep it in TopicNameUtils for now in case there are some conflicts.

@BewareMyPower
Contributor Author

I changed my mind and moved the method to TopicName because it duplicates a method in #24463.

@codecov-commenter

Codecov Report

Attention: Patch coverage is 80.76923% with 10 lines in your changes missing coverage. Please review.

Project coverage is 74.28%. Comparing base (bbc6224) to head (83d9e82).
Report is 1167 commits behind head on master.

Files with missing lines Patch % Lines
...ava/org/apache/pulsar/common/naming/TopicName.java 80.48% 4 Missing and 4 partials ⚠️
...rg/apache/pulsar/common/api/raw/MessageParser.java 66.66% 1 Missing ⚠️
...apache/pulsar/proxy/server/LookupProxyHandler.java 66.66% 1 Missing ⚠️
Additional details and impacted files

@@             Coverage Diff              @@
##             master   #24465      +/-   ##
============================================
+ Coverage     73.57%   74.28%   +0.71%     
- Complexity    32624    32765     +141     
============================================
  Files          1877     1868       -9     
  Lines        139502   145547    +6045     
  Branches      15299    16657    +1358     
============================================
+ Hits         102638   108126    +5488     
+ Misses        28908    28863      -45     
- Partials       7956     8558     +602     
Flag Coverage Δ
inttests 26.87% <26.92%> (+2.28%) ⬆️
systests 23.35% <26.92%> (-0.98%) ⬇️
unittests 73.77% <80.76%> (+0.93%) ⬆️

Files with missing lines Coverage Δ
...apache/pulsar/proxy/server/ParserProxyHandler.java 91.95% <100.00%> (+0.09%) ⬆️
...rg/apache/pulsar/common/api/raw/MessageParser.java 52.94% <66.66%> (+1.42%) ⬆️
...apache/pulsar/proxy/server/LookupProxyHandler.java 60.41% <66.66%> (-0.18%) ⬇️
...ava/org/apache/pulsar/common/naming/TopicName.java 91.90% <80.48%> (-4.16%) ⬇️

... and 1085 files with indirect coverage changes

Labels
  • doc-not-needed (Your PR changes do not impact docs)
  • release/3.0.13
  • release/3.3.8
  • release/4.0.6
  • type/bug (The PR fixed a bug or issue reported a bug)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Memory leak in Pulsar Proxy with TopicName
4 participants