[fix][proxy] Fix proxy OOM by replacing TopicName with a simple conversion method #24465
// Maps the topic name from the request to the full topic name
private final Map<String, String> topicNameCache = new HashMap<>();
When multiple connections access the same topic, will this cause more heap memory usage?
Yes. However, once a connection is disconnected, the cache will be garbage collected. In extreme cases, if many consumers subscribe
How about removing the cache? This conversion is actually very fast. It's only about 2.2x slower than reading from a TopicName cache.
TopicNameBenchmark.testConvert thrpt 284.114 ops/s
TopicNameBenchmark.testReadFromCache thrpt 644.718 ops/s
So it can execute 30000*284 (about 8.5 million) operations in 1 second.

i.e. each operation takes only about 0.12 microseconds. The "space-for-time tradeoff" might not be worth it here.
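As a rough check of the arithmetic above, here is a minimal sketch. It assumes, as the 30000*284 figure suggests, that each benchmark invocation performs 30000 conversions. The class and method names are hypothetical and not part of the PR.

```java
/** Hypothetical helper for checking the throughput arithmetic; not part of the PR. */
final class ThroughputMath {
    private ThroughputMath() {
    }

    /**
     * Converts benchmark throughput (invocations per second) into conversions
     * per second, assuming a fixed number of conversions per invocation.
     */
    static double conversionsPerSecond(double invocationsPerSecond, long conversionsPerInvocation) {
        return invocationsPerSecond * conversionsPerInvocation;
    }

    /** Average time of a single conversion, in microseconds. */
    static double microsPerConversion(double conversionsPerSecond) {
        return 1_000_000.0 / conversionsPerSecond;
    }
}
```

With the reported 284.114 ops/s and the assumed 30000 conversions per invocation, this gives roughly 8.5 million conversions per second, or about 0.12 microseconds each.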
I removed the cache and updated the PR description
200k keys:
TopicNameBenchmark.testConvert thrpt 14.706 ops/s
TopicNameBenchmark.testReadFromCache thrpt 16.639 ops/s
LGTM, I think we can move toFullTopicName to the TopicName class.
Yes. But let's keep it in |
I changed my idea and moved the method to |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #24465      +/-   ##
============================================
+ Coverage     73.57%   74.28%   +0.71%
- Complexity    32624    32765     +141
============================================
  Files          1877     1868       -9
  Lines        139502   145547    +6045
  Branches      15299    16657    +1358
============================================
+ Hits         102638   108126    +5488
+ Misses        28908    28863      -45
- Partials       7956     8558     +602
Fixes #24445
Motivation
See #24458 for details. However
After reviewing the usages of TopicName in the pulsar-proxy module, I found that the only use case is to convert a user-provided topic to the full topic name. Using TopicName has a lot of unnecessary overhead, like parsing a NamespaceName (see the cost from the benchmark here), or maintaining the TopicName object in a global cache that requires thread-safe access via ConcurrentHashMap.
Modifications
Introduce a new method TopicName#toFullTopicName to serve the purpose that TopicName had in the pulsar-proxy module. Instead of using a global cache, just call this method directly for the conversion.
It does not introduce any new configuration. From the benchmark, it's only 2.2x slower than TopicName#get, when there are only 200k keys. But it's faster than TopicName#get.
However, it saves memory, and the performance should be acceptable: it can be executed about 8.5 million times in 1 second.
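To illustrate the kind of cache-free conversion described above, here is a minimal sketch. It is not the code from this PR: the class name is hypothetical, and the rules are an assumption based on Pulsar's usual short-name conventions, where a bare name maps to persistent://public/default/<name> and tenant/namespace/name gets the persistent:// prefix.

```java
import java.util.Objects;

/** Hypothetical sketch of a stateless topic name conversion; not the PR's implementation. */
final class TopicNameSketch {
    // Assumed defaults applied to short topic names.
    private static final String DEFAULT_DOMAIN = "persistent";
    private static final String DEFAULT_TENANT = "public";
    private static final String DEFAULT_NAMESPACE = "default";

    private TopicNameSketch() {
    }

    /**
     * Converts a user-provided topic into a full topic name without creating
     * intermediate name objects or touching any shared cache.
     */
    static String toFullTopicName(String topic) {
        Objects.requireNonNull(topic, "topic");
        if (topic.contains("://")) {
            // Already carries a domain prefix, so return it unchanged.
            return topic;
        }
        // A limit of -1 keeps trailing empty parts, so "a/b/" is rejected below.
        String[] parts = topic.split("/", -1);
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Invalid short topic name: " + topic);
            }
        }
        switch (parts.length) {
            case 1:
                // Bare local name: add the default domain, tenant and namespace.
                return DEFAULT_DOMAIN + "://" + DEFAULT_TENANT + "/" + DEFAULT_NAMESPACE + "/" + topic;
            case 3:
                // tenant/namespace/name: only the domain is missing.
                return DEFAULT_DOMAIN + "://" + topic;
            default:
                throw new IllegalArgumentException("Invalid short topic name: " + topic);
        }
    }
}
```

Because the method keeps no state, there is nothing to evict when a connection closes. Each call costs a few string operations and leaves nothing behind in a cache.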
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: BewareMyPower#46