Use ByteBuffer's for serialization/deserialization. #356
Conversation
static final ByteBuffer OP_PING = ByteBuffer.wrap(new byte[] { 'P', 'I', 'N', 'G' });
static final ByteBuffer OP_PONG = ByteBuffer.wrap(new byte[] { 'P', 'O', 'N', 'G' });
static final ByteBuffer OP_OK = ByteBuffer.wrap(new byte[] { '+', 'O', 'K' });
static final ByteBuffer OP_ERR = ByteBuffer.wrap(new byte[] { '-', 'E', 'R', 'R' });
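One subtlety with shared `static final ByteBuffer` constants like these is that a `ByteBuffer` carries mutable position/limit state, so any code that consumes them relatively must copy that state first. A minimal sketch (class and method names here are illustrative, not from the PR):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class OpBuffers {
    // Shared protocol constant, as in the PR.
    static final ByteBuffer OP_PING = ByteBuffer.wrap(new byte[] { 'P', 'I', 'N', 'G' });

    // put(src) advances src's position, which would permanently drain a shared
    // constant. duplicate() shares the underlying bytes but copies the
    // position/limit, so the constant stays reusable.
    static void writeOp(ByteBuffer dst, ByteBuffer op) {
        dst.put(op.duplicate());
    }

    public static void main(String[] args) {
        ByteBuffer out = ByteBuffer.allocate(8);
        writeOp(out, OP_PING);
        writeOp(out, OP_PING); // works a second time: the constant was never consumed
        out.flip();
        System.out.println(StandardCharsets.US_ASCII.decode(out)); // PINGPING
    }
}
```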
I don't know how important it is to squeeze every bit of performance out of initialization, and I don't know how much difference there would be, if any, but it might be more readable to just do:
static final ByteBuffer OP_ERR = ByteBuffer.wrap("-ERR".getBytes());
I did it this way mostly because it's pretty obvious from the variable names what they are, and because I wasn't sure there wouldn't be unexpected performance issues from allocation weirdness when calling getBytes(). I've seen instances where ByteBuffers get sized oddly when converting from strings, since Java strings are essentially UTF-16 rather than ASCII/UTF-8 (which is what we need here). I figured this version was more obviously correct, since it entirely avoids Java's UTF-16 strings and so isn't prone to those side effects.
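For what it's worth, the sizing concern can be sidestepped by passing an explicit charset to `getBytes`, which encodes eagerly and avoids the platform-default charset of the no-arg overload. A quick sketch showing both spellings produce identical bytes:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiConstants {
    public static void main(String[] args) {
        // Explicit byte array, as in the PR:
        byte[] manual = new byte[] { '-', 'E', 'R', 'R' };

        // String literal with an explicit charset: encoding happens here, once,
        // and US_ASCII (or UTF_8, identical for these characters) guarantees
        // one byte per character regardless of the JVM's default charset.
        byte[] encoded = "-ERR".getBytes(StandardCharsets.US_ASCII);

        System.out.println(Arrays.equals(manual, encoded)); // true
        ByteBuffer opErr = ByteBuffer.wrap(encoded);
        System.out.println(opErr.remaining()); // 4
    }
}
```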
I see. Better safe than sorry!
Yeah, I've tried to aggressively minimize the use of strings wherever feasible, especially in the lower level serialization/deserialization code, since string conversions are relatively expensive operations that need to be minimized for performance. Strings are also just difficult to work with in low level Java networking code, since they make accurate buffer sizing difficult, which is why I'm trying to have everything normalized to ByteBuffers internally as soon as feasible.
@@ -34,36 +33,25 @@
     NatsMessage next; // for linked list

     // Create a message to publish
-    NatsMessage(String subject, String replyTo, ByteBuffer data, boolean utf8mode) {
+    NatsMessage(ByteBuffer subject, ByteBuffer replyTo, ByteBuffer data, boolean utf8mode) {
Since everything is already in buffers by this point, can the utf8mode flag be completely removed from NatsMessage?
Probably. I left it there since I wasn't sure whether this constructor was part of the public API, though.
While I was waiting for this to be resolved (my work directly depends on it), I took the time to build a rudimentary benchmark. My concern was that the buffer change adds complexity that outweighs the benefits, and I believe the test supports that opinion.

The test news up 3 NatsMessages, which basically just does a little string or buffer manipulation; I don't do anything with the messages except create them. The 3 creates were: 1) a protocol-only message, 2) a publish message without utf8 mode, and 3) a publish message with utf8 mode. I do that because the String version has a slightly different pathway for the publish modes: the buffer version does an encoding in both modes, while the string version uses builders, since I had already written that code.

I ran those tests 2 billion times each, on my laptop, with nothing else running, and I tried to write the code to time only the create path. It's not perfect, but I think it supports my point: in this case, buffers actually seem slower. The raw numbers were:
The buffers version of the benchmark:
The string/builders version:
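The raw numbers themselves aren't reproduced above. For readers who want to replicate the experiment, a minimal create-only timing harness in the spirit described might look like the following; this is a hypothetical stand-in (NatsMessage itself isn't shown here), and a serious measurement would use JMH rather than hand-rolled nanoTime loops:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CreateBench {
    // Stand-in for message construction; the real benchmark new-ed NatsMessage
    // three ways (protocol-only, publish, publish + utf8mode).
    static ByteBuffer buildProtocolLine(String subject) {
        return ByteBuffer.wrap(("PUB " + subject + " 0\r\n").getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        final int iterations = 1_000_000; // the comment above used 2 billion
        long sink = 0;                    // consume each result so the JIT can't drop the work
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += buildProtocolLine("foo.bar").remaining();
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("iterations=" + iterations + " elapsedMs=" + elapsedMs + " sink=" + sink);
    }
}
```

Note the `sink` accumulator: without observing the created object, the JIT is free to eliminate the allocation entirely, which is one common way hand-rolled benchmarks mislead.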
That's expected because I moved a lot of serialization/format conversions out of
Yes, agreed. The test is far from perfect and does not take all your changes into account, so maybe we should build a better benchmark. But it still seems to support not adding complexity until we know it's really worth it.
This, while somewhat more verbose, does make it significantly easier to size buffers properly, which is rather important for optimizing a lot of the low level message handling code. It also allows us to eliminate redundant codepaths in
    this.gatherProtocol(bytesRead);
} else {
    this.gatherMessageProtocol(bytesRead);
}
This is eliminated.
    CharBuffer buff = StandardCharsets.UTF_8.decode(protocolBuffer);
    protocolLength = buff.remaining();
    buff.get(this.msgLineChars, 0, protocolLength);
}
This as well.
    }
} else {
    return UNKNOWN_OP;
}
Also was able to eliminate this scary function.
So let's pretend we fix the benchmark I wrote so that it accounts for the new code. And let's say that instead of being 3 minutes slower, the buffer version now runs X minutes faster.
I don't really consider this an overall complexity increase, but more of a complexity shift: it moves a large amount of the string message serialization/deserialization complexity out of the lower level layers and into higher layers. The idea is that this significantly decreases complexity around certain areas in order to enable further refactoring and optimizations such as using

Fundamentally, at their lowest level, network protocols are byte oriented, not string oriented. I can see why this change may look more complex, as a lot of the

I should also add that one of the goals for this change is to make it possible to add additional public APIs that allow the nats library to be used without any string encoding/decoding operations whatsoever on certain hot paths. To make that possible without a large amount of code duplication, we need to move all string encoding/decoding out of the low level connection reader/writer code.

In terms of use case, a lot of what I intend to use nats for is load balanced services that act as protocol converters/proxies: they ingest a nats message, do some conversion/filtering, and forward the request on to other services that may not speak nats protocol directly. For example, this would be used for gateway services acting as high performance legacy protocol conversion proxies, wrapping legacy backend services that in many cases cannot easily be made to speak nats protocol directly. To interface with these legacy network protocols efficiently and with low latency, we want to avoid any unnecessary string encoding operations and pass bytes/buffers directly from one network socket to the other, with both protocols using non-blocking IO and without encoding/decoding any part of the nats message to/from strings.
Using a buffer passing approach here should make it significantly easier to heavily optimize these performance critical codepaths, as Java's async non-blocking IO support is largely built around
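The buffer pass-through idea can be sketched with `java.nio` channels. This is an illustrative toy, not code from the PR: a `Pipe` stands in for the pair of sockets, and the helper name is made up.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

public class BufferPassThrough {
    // Push a buffer through a channel pair and report how many bytes arrived.
    // No String decode ever happens on the forwarding path: the same bytes go
    // from the inbound channel's buffer straight into the outbound channel.
    static int forward(ByteBuffer msg) throws IOException {
        Pipe pipe = Pipe.open(); // stands in for two sockets in a real proxy
        pipe.sink().write(msg);

        ByteBuffer received = ByteBuffer.allocate(1024);
        int n = pipe.source().read(received);
        pipe.sink().close();
        pipe.source().close();
        return n;
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer msg = ByteBuffer.wrap(
                "MSG subject 1 5\r\nhello\r\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println("bytes forwarded: " + forward(msg));
    }
}
```

In a real gateway the channels would be non-blocking `SocketChannel`s registered with a `Selector`, but the core property is the same: the payload stays in `ByteBuffer`s end to end.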
This set of changes is not really intended to significantly improve performance by itself, but rather to act as groundwork for further refactoring that will. It's probably a bit early to micro-benchmark performance, as further refactoring will be needed to see a significant benefit. That can be done more incrementally, I think, while this set of changes was much more difficult to split up, as intermixing
I now have SSL working in #357, which is based off of this.
I wanted to give an update on this PR. There are many PRs that will need to be merged at about the same time, so I would like to take a methodical approach. #362 and #354 contain missing functionality that is delaying POCs of JetStream, so these should be merged first. At that point I'd like to cut a beta release to unblock those users, and then look at byte buffers (this PR and #364). Optimization is very important to us, but it's more important to unblock our users waiting to use JetStream, with a minimum of risk. This PR will get reviewed and merged; we just need to get the headers and the JetStream functionality in first. I apologize in advance for the rebase that will need to happen, but from a product perspective we need to prioritize this way. Thanks again for all of this work, and please be patient as we move forward with merging these PRs.
Yeah, I'm going over and refactoring the headers PR a bit at the moment to make this rebase easier.
FYI - #354 will likely change quite a bit.
I have done quite a bit of work on the client, used ByteBuffers in some places, and wrapped them with ByteArrayBuilders. I have also done some profiling. We have some places in the client where we need to improve performance, but string manipulation is extremely low on the list of places that profile poorly. Closing this PR.
Some more groundwork for #347.