THRIFT-5862: Validate the message size at the endpoint transport only #3127

dengzhhu653 · 2025-04-22T00:51:57Z

Did you create an Apache Jira ticket? (Request account here, not required for trivial changes)
If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
Did you squash your changes to a single commit? (not required, but preferred)
Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
If your change does not involve any code, include [skip ci] anywhere in the commit message to free up build resources.

Jens-G · 2025-04-22T08:29:59Z

Would love to hear the reason for this. Really.

dengzhhu653 · 2025-04-22T09:36:48Z

-1 Would love to hear the reason for this. Really.

Hi @Jens-G, mainly two reasons:

In a chain of transports, every transport(the transport extends TEndpointTransport, except the TLayeredTransport) updates and checks the knownMessageSize/remainingMessageSize based on the bytes fed, this is not straightforward and confused, which one should take effect, what if a transport updates the max message size via TTransport#getConfiguration.setMaxMessageSize.
For a service, we should always have an endpoint transport which gets fed from external source in a chain, it's more reasonable to validate the limit here instead of all the downstream transports.

For a chain which has only TSocket, this configured limit doesn't take effect, and no clear way to mark the boundary of the message, i.e, assumption on TTransport#flush to set the message end, though this is the check on the message read.

Jens-G · 2025-04-22T20:57:08Z

@ all: Opinions?

Jens-G · 2025-04-29T20:48:09Z

I still have problems with this one. Some layered transports do implement specific limits, the most prominent one is TFramedTransport. If we don't want to check that anymore, the whole point of the validation gets lost.

Another example would be compression implemented as a layered transport. Endpoint raeds a 42 KB message which all of a sudden produces petabytes during decompression.

this configured limit doesn't take effect, and no clear way to mark the boundary of the message,

That's why we introduce artificial limits to validate against. If the limit is (just as an example) 10 MB then reading a message of 10 MB plus one byte should fail with the appropriate error message, no matter how much more data the socket would be able to deliver. This is by design.

dengzhhu653 · 2025-04-29T23:19:25Z

Thank you @Jens-G! Here is my understanding, please correct me if wrong.
The TFramedTransport takes in innerTransport as the source transport, the message limit will be checked at the source innerTransport(If the innerTransport is a TEndpointTransport, otherwise the limit check would be propagated to the source of innerTransport, continue...), this TFramedTransport will validate the frame limit only. For the frame limit we can get from the innerTransport's configuration if any, otherwise default, it's missing in current PR.

Another example would be compression implemented as a layered transport

This is a good case. TZlibTransport for example, it still checks the message limit after the de-compression. we still allow the user to implement their own message limit based on their needs, for example:

thrift/lib/java/src/test/java/org/apache/thrift/protocol/ProtocolTestBase.java

Lines 563 to 567 in 05b0659

    
           public int read(byte[] buf, int off, int len) throws TTransportException { 
        
             int iBytesRead = bufferTrans.read(buf, off, len); 
        
             consumeReadMessageBytes(iBytesRead); 
        
             return iBytesRead; 
        
           }

no matter how much more data the socket would be able to deliver. This is by design.

An issue I can think of is how to distinguish different messages a socket might send, a Hive client can re-use the same socket for future uses, either for a different request or polling the request's status from the server.

Jens-G · 2025-05-01T20:30:24Z

For the frame limit we can get from the innerTransport's configuration if any, otherwise default, it's missing in current PR.

I was thinking about the frame size delivered with the message. That one is usually way lower than the allowed limit. That's what I meant by

Some layered transports do implement specific limits

Jens-G · 2025-05-01T20:53:35Z

for example:

Your example is broken. If someone passes in int.Max for len the function will try to read it w/o complaint. Exactly that should not happen. Asking consumeReadMessageBytes() for validation after you already did the read() it is rather pointless, because the test is intended to happen before we even try to read.

While I appreciate the efforts you put into this, I'm still not convinced (a) we should do this at all and (b) with the quality of the patch. If there are good arguments for a patch that let me believe in it and convince me that my doubts are unfounded, I am perfectly fine with it. But so far, you only managed to increased the latter.

dengzhhu653 · 2025-05-05T03:42:55Z

While I was trying to debug a problem, I found every transport had knownMessageSize and remainingMessageSize, in this example, TSaslServerTransport, TSocket and TMemoryInputTransport:

When "MaxMessageSize reached", only remainingMessageSize in TMemoryInputTransport was reaching the limit, the stack:

org.apache.thrift.transport.TTransportException: MaxMessageSize reached
at org.apache.thrift.transport.TEndpointTransport.countConsumedMessageBytes(TEndpointTransport.java:96)
at org.apache.thrift.transport.TMemoryInputTransport.read(TMemoryInputTransport.java:97)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:390)
at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:44)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:109)
at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:417)
at org.apache.thrift.protocol.TBinaryProtocol.readString(TBinaryProtocol.java:411)
at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.read(StorageDescriptor.java:1289)

Looks it's of no use to maintain the knownMessageSize and remainingMessageSize in the TSaslServerTransport, TSocket, so I'm thinking of how to improve and trim the validation from the chain. Instead of putting the limit check on the top of the chain(in reality we don't know who will at the top), I try to put it at the bottom(the TSocket) as this PR suggests.

the test is intended to happen before we even try to read.

yeah, this happens on the read on a string or a container as far as I know, which we have some checks on them on the protocol lawyer.

THRIFT-5862: Validate the message size at the endpoint transport only

05b0659

Jens-G added the java Pull requests that update Java code label Apr 23, 2025

dengzhhu653 mentioned this pull request Apr 29, 2025

THRIFT-5863: Add method for customizing the TConfiguration in TServer… #3131

Open

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

THRIFT-5862: Validate the message size at the endpoint transport only #3127

THRIFT-5862: Validate the message size at the endpoint transport only #3127

dengzhhu653 commented Apr 22, 2025 •

edited

Loading

Jens-G commented Apr 22, 2025 •

edited

Loading

dengzhhu653 commented Apr 22, 2025 •

edited

Loading

Jens-G commented Apr 22, 2025 •

edited

Loading

Jens-G commented Apr 29, 2025 •

edited

Loading

dengzhhu653 commented Apr 29, 2025 •

edited

Loading

Jens-G commented May 1, 2025 •

edited

Loading

Jens-G commented May 1, 2025

dengzhhu653 commented May 5, 2025

THRIFT-5862: Validate the message size at the endpoint transport only #3127

Are you sure you want to change the base?

THRIFT-5862: Validate the message size at the endpoint transport only #3127

Conversation

dengzhhu653 commented Apr 22, 2025 • edited Loading

Jens-G commented Apr 22, 2025 • edited Loading

dengzhhu653 commented Apr 22, 2025 • edited Loading

Jens-G commented Apr 22, 2025 • edited Loading

Jens-G commented Apr 29, 2025 • edited Loading

dengzhhu653 commented Apr 29, 2025 • edited Loading

Jens-G commented May 1, 2025 • edited Loading

Jens-G commented May 1, 2025

dengzhhu653 commented May 5, 2025

dengzhhu653 commented Apr 22, 2025 •

edited

Loading

Jens-G commented Apr 22, 2025 •

edited

Loading

dengzhhu653 commented Apr 22, 2025 •

edited

Loading

Jens-G commented Apr 22, 2025 •

edited

Loading

Jens-G commented Apr 29, 2025 •

edited

Loading

dengzhhu653 commented Apr 29, 2025 •

edited

Loading

Jens-G commented May 1, 2025 •

edited

Loading