
fix(java): Use (long, long, byte) key for MetaStringBytes cache to prevent collisions #2308


Merged: 20 commits, Jun 7, 2025

Conversation

LouisLou2
Contributor

What does this PR do?

This PR fixes a ClassCastException during deserialization caused by an incorrect MetaStringBytes being retrieved from the cache. The previous cache key (long, long) in MetaStringResolver (for small meta strings) did not include the string's encoding byte, so different strings (e.g., "aclass" and "Aclass") that produce the same two longs under different encodings collided on the same cache entry.

Changes:

  1. Introduced LongLongByteMap: A new map using a (long k1, long k2, byte k3) key.
  2. Updated MetaStringResolver: Modified relevant methods (e.g., createSmallMetaStringBytes, readSmallMetaStringBytes) to use LongLongByteMap, incorporating the encoding byte into the cache key for small meta strings.
  3. Renamed Test: LongLongMapTest.java renamed to LongLongByteMapTest.java and updated to test the new map.
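
To illustrate why adding the encoding byte to the key removes the collision, here is a minimal sketch. It is not the PR's code: it uses a plain HashMap with a small composite key class rather than the new LongLongByteMap, and the encoder, the encoding ids (ALL_LOWER, FIRST_UPPER), and all helper names are hypothetical stand-ins chosen only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;

class MetaStringKeyDemo {
  // Hypothetical encoding ids for this sketch; not the project's real constants.
  static final byte ALL_LOWER = 0;
  static final byte FIRST_UPPER = 1;

  /** Composite (long k1, long k2, byte k3) key with value-based equality. */
  static final class Key {
    final long k1;
    final long k2;
    final byte k3;

    Key(long k1, long k2, byte k3) {
      this.k1 = k1;
      this.k2 = k2;
      this.k3 = k3;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key other = (Key) o;
      return k1 == other.k1 && k2 == other.k2 && k3 == other.k3;
    }

    @Override
    public int hashCode() {
      return Objects.hash(k1, k2, k3);
    }
  }

  /** Toy encoder: lower-cases the string and packs up to 16 ASCII bytes into two longs. */
  static long[] pack(String s) {
    byte[] bytes = s.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.US_ASCII);
    long[] packed = new long[2];
    for (int i = 0; i < bytes.length && i < 16; i++) {
      packed[i / 8] |= (bytes[i] & 0xFFL) << (8 * (i % 8));
    }
    return packed;
  }

  /** Toy encoding choice: records whether the first character was upper case. */
  static byte encodingOf(String s) {
    return Character.isUpperCase(s.charAt(0)) ? FIRST_UPPER : ALL_LOWER;
  }

  /** Builds the cache key; with includeEncoding == false the byte is always 0. */
  static Key keyFor(String s, boolean includeEncoding) {
    long[] p = pack(s);
    byte k3 = includeEncoding ? encodingOf(s) : 0;
    return new Key(p[0], p[1], k3);
  }

  /** Caches `stored`, then returns whatever the cache yields for `probe`. */
  static String cacheThenLookup(String stored, String probe, boolean includeEncoding) {
    Map<Key, String> cache = new HashMap<>();
    cache.put(keyFor(stored, includeEncoding), stored);
    return cache.get(keyFor(probe, includeEncoding));
  }

  public static void main(String[] args) {
    // Two-long key only: "Aclass" wrongly hits the entry cached for "aclass".
    System.out.println(cacheThenLookup("aclass", "Aclass", false));
    // With the encoding byte in the key, the lookup misses and returns null.
    System.out.println(cacheThenLookup("aclass", "Aclass", true));
  }
}
```

In this toy setup both strings pack to the same two longs, so only the third key component tells them apart. A primitive-keyed map such as the PR's LongLongByteMap presumably serves the same purpose without allocating a key object per lookup.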

Related issues

Does this PR introduce any user-facing change?

  • Does this PR introduce any public API change?
  • Does this PR introduce any binary protocol compatibility change?

Benchmark

@LouisLou2 LouisLou2 requested a review from chaokunyang as a code owner June 7, 2025 05:53
@LouisLou2 LouisLou2 changed the title from "fix(java): Use (long, long, byte) key for MetaStringBytes cache to prevent collisions`" to "fix(java): Use (long, long, byte) key for MetaStringBytes cache to prevent collisions" Jun 7, 2025
Collaborator

@chaokunyang chaokunyang left a comment


Looks great!

@chaokunyang chaokunyang merged commit e832804 into apache:main Jun 7, 2025
50 checks passed
chaokunyang pushed a commit to chaokunyang/fury that referenced this pull request Jun 7, 2025
…event collisions (apache#2308)

## Related issues

- apache#2307 

chaokunyang pushed a commit that referenced this pull request Jun 7, 2025
…event collisions (#2308)
