chore: Fury header add language field #1612

Merged
merged 3 commits into from
May 8, 2024
7 changes: 4 additions & 3 deletions docs/specification/xlang_serialization_spec.md
@@ -139,16 +139,17 @@ Fury will write the byte order for that object into the data instead of converti
The Fury header starts with the following fields:

```
- | 2 bytes | 4 bits | 1 bit | 1 bit | 1 bit | 1 bit | optional 4 bytes |
- +--------------+---------------+-------+-------+--------+-------+------------------------------------+
- | magic number | reserved bits | oob | xlang | endian | null | unsigned int for meta start offset |
+ | 2 bytes | 4 bits | 1 bit | 1 bit | 1 bit | 1 bit | 1 byte | optional 4 bytes |
+ +--------------+---------------+-------+-------+--------+-------+------------+------------------------------------+
+ | magic number | reserved bits | oob | xlang | endian | null | language | unsigned int for meta start offset |
```

- magic number: used to identify the Fury serialization protocol; the current version uses `0x62d4`.
- null flag: 1 when object is null, 0 otherwise. If an object is null, other bits won't be set.
- endian flag: 1 when data is encoded by little endian, 0 for big endian.
- xlang flag: 1 when serialization uses xlang format, 0 when serialization uses Fury java format.
- oob flag: 1 when passed `BufferCallback` is not null, 0 otherwise.
- language: the language that serialized the object, such as JAVA, PYTHON, GO, etc. Fury can use this field to decide whether to spend more time on serialization in order to make deserialization faster for dynamic languages.

If meta share mode is enabled, an uncompressed unsigned int is appended to indicate the start offset of metadata.
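
The header layout above can be sketched in Python as follows. This is a minimal illustration, not the Fury implementation: the bit positions (null as the least significant bit, then endian, xlang, oob), the little-endian encoding of the magic number and the meta start offset, and the helper names `pack_header` and `parse_header` are all assumptions made for this example.

```python
import struct
from typing import NamedTuple, Optional

MAGIC_NUMBER = 0x62D4

# Assumed bit positions: the table is read right to left, starting from
# the least significant bit of the flag byte.
NULL_FLAG = 1 << 0
LITTLE_ENDIAN_FLAG = 1 << 1
XLANG_FLAG = 1 << 2
OOB_FLAG = 1 << 3


class Header(NamedTuple):
    is_null: bool
    little_endian: bool
    xlang: bool
    oob: bool
    language: int
    meta_offset: Optional[int]


def pack_header(language, *, is_null=False, xlang=True, oob=False, meta_offset=None):
    """Hypothetical helper: build magic number, flag byte, language byte, optional offset."""
    if is_null:
        # Per the spec text, no other bits are set when the object is null.
        flags = NULL_FLAG
    else:
        # This sketch always writes little-endian data (an assumption).
        flags = LITTLE_ENDIAN_FLAG
        if xlang:
            flags |= XLANG_FLAG
        if oob:
            flags |= OOB_FLAG
    data = struct.pack("<HBB", MAGIC_NUMBER, flags, language)
    if meta_offset is not None:
        # Only present when meta share mode is enabled.
        data += struct.pack("<I", meta_offset)
    return data


def parse_header(data, *, meta_share=False):
    """Hypothetical helper: decode the fields written by pack_header."""
    magic, flags, language = struct.unpack_from("<HBB", data, 0)
    if magic != MAGIC_NUMBER:
        raise ValueError(f"unexpected magic number: {magic:#06x}")
    meta_offset = None
    if meta_share:
        (meta_offset,) = struct.unpack_from("<I", data, 4)
    return Header(
        is_null=bool(flags & NULL_FLAG),
        little_endian=bool(flags & LITTLE_ENDIAN_FLAG),
        xlang=bool(flags & XLANG_FLAG),
        oob=bool(flags & OOB_FLAG),
        language=language,
        meta_offset=meta_offset,
    )
```

With these assumptions, the fixed part of the header is four bytes (magic number, flags, language), and the meta start offset adds four more when meta share mode is on.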

3 changes: 3 additions & 0 deletions go/fury/fury.go
@@ -80,7 +80,10 @@ const (
XLANG Language = iota
JAVA
PYTHON
CPP
GO
JAVASCRIPT
RUST
)

const (
@@ -26,4 +26,6 @@ public enum Language {
PYTHON,
CPP,
GO,
JAVASCRIPT,
RUST,
}
2 changes: 1 addition & 1 deletion javascript/packages/fury/lib/fury.ts
@@ -111,7 +111,7 @@ export default class {
bitmap |= ConfigFlags.isCrossLanguageFlag;
this.binaryWriter.int16(MAGIC_NUMBER);
this.binaryWriter.uint8(bitmap);
- this.binaryWriter.uint8(Language.XLANG);
+ this.binaryWriter.uint8(Language.JAVASCRIPT);
const cursor = this.binaryWriter.getCursor();
this.binaryWriter.skip(4); // preserve 4-byte for nativeObjects start offsets.
this.binaryWriter.uint32(0); // nativeObjects length.
2 changes: 2 additions & 0 deletions javascript/packages/fury/lib/type.ts
@@ -115,6 +115,8 @@ export enum Language {
PYTHON = 2,
CPP = 3,
GO = 4,
JAVASCRIPT = 5,
RUST = 6,
}

export const MAGIC_NUMBER = 0x62D4;
2 changes: 2 additions & 0 deletions python/pyfury/_fury.py
@@ -586,6 +586,8 @@ class Language(enum.Enum):
PYTHON = 2
CPP = 3
GO = 4
JAVA_SCRIPT = 5
RUST = 6


@dataclass
5 changes: 1 addition & 4 deletions rust/fury/src/deserializer.rs
@@ -241,10 +241,7 @@ impl<'de, 'bf: 'de> DeserializerState<'de, 'bf> {

fn head(&mut self) -> Result<(), Error> {
let _bitmap = self.reader.u8();
- let language: Language = self.reader.u8().try_into()?;
- if Language::XLANG != language {
- return Err(Error::UnsupportLanguage { language });
- }
+ let _language: Language = self.reader.u8().try_into()?;
self.reader.skip(8); // native offset and size
Ok(())
}
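
The effect of this change on the reader can be sketched in Python: the language byte is still validated against the known language ids, but a value other than XLANG is no longer rejected. The `read_language` helper is hypothetical. The enum values follow the ids shown in this PR, with `JAVASCRIPT` spelled as in the TypeScript enum rather than as `JAVA_SCRIPT` in `pyfury`.

```python
import enum


class Language(enum.IntEnum):
    # Ids as listed in the enums touched by this PR.
    XLANG = 0
    JAVA = 1
    PYTHON = 2
    CPP = 3
    GO = 4
    JAVASCRIPT = 5
    RUST = 6


def read_language(byte: int) -> Language:
    """Hypothetical helper: accept any known language id, reject unknown ones."""
    try:
        return Language(byte)
    except ValueError:
        raise ValueError(f"unsupported language id: {byte}") from None
```

A reader built this way accepts payloads written by any of the listed languages and fails only on ids it does not recognize.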
2 changes: 1 addition & 1 deletion rust/fury/src/serializer.rs
@@ -347,7 +347,7 @@ impl<'de> SerializerState<'de> {
bitmap |= config_flags::IS_LITTLE_ENDIAN_FLAG;
bitmap |= config_flags::IS_CROSS_LANGUAGE_FLAG;
self.writer.u8(bitmap);
- self.writer.u8(Language::XLANG as u8);
+ self.writer.u8(Language::RUST as u8);
self.writer.skip(4); // native offset
self.writer.skip(4); // native size
self