Implement UniqueId #256

Dekkonot · 2023-03-24T06:14:13Z

Implements the UniqueId datatype. This has three main components:

- implementing UniqueId as a type in rbx_types
- implementing (de)serializing in rbx_binary
- implementing (de)serializing in rbx_xml

The implementation in rbx_binary is unusual in that it manually performs interleaving for the datatype, but I feel comfortable doing it because the other options aren't desirable and this is the only instance of it in the codebase. I can change it if we really want, but I'd prefer we didn't have a read_interleaved_u128 function.
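For readers unfamiliar with the term, here is a minimal sketch of what byte interleaving means for 16-byte values. The function names `interleave_bytes_16` and `deinterleave_bytes_16` are hypothetical and are not taken from rbx_binary; this only illustrates the layout (byte 0 of every value, then byte 1 of every value, and so on), not the code in this PR.

```rust
/// Lays out `values` column by column: byte 0 of every value first,
/// then byte 1 of every value, and so on up to byte 15.
/// (Hypothetical helper, for illustration only.)
fn interleave_bytes_16(values: &[[u8; 16]]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * 16);
    for byte_index in 0..16 {
        for value in values {
            out.push(value[byte_index]);
        }
    }
    out
}

/// Inverse of `interleave_bytes_16`. Returns `None` when the input length
/// is not a multiple of 16, since it cannot hold whole values.
/// (Hypothetical helper, for illustration only.)
fn deinterleave_bytes_16(data: &[u8]) -> Option<Vec<[u8; 16]>> {
    if data.len() % 16 != 0 {
        return None;
    }
    let count = data.len() / 16;
    let mut values = vec![[0u8; 16]; count];
    for (i, value) in values.iter_mut().enumerate() {
        for (j, byte) in value.iter_mut().enumerate() {
            // Byte `j` of value `i` lives in column `j`, row `i`.
            *byte = data[i + count * j];
        }
    }
    Some(values)
}
```

With two values, the packed buffer alternates between them one byte at a time, and deinterleaving the packed buffer gives back the original values.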

Additionally, I've made the call that the XML representation of UniqueId's time field is more sane than the binary one. This is under the assumption that Roblox is not writing negative timestamps, since that's nonsense. This should be effectively invisible to anyone who isn't manually checking timestamps.

Currently, we do not have any properties in the database that use UniqueId. This means this code will probably only run in the background of deserializing files. It does, however, fix an annoying "unknown property type" warning in Rojo and related tools.

This relies on an upstream change to rbx-test-files I haven't merged yet but exists on my fork. Provided the tests pass and nobody has serious concerns with it, I'll get it merged and fix any issues it causes. Update: This change has been merged upstream and commit 10daba0 updated the submodule to point to the commit in rbx-test-files.

Dekkonot · 2023-03-24T06:39:02Z

...Oh the misery. Give me a bit.

Dekkonot · 2023-03-24T07:36:53Z

I've gone ahead and given up on resolving that conflict. I'm just gonna force push this and we can deal with the consequences. The commits were made in the same order and whatnot but it's either this or I reopen the PR and that sucks.

kennethloeffler · 2023-05-03

> I'd prefer we didn't have a read_interleaved_u128 function.

Disagree, I don't want interleaving code in the serializer and deserializer states. I want read_interleaved_bytes_16 and write_interleaved_bytes_16 in the core, something like:

// They could maybe use different names... 🤷
fn read_interleaved_bytes_16(&mut self, out: &mut [[u8; 16]]) -> io::Result<()> {
    let len = out.len();
    let mut buffer = vec![0; len * mem::size_of::<[u8; 16]>()];
    self.read_exact(&mut buffer)?;

    for (i, array) in out.iter_mut().enumerate() {
        for (j, byte) in array.iter_mut().enumerate() {
            *byte = buffer[i + len * j];
        }
    }

    Ok(())
}

// ...

fn write_interleaved_bytes_16<I>(&mut self, values: I) -> io::Result<()>
where
    I: Iterator<Item = [u8; 16]>,
{
    let values: Vec<_> = values.collect();

    for i in 0..16 {
        for value in values.iter() {
            self.write_u8(value[i])?;
        }
    }

    Ok(())
}

I think they're easier to understand and follow more simply from existing code. It's true that they'll only be used in a few places, but I can't see the harm in keeping them separate. I dunno about you, but sometimes I really, really like dumb, predictable code.

I also think that whatever generic methods we end up writing if/when we genericize the interleaving implementation will be very similar to these, so they'll be a good jumping off point.

Since I'm being nitpicky I'm adding suggestions incorporating these 👇👇👇

rbx_binary/src/deserializer/state.rs

rbx_binary/src/serializer/state.rs

rbx_binary/src/text_deserializer.rs

Dekkonot · 2023-05-04T16:42:08Z

> Disagree, I don't want interleaving code in the serializer and deserializer states. I want read_interleaved_bytes_16 and write_interleaved_bytes_16 in the core, something like:

I suppose I didn't consider having a function to read X interleaved bytes; that does make a lot more sense. It seems really practical to implement it as a generic function now though, since it's not a significant change. Here's my proposal for reading, as an example:

fn read_interleaved_bytes<const N: usize>(&mut self, output: &mut [[u8; N]]) -> io::Result<()> {
    let len = output.len();
    let mut buffer = vec![0; len * mem::size_of::<[u8; N]>()];
    self.read_exact(&mut buffer)?;

    for (i, array) in output.into_iter().enumerate() {
        for (j, byte) in array.iter_mut().enumerate() {
            *byte = buffer[i + len * j];
        }
    }

    Ok(())
}

It might not be necessary at this time, but there's not really a downside to doing it like this beyond calls being read_interleaved_bytes::<16> instead of read_interleaved_bytes_16 which I think looks nicer anyway. The upside is that it's easy to swap over to using it for other functions if we want and it significantly improves their readability.
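As a rough, self-contained sketch of how a const-generic reader like this could be called: the trait `ReadInterleaved` and the helper `read_two_u32s` below are hypothetical names introduced for illustration, and the blanket impl over `std::io::Read` is an assumption rather than how rbx_binary exposes the method.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical extension trait wrapping the proposed method so it can be
/// tried on any `Read` implementor. Not the trait used in rbx_binary.
trait ReadInterleaved: Read {
    fn read_interleaved_bytes<const N: usize>(
        &mut self,
        output: &mut [[u8; N]],
    ) -> io::Result<()> {
        let len = output.len();
        let mut buffer = vec![0u8; len * N];
        self.read_exact(&mut buffer)?;

        for (i, array) in output.iter_mut().enumerate() {
            for (j, byte) in array.iter_mut().enumerate() {
                // Column `j` holds byte `j` of every value.
                *byte = buffer[i + len * j];
            }
        }

        Ok(())
    }
}

impl<R: Read> ReadInterleaved for R {}

/// Reads two interleaved big-endian u32 values, spelling out N = 4 with a
/// turbofish. (Hypothetical helper, for illustration only.)
fn read_two_u32s(data: &[u8]) -> io::Result<[u32; 2]> {
    let mut raw = [[0u8; 4]; 2];
    Cursor::new(data).read_interleaved_bytes::<4>(&mut raw)?;
    Ok([u32::from_be_bytes(raw[0]), u32::from_be_bytes(raw[1])])
}
```

The same method body serves 4-byte, 8-byte, and 16-byte values; only the `N` at the call site changes. Input that is too short surfaces as an `io::Error` from `read_exact`.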

We would go from this:

fn read_interleaved_i64_array(&mut self, output: &mut [i64]) -> io::Result<()> {
    let mut buf = vec![0; output.len() * mem::size_of::<i64>()];
    self.read_exact(&mut buf)?;

    for i in 0..output.len() {
        let z0 = buf[i] as i64;
        let z1 = buf[i + output.len()] as i64;
        let z2 = buf[i + output.len() * 2] as i64;
        let z3 = buf[i + output.len() * 3] as i64;
        let z4 = buf[i + output.len() * 4] as i64;
        let z5 = buf[i + output.len() * 5] as i64;
        let z6 = buf[i + output.len() * 6] as i64;
        let z7 = buf[i + output.len() * 7] as i64;

        output[i] = untransform_i64(
            (z0 << 56)
                | (z1 << 48)
                | (z2 << 40)
                | (z3 << 32)
                | (z4 << 24)
                | (z5 << 16)
                | (z6 << 8)
                | z7,
        );
    }

    Ok(())
}

to this:

fn read_interleaved_i64_array(&mut self, output: &mut [i64]) -> io::Result<()> {
    let mut buffer = vec![[0; mem::size_of::<i64>()]; output.len()];
    self.read_interleaved_bytes(&mut buffer)?;

    for (i, bytes) in buffer.into_iter().enumerate() {
        output[i] = untransform_i64(i64::from_be_bytes(bytes));
    }

    Ok(())
}
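To check that the two versions assemble the same integers, here is a small sketch comparing the shift-based decoding with the `from_be_bytes` approach on an in-memory buffer. All three function names are hypothetical, and the `untransform_i64` step is deliberately left out so that only the byte assembly is compared.

```rust
/// Interleaves the big-endian bytes of each value (test helper,
/// hypothetical name).
fn interleave_i64s(values: &[i64]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * 8);
    for j in 0..8 {
        for value in values {
            out.push(value.to_be_bytes()[j]);
        }
    }
    out
}

/// Mirrors the "before" version: shift each byte into place by hand.
/// Trailing bytes beyond a multiple of 8 are ignored.
fn decode_with_shifts(buf: &[u8]) -> Vec<i64> {
    let len = buf.len() / 8;
    (0..len)
        .map(|i| (0..8).fold(0i64, |acc, j| (acc << 8) | buf[i + len * j] as i64))
        .collect()
}

/// Mirrors the "after" version: deinterleave into 8-byte arrays, then let
/// `i64::from_be_bytes` do the assembly.
fn decode_with_from_be_bytes(buf: &[u8]) -> Vec<i64> {
    let len = buf.len() / 8;
    let mut arrays = vec![[0u8; 8]; len];
    for (i, array) in arrays.iter_mut().enumerate() {
        for (j, byte) in array.iter_mut().enumerate() {
            *byte = buf[i + len * j];
        }
    }
    arrays.into_iter().map(i64::from_be_bytes).collect()
}
```

Both decoders treat the first interleaved column as the most significant byte, which is why `from_be_bytes` (rather than `from_le_bytes`) is the matching conversion.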

kennethloeffler · 2023-05-05T00:28:20Z

I suppose I didn't consider having a function to read X interleaved bytes; that does make a lot more sense. It seems really practical to implement it as a generic function now though, since it's not a significant change. Here's my proposal for reading, as an example: [...] It might not be necessary at this time, but there's not really a downside to doing it like this beyond calls being read_interleaved_bytes::<16> instead of read_interleaved_bytes_16 which I think looks nicer anyway. The upside is that it's easy to swap over to using it for other functions if we want and it significantly improves their readability.

I think it's fine if they're written as generic functions right now, but let's hold any refactors for another PR.

Dekkonot · 2023-05-08T16:57:39Z

...I don't even know how to address the fact that the `time` crate has bumped its MSRV: we don't depend on it directly, so we can't downgrade it. In love with the fact that they did that with a patch version bump and that this is evidently their policy, so it'll be a problem forever. :-)

With regards to the test failure on stable, that's something I'll address Soon™, but the failure is a patch file not being applied and nothing serious.

kennethloeffler · 2023-05-08T19:35:50Z

rbx_binary/src/deserializer/state.rs

+                VariantType::UniqueId => {
+                    let n = type_info.referents.len();
+                    let mut values = vec![[0; 16]; n];
+                    chunk.read_interleaved_bytes::<16>(&mut values)?;


Suggested change

- chunk.read_interleaved_bytes::<16>(&mut values)?;
+ chunk.read_interleaved_bytes(&mut values)?;

Rust can infer the type

Dekkonot · 2023-05-09

I left it in deliberately. While yes, we should in general favor eliding generic arguments and type annotations, I don't think we get the normal benefit (readability) out of doing that here, and manually specifying what the argument is meant to be guards against silent type errors in the future.
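A small sketch of the trade-off being discussed, using a hypothetical stand-in function `bytes_per_value` in place of the real reader: with inference, the const parameter silently follows whatever the buffer type happens to be, while an explicit turbofish pins it.

```rust
/// Stand-in for a const-generic reader: reports the width it was
/// instantiated with. (Hypothetical, for illustration only.)
fn bytes_per_value<const N: usize>(_values: &[[u8; N]]) -> usize {
    N
}

fn inferred_width() -> usize {
    // Imagine this buffer was accidentally changed from 16 to 8 bytes wide.
    let values = vec![[0u8; 8]; 4];
    // N is inferred as 8, so the mistake compiles without complaint.
    bytes_per_value(&values[..])
}

fn explicit_width() -> usize {
    let values = vec![[0u8; 16]; 4];
    // Changing the buffer above to `[0u8; 8]` would turn this line into a
    // compile-time type mismatch instead of silently changing N.
    bytes_per_value::<16>(&values[..])
}
```

Either style produces the same code when the buffer type is right; the explicit form only adds a compile-time check on the width.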

kennethloeffler · 2023-05-08T19:36:13Z

rbx_binary/src/serializer/state.rs

+                            }
+                        }
+
+                        chunk.write_interleaved_bytes::<16>(&blobs)?;


Suggested change

- chunk.write_interleaved_bytes::<16>(&blobs)?;
+ chunk.write_interleaved_bytes(&blobs)?;

kennethloeffler · 2023-05-08T20:04:48Z

> ...I don't even know how to address the fact that the `time` crate has bumped its MSRV: we don't depend on it directly, so we can't downgrade it. In love with the fact that they did that with a patch version bump and that this is evidently their policy, so it'll be a problem forever. :-)

This is annoying but it just means we'll have to bump our MSRV in lockstep with time. Kinda blows, but there are worse fates...

Otherwise this is looking good. I just want to see the serde stuff and also some more tests for the current functionality of the UniqueId type.

In the past, Roblox serialized `Font` properties as empty tags. Although they've stopped doing that, there are still models in the wild with that format for tags. This adds support for empty Font tags by just returning the default value for `Font` when the tag is empty. Relies on rojo-rbx/rbx-test-files#19.

@Kampfkarren

As requested by @Kampfkarren, I've made a fairly straightforward list of things that could result in a breaking of `rbx-dom` without any code of ours changing. This was compiled off the top of my head and I may be forgetting something. I've also taken some liberties here, so feedback on the layout or inclusion of something is welcome. Rendered compatibility document is [here](https://github.com/Dekkonot/rbx-dom/blob/compatibility-doc/docs/compatibility.md).

At the moment, block quotes are quite frankly unreasonably large on the doc site. This patches the CSS for it to make their text equivalent to an H3 instead of being larger than H2. See PR for before and after comparison.

Dekkonot · 2023-05-10T16:22:18Z

...That wasn't what I meant to do with merging master into this branch but I guess we'll just have to deal with it.

git log lied to me.

Dekkonot · 2023-05-10T17:29:47Z

The MSRV issue will be resolved when #249 is merged since it stems from generate_reflection depending on tiny_http v0.11. It dropped that dependency in v0.12 which is what rbx_reflector depends on.

Dekkonot · 2023-05-18T17:05:36Z

Seeing as I messed this one up pretty badly I've just gone ahead and reopened this. See linked PR.

Dekkonot requested a review from LPGhatguy as a code owner March 24, 2023 06:14

Dekkonot added 4 commits March 24, 2023 00:21

Implement UniqueId as a type

ceb9aab

Add UniqueId support for rbx_binary

00862e3

Implement UniqueId for rbx_xml

2652351

Enable yaml feature in rbx_xml

b7cae0d

Dekkonot force-pushed the uniqueid-impl branch from 8ea1722 to 66c748c Compare March 24, 2023 07:37

Dekkonot added 3 commits March 24, 2023 00:37

Update test files

66c748c

Derive Eq for UniqueId

c007c23

Update changelogs for rbx_types, rbx_xml, and rbx_binary

16e5eed

Update test-files

10daba0

Dekkonot requested review from Kampfkarren and removed request for LPGhatguy April 29, 2023 19:54

kennethloeffler requested changes May 3, 2023

rbx_binary/src/deserializer/state.rs (outdated)

rbx_binary/src/serializer/state.rs (outdated)

rbx_binary/src/text_deserializer.rs (outdated)

rbx_binary/src/text_deserializer.rs (outdated)

Dekkonot added 3 commits May 8, 2023 09:28

Implement generic interleaving read/write methods

d8ab1be

Swap (de)serialization of UniqueId to use methods for (de)interleaving

ceea328

Update test-files to what it's meant to be

f544844

kennethloeffler reviewed May 8, 2023

Dekkonot and others added 6 commits May 10, 2023 09:15

Implement serde manually for UniqueId

64b7076

Actually use UniqueId in text deserializer

18f8dba

Update snapshots with new serde implementation

acbcb93

Document ZSTD compression and Bytecode data type in binary spec file (r…

0aad9e2

…ojo-rbx#258) * Clean up chunk section + document zstd compression * Document Bytecode data type * List Bytecode in README * Add Bytecode type to table of contents * Remove quotes around `ZSTD "frame"`

Release rbx_types v1.5.0

84d1dfa

Update database and release rbx_reflection_database 0.2.6+roblox-572

8255445

LPGhatguy and others added 9 commits May 10, 2023 09:20

Release rbx_binary v0.7.0

7d4cdc9

Release rbx_xml v0.13.0

4841077

Fill out XML spec document (rojo-rbx#247)

f24c432

Filled out the spec file for the XML file format since it's long overdue. Intentionally does not document `QDir` or `QFont` and ignores debugger-specific elements.

Patch MaterialService.Use2022Materials (rojo-rbx#259)

39e6bfb

Closes rojo-rbx#243.

Update reflection database to v573

11a93df

Include latest database and patches in snapshots

eca438c

Dekkonot mentioned this pull request May 18, 2023

Implement UniqueId #271

Merged

Dekkonot closed this May 18, 2023

Dekkonot deleted the uniqueid-impl branch July 20, 2024 20:35
