Skip to content

Implement UniqueId #256

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 26 commits into from
Closed

Implement UniqueId #256

wants to merge 26 commits into from

Conversation

Dekkonot
Copy link
Member

@Dekkonot Dekkonot commented Mar 24, 2023

Implements the UniqueId datatype. This has three main components:

  • implementing UniqueId as a type in rbx_types
  • implementing (de)serializing in rbx_binary
  • implementing (de)serializing in rbx_xml

The implementation in rbx_binary is unusual in that it manually performs interleaving for the datatype, but I feel comfortable doing it because the other options aren't desirable and this is the only instance of it in the codebase. I can change it if we really want, but I'd prefer we didn't have a read_interleaved_u128 function.

Additionally, I've made the call that the XML representation of UniqueId's time field is more sane than the binary one. This is under the assumption that Roblox is not writing negative timestamps, since that's nonsense. This should be effectively invisible to anyone who isn't manually checking timestamps.

Currently, we do not have any properties in the database that utilize UniqueId. This means this code will probably only run in the background of deserializing files. It does however fix an annoying "unknown property type" warning in Rojo and related tools.


This relies on an upstream change to rbx-test-files I haven't merged yet but exists on my fork. Provided the tests pass and nobody has serious concerns with it, I'll get it merged and fix any issues it causes. Update: This change has been merged upstream and commit 10daba0 updated the submodule to point to the commit in rbx-test-files.

@Dekkonot Dekkonot requested a review from LPGhatguy as a code owner March 24, 2023 06:14
@Dekkonot
Copy link
Member Author

...Oh the misery. Give me a bit.

@Dekkonot
Copy link
Member Author

Dekkonot commented Mar 24, 2023

I've gone ahead and given up on resolving that conflict. I'm just gonna force push this and we can deal with the consequences. The commits were made in the same order and whatnot but it's either this or I reopen the PR and that sucks.

@bondhemlig

This comment was marked as off-topic.

@Dekkonot Dekkonot requested review from Kampfkarren and removed request for LPGhatguy April 29, 2023 19:54
Copy link
Member

@kennethloeffler kennethloeffler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer we didn't have a read_interleaved_u128 function.

Disagree, I don't want interleaving code in the serializer and deserializer states. I want read_interleaved_bytes_16 and write_interleaved_bytes_16 in the core, something like:

// They could maybe use different names... 🤷
    fn read_interleaved_bytes_16(&mut self, out: &mut [[u8; 16]]) -> io::Result<()> {
        let len = out.len();
        let mut buffer = vec![0; len * mem::size_of::<[u8; 16]>()];
        self.read_exact(&mut buffer)?;

        for (i, array) in out.into_iter().enumerate() {
            for (j, byte) in array.iter_mut().enumerate() {
                *byte = buffer[i + len * j];
            }
        }

        Ok(())

    }

// ...

    fn write_interleaved_bytes_16<I>(&mut self, values: I) -> io::Result<()>
    where
        I: Iterator<Item = [u8; 16]>,
    {
        let values: Vec<_> = values.collect();

        for i in 0..16 {
            for value in values.iter() {
                self.write_u8(value[i])?;
            }
        }

        Ok(())
    }

I think they're easier to understand and follow more simply from existing code. It's true that they'll only be used in few places, but I can't see the harm in keeping them separate. I dunno about you, but sometimes I really really like dumb, predictable code.

I also think that whatever generic methods we end up writing if/when we genericize the interleaving implementation will be very similar to these, so they'll be a good jumping off point.

Since I'm being nitpicky I'm adding suggestions incorporating these 👇👇👇

@Dekkonot
Copy link
Member Author

Dekkonot commented May 4, 2023

Disagree, I don't want interleaving code in the serializer and deserializer states. I want read_interleaved_bytes_16 and write_interleaved_bytes_16 in the core, something like:

I suppose I didn't consider having a function to read X interleaved bytes; that does make a lot more sense. It seems really practical to implement it as a generic function now though, since it's not a significant change. Here's my proposal for reading, as an example:

fn read_interleaved_bytes<const N: usize>(&mut self, output: &mut [[u8; N]]) -> io::Result<()> {
    let len = output.len();
    let mut buffer = vec![0; len * mem::size_of::<[u8; N]>()];
    self.read_exact(&mut buffer)?;

    for (i, array) in output.into_iter().enumerate() {
        for (j, byte) in array.iter_mut().enumerate() {
            *byte = buffer[i + len * j];
        }
    }

    Ok(())
}

It might not be necessary at this time, but there's not really a downside to doing it like this beyond calls being read_interleaved_bytes::<16> instead of read_interleaved_bytes_16 which I think looks nicer anyway. The upside is that it's easy to swap over to using it for other functions if we want and it significantly improves their readability.

We would go from this:

fn read_interleaved_i64_array(&mut self, output: &mut [i64]) -> io::Result<()> {
    let mut buf = vec![0; output.len() * mem::size_of::<i64>()];
    self.read_exact(&mut buf)?;

    for i in 0..output.len() {
        let z0 = buf[i] as i64;
        let z1 = buf[i + output.len()] as i64;
        let z2 = buf[i + output.len() * 2] as i64;
        let z3 = buf[i + output.len() * 3] as i64;
        let z4 = buf[i + output.len() * 4] as i64;
        let z5 = buf[i + output.len() * 5] as i64;
        let z6 = buf[i + output.len() * 6] as i64;
        let z7 = buf[i + output.len() * 7] as i64;

        output[i] = untransform_i64(
            (z0 << 56)
                | (z1 << 48)
                | (z2 << 40)
                | (z3 << 32)
                | (z4 << 24)
                | (z5 << 16)
                | (z6 << 8)
                | z7,
        );
    }

    Ok(())
}

to this:

fn read_interleaved_i64_array(&mut self, output: &mut [i64]) -> io::Result<()> {
    let mut buffer = vec![[0; mem::size_of::<i64>()]; output.len()];
    self.read_interleaved_bytes(&mut buffer)?;

    for (i, bytes) in buffer.into_iter().enumerate() {
        output[i] = untransform_i64(i64::from_be_bytes(bytes));
    }

    Ok(())
}

@kennethloeffler
Copy link
Member

I suppose I didn't consider having a function to read X interleaved bytes; that does make a lot more sense. It seems really practical to implement it as a generic function now though, since it's not a significant change. Here's my proposal for reading, as an example: [...] It might not be necessary at this time, but there's not really a downside to doing it like this beyond calls being read_interleaved_bytes::<16> instead of read_interleaved_bytes_16 which I think looks nicer anyway. The upside is that it's easy to swap over to using it for other functions if we want and it significantly improves their readability.

I think it's fine if they're written as generic function right now but let's hold any refactors for another PR

@Dekkonot
Copy link
Member Author

Dekkonot commented May 8, 2023

...I don't even know how to address the fact that time has bumped their MSRV because we don't depend upon it so we can't downgrade it. In love with the fact that they did that with a patch version bump and that this is evidently their policy so it'll be a problem forever. :-)

With regards to the test failure in stable, that's something that I'll address Soon:tm: but the failure is a patch file not being applied and nothing serious.

VariantType::UniqueId => {
let n = type_info.referents.len();
let mut values = vec![[0; 16]; n];
chunk.read_interleaved_bytes::<16>(&mut values)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
chunk.read_interleaved_bytes::<16>(&mut values)?;
chunk.read_interleaved_bytes(&mut values)?;

Rust can infer the type

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left it in deliberately. While yes, we should in general favor eliding generic arguments and type annotations, I don't think we get the normal benefit (readability) out of doing that here and it guards against silent type errors in the future if we manually specify what the argument is meant to be.

}
}

chunk.write_interleaved_bytes::<16>(&blobs)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
chunk.write_interleaved_bytes::<16>(&blobs)?;
chunk.write_interleaved_bytes(&blobs)?;

@kennethloeffler
Copy link
Member

...I don't even know how to address the fact that time has bumped their MSRV because we don't depend upon it so we can't downgrade it. In love with the fact that they did that with a patch version bump and that this is evidently their policy so it'll be a problem forever. :-)

This is annoying but it just means we'll have to bump our MSRV in lockstep with time. Kinda blows, but there are worse fates...

Otherwise this is looking good, I just want to see the serde stuff and also some more tests for the current functionality of the UniqueId type

LPGhatguy and others added 9 commits May 10, 2023 09:20
In the past, Roblox serialized `Font` properties as empty tags. Although
they've stopped doing that, there are still models in the wild with that
format for tags. This adds support for empty Font tags by just returning
the default value for `Font` when the tag is empty.

Relies on rojo-rbx/rbx-test-files#19.
Filled out the spec file for the XML file format since it's long overdue.

Intentionally does not document `QDir` or `QFont` and ignores debugger-specific elements.
As requested by @Kampfkarren, I've made a fairly straightforward list of
things that could result in a breaking of `rbx-dom` without any code of
ours changing.

This was compiled off the top of my head and I may be forgetting
something. I've also taken some liberties here, so feedback on the
layout or inclusion of something is welcome.

Rendered compatibility document is
[here](https://github.com/Dekkonot/rbx-dom/blob/compatibility-doc/docs/compatibility.md).
At the moment, block quotes are quite frankly unreasonably large on the
doc site. This patches the CSS for it to make their text equivalent to
an H3 instead of being larger than H2.

See PR for before and after comparison.
@Dekkonot
Copy link
Member Author

Dekkonot commented May 10, 2023

...That wasn't what I meant to do with merging master into this branch but I guess we'll just have to deal with it.

git log lied to me.

image

@Dekkonot
Copy link
Member Author

Dekkonot commented May 10, 2023

The MSRV issue will be resolved when #249 is merged since it stems from generate_reflection depending on tiny_http v0.11. It dropped that dependency in v0.12 which is what rbx_reflector depends on.

@Dekkonot Dekkonot mentioned this pull request May 18, 2023
@Dekkonot
Copy link
Member Author

Seeing as I messed this one up pretty badly I've just gone ahead and reopened this. See linked PR.

@Dekkonot Dekkonot closed this May 18, 2023
@Dekkonot Dekkonot deleted the uniqueid-impl branch July 20, 2024 20:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants