Add .rrd files to `git lfs` to test backwards compatibility #9110

emilk · 2025-02-22T10:21:12Z

We want to have automated tests to make sure we have backwards compatible chunk loaders.

Ideally we should add .rrd files to git lfs for all previous versions of all components and archetypes.

What to test?

- They load without error/warning
- Passes rerun rrd verify
  - Add rerun rrd verify #9128
- Produce the same rerun print output?
  - Maybe just the column headers (names, metadata)?
  - That's a useful sanity check when we update the .rrd:s in the test set: the new ones should cover approximately the same things
- Produce the same visuals? Difficult to get right
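
The first two checks could be driven by a small script along these lines. This is only a sketch: `find_rrd_files` and `verify_all` are hypothetical names, and it assumes that `rerun rrd verify` exits with a non-zero status when a file fails to load, which should be confirmed against the actual CLI.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Optional, Sequence


def find_rrd_files(root: Path) -> List[Path]:
    """Return all .rrd files under `root`, sorted for stable output."""
    return sorted(root.rglob("*.rrd"))


def verify_all(
    root: Path,
    run: Optional[Callable[[Sequence[str]], int]] = None,
) -> List[Path]:
    """Run `rerun rrd verify <file>` on every .rrd and return the failures."""
    if run is None:
        # Assumption: a non-zero exit code means the verification failed.
        def run(cmd: Sequence[str]) -> int:
            return subprocess.run(list(cmd)).returncode

    failed = []
    for path in find_rrd_files(root):
        if run(["rerun", "rrd", "verify", str(path)]) != 0:
            failed.append(path)
    return failed
```

The `run` parameter exists only so the loop can be exercised without the `rerun` binary on the path; in CI it would be left at its default.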

What .rrd files?

- Snippets?
- Examples?
- One .rrd file containing all known components/archetypes/datatypes?

When and how do we add more .rrd files to the test set?

What if something does break?

We do not yet promise full backwards compatibility, but any CI failure regarding this should be a red flag, and we should only accept it (and upload a new .rrd file) if we are really ok with this breaking change.

What remains to do?

- Decide on exactly what .rrd files to test
- Automatically verify that all snippets and examples etc have .rrd:s that are checked in
- Add .rrds under folders with version string
- Compare all versions of old .rrd files (e.g. from 0.22, 0.23, 0.24…)
- Ensure we add new versions of snippet and examples iff needed
- Improve the script for updating said .rrd files to:
  - Not be a bash script (-> Python or Rust)
  - Add missing files (e.g. when adding new snippets) without changing the existing files
  - Update all existing files (when intentionally breaking backwards compatibility)
- Ensure we migrate deprecated types on ingestion #9370
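
The two modes wanted from the update script (add only what is missing, or regenerate everything) could look roughly like this in Python. The function name, the `update_existing` flag and the `<asset_dir>/<snippet>.rrd` layout are all hypothetical, chosen just to illustrate the idea.

```python
from pathlib import Path
from typing import List


def plan_rrd_writes(
    snippets: List[str],
    asset_dir: Path,
    update_existing: bool = False,
) -> List[Path]:
    """Return the .rrd paths that should be (re)generated.

    By default only missing files are listed, so existing test assets stay
    untouched. Pass `update_existing=True` when backwards compatibility is
    being broken on purpose and every file should be regenerated.
    """
    # Hypothetical layout: one .rrd per snippet, named after the snippet.
    targets = [asset_dir / f"{name}.rrd" for name in sorted(snippets)]
    if update_existing:
        return targets
    return [path for path in targets if not path.exists()]
```

Keeping the planning step separate from the step that actually runs the snippets makes it easy to print a dry-run list before anything on `git lfs` changes.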

Potential improvements

Migrate the checked in snippet rrd files, then compare them to the output of the latest snippet


## What
* Fix formatting of chunks with zero rows
* Include metadata in `rerun rrd print -vv` (same fix as above)
* Simplify formatting of chunks with no metadata

## Related
* Part of #9110

### Related
* Part of #9110

### What
Add `rerun rrd verify some.rrd`, which verifies that the current rerun version can load and understand the given .rrd file.

It goes through each component column in each record batch, finds the corresponding component, and then tries to deserialize the arrow data within.

### Related
* Part of #9110

### What
Adds a bunch of `.rrd`:s to `git lfs`, that we should keep there as a test of backwards compatibility.

If any new code breaks loading of these old .rrd files, the CI will complain.

Running locally:

> pixi run check-backwards-compatibility

### What if the CI complains about my PR?
Then your code broke backwards compatibility. Can you make the change so that it doesn't? If not, consult with me!

We are not yet promising backwards compatibility, but we should at least make a reasonable effort.

### What does this PR cover?
All Rust snippets are tested, as are the main examples (as of today). This should include most components, but does not cover _everything_. So even if `pixi run check-backwards-compatibility` passes, it is possible that we have broken backwards compatibility in some subtle way. But this is at least a start.

### TODO
* [x] Add a single pixi command to verify the files

### Future work
* Add more files, for improved coverage
* Automatically detect if some snippets/examples are missing from the test dataset

---------

Co-authored-by: Clement Rey

### Related
* Part of #9110
* In preparation for #9338

### What
Ensures that after loading an .rrd, there are no deprecated components or archetypes remaining. All deprecated types should have been migrated to non-deprecated types.

emilk · 2025-04-15T11:36:57Z

compare_snippet_output.py should compare its output with the .rrd files checked in to git lfs:

- If missing, it is an error (a new snippet needs to check in an .rrd for its output)
- If it differs, it is an error
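
As a rough sketch of those two rules: the helper below is hypothetical, assumes one checked-in file per snippet at `<asset_dir>/<snippet>.rrd`, and uses plain byte equality as a stand-in for whatever comparison the real script performs on the decoded data.

```python
from pathlib import Path
from typing import List


def check_against_checked_in(snippet: str, fresh_rrd: Path, asset_dir: Path) -> List[str]:
    """Return error messages for one snippet (an empty list means it passed)."""
    expected = asset_dir / f"{snippet}.rrd"  # hypothetical layout
    if not expected.exists():
        return [f"{snippet}: no checked-in .rrd (new snippets must check one in)"]
    # Stand-in comparison: the real check would compare decoded contents,
    # not raw bytes.
    if expected.read_bytes() != fresh_rrd.read_bytes():
        return [f"{snippet}: output differs from the checked-in .rrd"]
    return []
```

Returning messages instead of raising lets the caller collect the failures for every snippet and report them all in one CI run.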

Complications

If we change a snippet (so that it outputs something different), we need to either:

- Update the file on git lfs
- Add a new file to git lfs alongside the old one
- Add an opt-out flag for comparing to the file on git lfs to snippets.toml

Since this should be a rather rare occurrence (our snippets tend to stay pretty stable), any one solution will do.

### Related
* Closes #3235
* Part of #9110
* Unblocks #9751
* Unblocks #9588

### What
In our snippet comparisons, ignore small numeric differences (that often occur because of differences in programming language, or compiler/interpreter version)

---------

Co-authored-by: Clement Rey
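
To illustrate what "ignore small numeric differences" can mean in practice, here is a minimal sketch built on `math.isclose`. It is not the code from that PR: the function name and the tolerances are assumptions, and it works on plain nested Python lists rather than arrow data.

```python
import math
from typing import Any


def approx_equal(a: Any, b: Any, rel_tol: float = 1e-6, abs_tol: float = 1e-9) -> bool:
    """Compare nested values, tolerating small differences between floats."""
    if isinstance(a, float) or isinstance(b, float):
        # A float only matches another number; note that NaN never matches.
        if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
            return False
        return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return len(a) == len(b) and all(
            approx_equal(x, y, rel_tol, abs_tol) for x, y in zip(a, b)
        )
    # Everything else (strings, ints, None, ...) must match exactly.
    return a == b
```

Only floats get a tolerance here; integers, strings and list lengths still have to match exactly, so a genuinely different output is still reported.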

## Related
* Requires #9750
* Part of #9110

## What
This restructures the folder of .rrds that we use for backwards-compatibility checks, so that it has a `snippets` folder, matching the structure of our snippets.

These are then read by `compare_snippets_output.py` and are compared to the latest output of the snippets. They should be equal to each other (since the old .rrds should be migrated on ingestion).

### Consequences
If we add a new snippet, the CI will fail until the output .rrd of that snippet is checked in to CI (using `compare_snippets_output.py --write-missing-backward-assets`)

If we _change_ a snippet (so that it outputs something different), we need to either:

* Update the file on `git lfs`
* Add a _new_ file to `git lfs` alongside the old one
* Add an opt-out flag for comparing to the file on `git lfs` to `snippets.toml`

Since this should be a rather rare occurrence (our snippets tend to stay pretty stable), any one solution will do, and I have not decided on one yet. Maybe cross that bridge if and when we get to it?

emilk added the 🔩 data model Sorbet label Feb 22, 2025

This was referenced Feb 22, 2025

Backwards compatible data loaders for dataplatform #9091

Closed

Backwards compatible .rrd _container_ format #9124

Closed

emilk changed the title ~~Add data files to git lfs to test backwards compatibility~~ Add .rrd files to git lfs to test backwards compatibility Feb 25, 2025

emilk self-assigned this Feb 25, 2025

This was referenced Feb 25, 2025

Add rerun rrd verify #9128

Merged

Fix formatting of empty record batches #9130

Merged

Test that we don't break backwards compatibility of .rrd files #9133

Merged

This was referenced Mar 25, 2025

Deprecate SeriesLine/SeriesPoint/Scalar in favor of SeriesLines/SeriesPoints/Scalars #9338

Merged

Ensure we migrate deprecated types on ingestion #9370

Merged

emilk removed their assignment Apr 11, 2025

emilk self-assigned this Apr 15, 2025

This was referenced Apr 15, 2025

Compare snippets to legacy rrds #9751

Merged

Use approximate equals when comparing floats in snippets #9750

Merged
