Add frame level task #693

Cadene · 2025-02-07T14:09:36Z

What this does

Motivations: We want to start supporting datasets with multiple tasks per episode. We also want to iterate on a simpler API for adding new frames to a dataset, where all features are added using dataset.add_frame(frame: dict).

For instance, this covers features such as:

2D features like a sequence of x,y positions (waypoints)
string features like captions
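
As a rough illustration of the two feature kinds listed above, a frame could be described and checked along these lines. This is only a sketch: the `waypoints` and `caption` feature names, the `"string"` dtype label, and the `shape_of` and `check_frame` helpers are assumptions made for illustration and are not taken from the LeRobot codebase. The spec layout mirrors the `{"dtype", "shape", "names"}` dictionaries used in the tests.

```python
# Hypothetical feature spec; names and the "string" dtype label are illustrative.
features = {
    "waypoints": {"dtype": "float32", "shape": (3, 2), "names": None},  # 3 (x, y) points
    "caption": {"dtype": "string", "shape": (1,), "names": None},
}


def shape_of(value):
    """Return the shape of a nested list as a tuple, e.g. [[0, 1], [2, 3]] -> (2, 2)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


def check_frame(frame: dict, features: dict) -> None:
    """Raise ValueError if the frame's keys or shapes do not match the spec."""
    if set(frame) != set(features):
        raise ValueError(f"Feature mismatch: expected {sorted(features)}, got {sorted(frame)}")
    for key, spec in features.items():
        value = frame[key]
        if spec["dtype"] == "string":
            if not isinstance(value, str):
                raise ValueError(f"{key!r} must be a str")
        elif shape_of(value) != spec["shape"]:
            raise ValueError(f"{key!r} has shape {shape_of(value)}, expected {spec['shape']}")


frame = {
    "waypoints": [[0.0, 0.0], [0.5, 0.1], [1.0, 0.3]],
    "caption": "push the T-shaped block onto the target",
}
check_frame(frame, features)  # passes silently
```

Keeping every feature in a single dict is what lets one `add_frame(frame)` call carry numeric and string data together.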

TODO

Add annotation per frame
Remove the task argument of save_episode. add_frame is now the only way to add the task (even if it's the same for every frame)
Add first test for add_frame
Update examples/port_datasets/pusht_zarr.py accordingly
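
To make the API change in this list concrete, here is a minimal sketch of the calling pattern, assuming the task is passed as a `"task"` entry of the frame dict. `ToyDataset` is a hypothetical stand-in written for this example, not the LeRobot class, and raising `ValueError` for a frame without a task is an assumption about the intended behavior.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset with a frame buffer; not the LeRobot class."""

    def __init__(self):
        self.episode_buffer = []  # frames of the episode being recorded
        self.episodes = []        # saved episodes

    def add_frame(self, frame: dict) -> None:
        # The task travels with every frame, even when it never changes.
        if "task" not in frame:
            raise ValueError("Each frame must include a 'task' entry")
        self.episode_buffer.append(dict(frame))

    def save_episode(self) -> None:
        # No task argument: tasks are read back from the buffered frames.
        tasks = sorted({frame["task"] for frame in self.episode_buffer})
        self.episodes.append({"frames": self.episode_buffer, "tasks": tasks})
        self.episode_buffer = []


dataset = ToyDataset()
dataset.add_frame({"state": [0.0], "task": "reach the block"})
dataset.add_frame({"state": [0.1], "task": "push the block"})
dataset.save_episode()
print(dataset.episodes[0]["tasks"])  # ['push the block', 'reach the block']
```

Because the task is stored per frame, a single episode can end up with several tasks without any change to `save_episode`.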

How it was tested

Ran python examples/port_datasets/pusht_zarr.py
Ran tests

How to checkout & try? (for the reviewer)

pytest -sx tests/test_datasets.py::test_add_frame

aliberts

LGTM with some comments

aliberts · 2025-02-13T16:25:05Z

lerobot/common/datasets/lerobot_dataset.py

+            task_index = self.get_task_index(task)
+            if task_index not in self.tasks:


Equivalent but a bit faster and easier to read I think

Suggested change

- task_index = self.get_task_index(task)
- if task_index not in self.tasks:
+ if task not in self.tasks.values():
+ task_index = self.get_task_index(task)

I found this logic quite confusing, so I refactored it.
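
The check discussed in this thread can be sketched in isolation. This is a minimal sketch, assuming `tasks` maps a task index to its task string. The `TaskRegistry` class and its `add_task` method are hypothetical, and `get_task_index` here is a simplified stand-in for the method in the diff that returns `None` for an unknown task. It keeps a reverse `task_to_task_index` dictionary as a plain attribute so the membership test does not have to scan `tasks.values()`; whether the merged code does exactly this is not shown in the thread.

```python
class TaskRegistry:
    """Hypothetical helper; assumes `tasks` maps task_index -> task string."""

    def __init__(self):
        self.tasks = {}
        self.task_to_task_index = {}  # reverse map kept in sync, not recomputed

    def get_task_index(self, task: str):
        # Simplified stand-in: index of a known task, or None if unseen.
        return self.task_to_task_index.get(task)

    def add_task(self, task: str) -> int:
        task_index = self.get_task_index(task)
        if task_index is None:  # same intent as `task not in self.tasks.values()`
            task_index = len(self.tasks)
            self.tasks[task_index] = task
            self.task_to_task_index[task] = task_index
        return task_index


registry = TaskRegistry()
first = registry.add_task("push the block")
second = registry.add_task("reach the block")
again = registry.add_task("push the block")
print(first, second, again)  # 0 1 0
```

The dictionary lookup is constant time per frame, whereas `task not in self.tasks.values()` grows with the number of distinct tasks.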


aliberts · 2025-02-13T17:28:09Z

tests/test_datasets.py

+        dataset.add_frame({"1d": torch.randn(1)})
+
+
+def test_add_frame(tmp_path):


This test is quite long for what it's doing (~0.5s)
Did you check if this new add_frame is slower than before?

0.13s on my side
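
For comparing timings like these across machines or commits, a small harness can help. This is a sketch under stated assumptions: `time_call` and `add_frames_stub` are hypothetical names, and the stub is a placeholder workload rather than the real `add_frame` path.

```python
import time


def time_call(fn, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def add_frames_stub():
    # Placeholder workload; replace with the dataset calls under test.
    buffer = []
    for i in range(1000):
        buffer.append({"1d": [float(i)], "task": "dummy"})


elapsed = time_call(add_frames_stub)
print(f"best of 5: {elapsed:.4f}s")
```

Taking the best of several runs reduces noise from other processes, which matters when the difference being discussed is a few tenths of a second.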


Cadene · 2025-02-13T18:06:37Z

Reverting these commits:

commit fd6436af9de41a8436396afc296ebb53b0d7bc2c (HEAD -> user/rcadene/2025_01_27_dataset_v2.1)
Author: Remi Cadene <[email protected]>
Date:   Thu Feb 13 19:05:20 2025 +0100

    Revert "Update tests/test_datasets.py"

    This reverts commit 16bb53f8d6e89214b83d3f706e86f40f9128c5f7.

commit 3abc897ca23329301333aec3ee085ca732e99679
Author: Remi Cadene <[email protected]>
Date:   Thu Feb 13 19:05:14 2025 +0100

    Revert "Update tests/test_datasets.py"

    This reverts commit 7170819c605a429402aed904d868b5eff21c0d2a.

commit 7170819c605a429402aed904d868b5eff21c0d2a (origin/user/rcadene/2025_01_27_dataset_v2.1)
Author: Remi <[email protected]>
Date:   Thu Feb 13 18:55:43 2025 +0100

    Update tests/test_datasets.py

    Co-authored-by: Simon Alibert <[email protected]>

commit 16bb53f8d6e89214b83d3f706e86f40f9128c5f7
Author: Remi <[email protected]>
Date:   Thu Feb 13 18:31:51 2025 +0100

    Update tests/test_datasets.py

    Co-authored-by: Simon Alibert <[email protected]>

Because I was getting this error:

__________________________________________________________________________________________________ test_add_frame_no_task ___________________________________________________________________________________________________

tmp_path = PosixPath('/private/var/folders/kj/4wjq7mtn6xdb3hz4bwd4db_c0000gn/T/pytest-of-rcadene/pytest-81/test_add_frame_no_task0')

    def test_add_frame_no_task(tmp_path):
        features = {"1d": {"dtype": "float32", "shape": (1,), "names": None}}
>       dataset = LeRobotDataset.create(repo_id=DUMMY_REPO_ID, fps=30, root=tmp_path, features=features)

tests/test_datasets.py:98:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
lerobot/common/datasets/lerobot_dataset.py:957: in create
    obj.meta = LeRobotDatasetMetadata.create(
lerobot/common/datasets/lerobot_dataset.py:296: in create
    obj.root.mkdir(parents=True, exist_ok=False)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = PosixPath('/private/var/folders/kj/4wjq7mtn6xdb3hz4bwd4db_c0000gn/T/pytest-of-rcadene/pytest-81/test_add_frame_no_task0'), mode = 511, parents = True, exist_ok = False

    def mkdir(self, mode=0o777, parents=False, exist_ok=False):
        """
        Create a new directory at this given path.
        """
        try:
>           self._accessor.mkdir(self, mode)
E           FileExistsError: [Errno 17] File exists: '/private/var/folders/kj/4wjq7mtn6xdb3hz4bwd4db_c0000gn/T/pytest-of-rcadene/pytest-81/test_add_frame_no_task0'

../../miniconda3/envs/lerobot/lib/python3.10/pathlib.py:1175: FileExistsError
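
The traceback points at `obj.root.mkdir(parents=True, exist_ok=False)` being called on `tmp_path` itself, a directory that pytest has already created. Below is a minimal reproduction using only the standard library, with `tempfile.TemporaryDirectory` standing in for pytest's `tmp_path` fixture (an assumption for illustration). Pointing the root at a subdirectory that does not exist yet is shown as one possible way around it; the `"test"` subdirectory name is hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)  # already exists, like pytest's tmp_path

    try:
        tmp_path.mkdir(parents=True, exist_ok=False)
        raised = False
    except FileExistsError:
        raised = True
    print(raised)  # True

    # A fresh subdirectory does not exist yet, so the same call succeeds.
    root = tmp_path / "test"
    root.mkdir(parents=True, exist_ok=False)
    created = root.is_dir()
    print(created)  # True
```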

aliberts force-pushed the user/aliberts/2024_11_25_compute_stats_v2 branch from 8feeede to 0c55461 Compare February 9, 2025 13:26

aliberts mentioned this pull request Feb 10, 2025

LeRobotDataset v2.1 #711

Merged

3 tasks

Cadene changed the base branch from user/aliberts/2024_11_25_compute_stats_v2 to user/aliberts/2025_02_10_dataset_v2.1 February 10, 2025 15:51

Cadene changed the title ~~LeRobotDataset v2.1~~ Add text_features Feb 10, 2025

Cadene force-pushed the user/rcadene/2025_01_27_dataset_v2.1 branch from 34f201c to 8e98c79 Compare February 11, 2025 15:53

Cadene changed the title ~~Add text_features~~ Add frame level task Feb 11, 2025

Cadene requested a review from aliberts February 11, 2025 16:13

Cadene added the dataset Issues regarding data inputs, processing, or datasets label Feb 11, 2025

Cadene marked this pull request as ready for review February 11, 2025 16:14

Cadene added 2 commits February 11, 2025 17:20

Add possibility to add task per frame

b8ca7e7

Add tests add_frame

3281e66

Cadene force-pushed the user/rcadene/2025_01_27_dataset_v2.1 branch from ef18a95 to 3281e66 Compare February 11, 2025 16:20

Fix unit tests

450b04d

aliberts approved these changes Feb 13, 2025


Cadene and others added 4 commits February 13, 2025 18:31

Update tests/test_datasets.py

16bb53f

Co-authored-by: Simon Alibert <[email protected]>

Update tests/test_datasets.py

7170819

Co-authored-by: Simon Alibert <[email protected]>

Revert "Update tests/test_datasets.py"

3abc897

This reverts commit 7170819.

Revert "Update tests/test_datasets.py"

fd6436a

This reverts commit 16bb53f.

Cadene added 2 commits February 14, 2025 12:19

Speedup task logic by removing task_to_task_index as property

e7ad7ee

revert debug change

e6e96fe

Cadene merged commit 9d6886d into user/aliberts/2025_02_10_dataset_v2.1 Feb 14, 2025
7 checks passed

Cadene deleted the user/rcadene/2025_01_27_dataset_v2.1 branch February 14, 2025 13:22
