# TL;DR
A snapshot of [L2D](https://huggingface.co/datasets/yaak-ai/L2D), the world's largest self-driving dataset!
- 90+ TeraBytes of multimodal data (5000+ hours of driving) from 30 cities in Germany.
- 6 surrounding HD cameras + vehicle state (Speed/Heading/GPS/IMU)
- Continuous (Gas/Brake/Steering) and discrete actions (Gear/Turn Signals)
- OpenStreetMap [matched waypoints](#OpenStreetMap) in bird's-eye view.
- Natural language instructions, e.g. ["When the light turns green, drive over the tram tracks and then through the roundabout"](https://huggingface.co/spaces/lerobot/visualize_dataset?dataset=yaak-ai%2FL2D&episode=82)
- Expert (driving instructors) and student (learner drivers) policies
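
If you want to poke at the data before committing to a 90+ TB download, a few frames can be pulled through the `lerobot` library. Below is a minimal sketch, not code from the dataset card: the import path, constructor arguments, and attribute names are assumptions that may vary across `lerobot` versions, so inspect the returned keys rather than hard-coding them.

```python
# Minimal sketch: load a single L2D episode with LeRobot and inspect one frame.
# Assumptions: lerobot is installed, the import path matches your version, and
# the `episodes` argument lets you avoid fetching the full 90+ TB dataset.
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

ds = LeRobotDataset("yaak-ai/L2D", episodes=[0])  # only episode 0

print(f"{ds.num_episodes} episode(s), {len(ds)} frames")

frame = ds[0]  # one synchronized multimodal frame (cameras, state, actions, ...)
for key, value in frame.items():
    shape = getattr(value, "shape", None)
    print(key, shape if shape is not None else type(value).__name__)
```
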
# LeRobot goes to driving school
State-of-the-art [Vision Language Models](https://huggingface.co/blog/vlms) and Large Language Models are trained on open-source
## What are Agent frameworks and why do they matter?
Now we need to provide the agent with the right set of tools.
**1.** A web browser. While fully fledged web-browser interaction, as in [Operator](https://openai.com/index/introducing-operator/), will be needed to reach full performance, we started with an extremely simple text-based web browser for our first proof of concept. You can find the code [here](https://github.com/huggingface/smolagents/tree/main/examples/open_deep_research/scripts/text_web_browser.py).
**2.** A simple text inspector, to be able to **read a bunch of text file formats**; you can find it [here](https://github.com/huggingface/smolagents/tree/main/examples/open_deep_research/scripts/text_inspector_tool.py).
These tools were taken from the excellent [Magentic-One](https://www.microsoft.com/en-us/research/articles/magentic-one-a-generalist-multi-agent-system-for-solving-complex-tasks/) agent by Microsoft Research, kudos to them! We didn’t change them much, as our goal was to get as high a performance as we could with the lowest complexity possible.
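
To make the wiring concrete, here is a minimal sketch of how tools get handed to a `smolagents` agent. It is not the actual Open Deep Research setup: the toy file-reading tool and the model name are illustrative assumptions, and depending on your `smolagents` version the model class may be called `InferenceClientModel` instead of `HfApiModel`.

```python
# Minimal sketch of a smolagents agent with a web search tool and a toy
# file-reading tool -- illustrative only, not the Open Deep Research agent.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel, tool


@tool
def read_text_file(path: str) -> str:
    """Return the raw contents of a local text file.

    Args:
        path: Path to the file to read.
    """
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Any chat model served through the HF Inference API should work here (assumed name).
model = HfApiModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")
agent = CodeAgent(tools=[DuckDuckGoSearchTool(), read_text_file], model=model)

print(agent.run("What year was the GAIA benchmark released? Answer with the year only."))
```
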
Here is a short roadmap of changes which we feel would really improve these tools’ performance (feel free to open a PR and contribute!):
- extending the number of file formats which can be read (one possible direction is sketched after this list).
- proposing a more fine-grained handling of files.
- replacing the web browser with a vision-based one, which we’ve started doing [here](https://github.com/huggingface/smolagents/tree/main/src/smolagents/vision_web_browser.py).
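
On the first point, one low-effort direction is to normalize every incoming file to Markdown before it reaches the agent. Below is an illustrative sketch assuming the third-party `markitdown` package; it is not the code of the text inspector linked above.

```python
# Sketch: normalize heterogeneous files (PDF, DOCX, XLSX, HTML, ...) to Markdown
# so the agent can read them as plain text. Assumes `pip install markitdown`;
# this is an illustrative approach, not the linked text inspector's code.
from pathlib import Path

from markitdown import MarkItDown


def file_to_markdown(path: str, max_chars: int = 20_000) -> str:
    """Best-effort conversion of a file to Markdown, truncated to fit in context."""
    text = MarkItDown().convert(path).text_content
    # Truncate so a huge file does not blow up the agent's context window.
    if len(text) > max_chars:
        text = text[:max_chars] + f"\n\n[... truncated {Path(path).name} ...]"
    return text


print(file_to_markdown("report.pdf")[:500])
```
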
## Results 🏅
We’re also [hiring a full-time engineer](https://apply.workable.com/huggingface/j/AF1D4E3FEB/) to help us work on this and more; apply if you’re interested 🙂
- To get started with Open Deep Research, try the examples [here](https://github.com/huggingface/smolagents/tree/main/examples/open_deep_research).
- Check the [smolagents](https://github.com/huggingface/smolagents) repo.
- Read more about smolagents in the [docs](https://huggingface.co/docs/smolagents/index) and the [introduction blog post](https://huggingface.co/blog/smolagents).