Commit 3e64387

docs: updated docs and linked
1 parent e9ea171 commit 3e64387

2 files changed: +4 -1 lines changed

docs/simulation_and_benchmarking/rai_bench.md

Lines changed: 1 addition & 1 deletion
@@ -163,6 +163,6 @@ class TaskArgs(BaseModel):
 Descriptive prompts provides guidance and tips.
 
 - extra_tool_calls - How many extra tool calls an agent can make and still pass the Task, example:
-- `GetROS2RGBCameraTask` has 1 required tool call and 1 optional. When `extra_tool_calls` set to 5, agent can correct himself couple times and still pass even with 7 tool calls.
+- `GetROS2RGBCameraTask` has 1 required tool call and 1 optional. When `extra_tool_calls` set to 5, agent can correct himself couple times and still pass even with 7 tool calls. There can be 2 types of invalid tool calls, first when the tool is used incorrectly and agent receives an error - this allows him to correct himself easier. Second type is when tool is called properly but it is not the tool that should be called or it is called with wrong params. In this case agent won't get any error so it will be harder for him to correct, but BOTH of these cases are counted as `extra tool call`.
 
 If you want to know details about every task, visit `rai_bench/tool_calling_agent/tasks`
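
The counting rule described in the added documentation line can be illustrated with a minimal sketch. This is not the `rai_bench` implementation: the function names `max_total_calls` and `still_passes` and their parameters are hypothetical, and the sketch assumes the agent eventually makes all required calls correctly. It only shows the arithmetic: both kinds of invalid call (one that returns an error, and one that succeeds but uses the wrong tool or wrong params) draw from the same `extra_tool_calls` budget.

```python
def max_total_calls(required: int, optional: int, extra_tool_calls: int) -> int:
    """Upper bound on tool calls an agent can make and still pass (hypothetical helper)."""
    return required + optional + extra_tool_calls


def still_passes(extra_tool_calls: int, errored_calls: int, misdirected_calls: int) -> bool:
    """Check whether invalid calls fit within the extra-call budget.

    errored_calls: calls where the tool was used incorrectly and returned an error.
    misdirected_calls: calls that ran without error but used the wrong tool or wrong params.
    Both kinds count as extra tool calls.
    """
    return errored_calls + misdirected_calls <= extra_tool_calls


# The GetROS2RGBCameraTask example from the docs: 1 required, 1 optional, budget of 5.
print(max_total_calls(required=1, optional=1, extra_tool_calls=5))
print(still_passes(extra_tool_calls=5, errored_calls=3, misdirected_calls=2))
```

With the documented values, the bound works out to the 7 tool calls mentioned in the text, and a mix of 3 errored plus 2 misdirected calls exactly uses up the budget of 5.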

src/rai_bench/rai_bench/test_models.py

Lines changed: 3 additions & 0 deletions
@@ -83,6 +83,9 @@ class ToolCallingAgentBenchmarkConfig(BenchmarkConfig):
 task_types : List[Literal["basic", "manipulation", "navigation", "custom_interfaces", "spatial_reasoning"]], optional
 types of tasks to include in the benchmark, by default all types are included:
 ["basic", "manipulation", "navigation", "custom_interfaces", "spatial_reasoning"]
+
+For more detailed explanation of parameters, see the documentation:
+(https://robotecai.github.io/rai/simulation_and_benchmarking/rai_bench/)
 """
 
 extra_tool_calls: List[int] = [0]