Tool retrieval 2 #417
Conversation
# Rest of your code remains the same
response = await openai_client.beta.chat.completions.parse(
    messages=[{"role": "system", "content": system_prompt}, *openai_messages],  # type: ignore
    model="gpt-4o-mini",
This could be turned into an env var, but we have enough already. Please let me know if you would prefer an env var.
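If an env var is preferred, a minimal sketch of what that could look like as a drop-in for the snippet above; the variable name TOOL_SELECTION_MODEL and its default are assumptions, not an existing project convention:

import os

# Hypothetical env var name; adjust to the project's naming conventions.
# Falls back to the currently hardcoded model when the variable is unset.
model_name = os.getenv("TOOL_SELECTION_MODEL", "gpt-4o-mini")

response = await openai_client.beta.chat.completions.parse(
    messages=[{"role": "system", "content": system_prompt}, *openai_messages],  # type: ignore
    model=model_name,
)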
One comment: the selection LLM has some trouble selecting the exa crawling tool. It might be worth giving it a better description.
Great point. Not sure if it is directly applicable in this PR, but I added a comment here: #415
Is it okay if we address this in #415?
Works perfectly :) Thank you @WonderPG
Benchmark
Query speed (from sending the request to the end of streaming)
This PR vs. Main: (timing measurements shown as images in the original, not reproduced here)
In general, the latency is significantly reduced with this approach.
Token count
This PR
Currently the selection LLM has a system prompt of 7,545 tokens. After selection, the main LLM receives roughly 10,000 tokens of tool descriptions when 10 tools are selected; this figure depends on which tools are selected and how many. Add the main LLM's system prompt of 1,164 tokens for a total of ~19,000 tokens per request. Bear in mind that only ~11k of these tokens are injected into the main model, since the selection model's tokens never reach it: the ~7.5k selection tokens are cheaper and run faster, while the remaining 11-12k are more expensive and a bit slower.
Main
Currently we have 88,022 tokens from the tools and 1,164 tokens from the system prompt, for a minimum of 89,186 tokens injected into every query.
In general, the cost goes down by a factor of 8.
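As a sanity check, the figures above can be reproduced with a bit of arithmetic; the 10,000-token tool-description figure is the approximation quoted in the text:

selection_prompt = 7_545     # selection LLM system prompt
tool_descriptions = 10_000   # ~10 selected tools on the main LLM (approximate)
main_system_prompt = 1_164   # main LLM system prompt

pr_total = selection_prompt + tool_descriptions + main_system_prompt  # 18,709 -> ~19k
pr_main_only = tool_descriptions + main_system_prompt                 # 11,164 -> ~11k

main_branch_total = 88_022 + main_system_prompt                       # 89,186

print(main_branch_total / pr_main_only)  # ~8.0, i.e. the factor-of-8 cost reduction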
The following points should also be considered: