# Toolbox instantiation should be logged to DB when logging toolbox methods #1089
The key information that needs to be stored in the database here is the configuration that was used to instantiate the Toolbox (for tools that take configuration). I should probably enforce somewhere that this configuration must be JSON serializable - but is that strictly necessary for people who are using tools exclusively in Python and not via the CLI?

What should the database logging code do if a user attempts to serialize a response that used a class instance with non-serializable configuration? Cheapest trick would be to do

Then later, what happens if you try to re-inflate a conversation from the database that has invalid serialized configuration? I guess you get an error. |
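One way to fail fast rather than erroring at re-inflation time: a minimal sketch (the helper name `ensure_json_serializable` is my own, not part of llm) that checks a Toolbox configuration can round-trip through JSON before it gets logged:

```python
import json


def ensure_json_serializable(config):
    # Hypothetical helper: refuse to log a Toolbox configuration
    # that cannot be round-tripped through JSON.
    try:
        return json.dumps(config)
    except TypeError as exc:
        raise ValueError(f"Toolbox configuration is not JSON serializable: {exc}")
```

CLI users would never hit the error, since CLI arguments are strings; Python API users passing something like an open database connection as configuration would get a loud failure instead of an invisible one.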
The LLM may need to know if a Toolbox has stayed resident in memory or if it's been discarded and re-inflated. That's material to how tool calling works - otherwise you confuse the models by occasionally clearing state on them (though I've seen Code Interpreter and tools like it handle this OK in the past: they generally decide to try and reconstruct their state). |
Let's use https://github.com/simonw/llm-tools-datasette as an example. It currently has one argument:

```
llm --tool 'Datasette("https://datasette.io/content")' "Show tables" --td
```

I'm contemplating adding a second optional one for authentication (a key to look up in `llm.get_key()`):

If someone runs the above

If the tool class is instantiated but no methods on it are called, should we still record that we instantiated it? The logs need to know "the model had the option to use these tools and chose not to". I think the configuration of the class IS relevant to that, since maybe the model decided not to use tools based on how it had been configured (but can it see that?) |
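A rough sketch of what that second optional argument could look like - hypothetical, since the real plugin currently takes only the URL, and the parameter name `key_name` is my invention:

```python
class Datasette:
    # Hypothetical constructor signature, not the real plugin code.
    def __init__(self, url, key_name=None):
        self.url = url
        # An alias to resolve via llm.get_key() at request time,
        # rather than storing the secret itself in the config.
        self.key_name = key_name
```

Storing the key alias rather than the resolved secret keeps the logged configuration safe to serialize into the database.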
Part of the challenge here is where this happens. CLI users have their toolboxes instantiated for them here: Line 807 in e23e13e
Or here in Line 1091 in e23e13e
But Python API users instantiate tools directly themselves:

```python
import llm
from llm_tools_datasette import Datasette

model = llm.get_model("gpt-4.1-mini")
result = model.chain(
    "Show most interesting tables",
    tools=[Datasette("https://datasette.io/content")],
).text()
```

How do we record the "this class was instantiated at this moment and with these arguments" information such that, later on, this code in Lines 815 to 845 in e23e13e can pick it up?
|
I'm inclined to have the |
```
llm -m claude-4-opus -f issue:https://github.com/simonw/llm/issues/1089 \
  -f llm/models.py -f llm/cli.py \
  -s 'Suggest potential code approaches for this issue'
```

https://gist.github.com/simonw/7b42c18a87f7caed5d5016a14f4e7c99#response - not hugely useful. 87.1755 cents.

Ran it again against
|
Notable that there's currently no Lines 378 to 407 in e23e13e
|
The main impact this has is that |
o4-mini's reply made me think that maybe I could have this table:

```sql
CREATE TABLE [toolbox_config] (
    [id] INTEGER PRIMARY KEY,
    [hash] TEXT, -- sha256 hash of name and config and plugin
    [name] TEXT, -- e.g. "Datasette"
    [config] TEXT, -- serialized JSON
    [plugin] TEXT
);
```

Then individual

Would this give me what I want? Could I use this to later re-inflate a Toolbox on |
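The `[hash]` column could be computed deterministically so identical configurations de-dupe to one row - a sketch, assuming the config is already JSON-serializable (the function name is my own):

```python
import hashlib
import json


def toolbox_config_hash(name, config, plugin):
    # sha256 over a canonical JSON encoding of name, config and plugin,
    # matching the comment on the [hash] column above. sort_keys and
    # fixed separators make the encoding stable across runs.
    payload = json.dumps(
        {"name": name, "config": config, "plugin": plugin},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```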
Maybe

Or throw an error so it's not an invisible failure. I'm having to do quite a lot of work to support the edge case of |
I think I'm going to implement logging before I ship a stable release, but maybe I'll punt on supporting this until a future release: |
I'm going to prototype up the |
Actually, no, that is not recording enough information. The key thing that we need to record is whether a bunch of tool calls all acted on the same instance - a shared Python interpreter or JavaScript session, for example. That is important context for understanding exactly what happened later from the logs, and is a slightly separate concern from remembering what the initial configuration arguments were. |
So maybe a `tool_instances` table. That table could also record the serialized constructor arguments. |
New SQL (new `tool_instances` table, plus an `instance_id` column on `tool_results`):

```sql
CREATE TABLE [tool_instances] (
    [id] TEXT PRIMARY KEY, -- I'll use ULID for this so databases can be combined later
    [plugin] TEXT,
    [name] TEXT,
    [arguments] TEXT
);
CREATE TABLE "tool_results" (
    [id] INTEGER PRIMARY KEY,
    [response_id] TEXT REFERENCES [responses]([id]),
    [tool_id] INTEGER REFERENCES [tools]([id]),
    [name] TEXT,
    [output] TEXT,
    [tool_call_id] TEXT,
    [instance_id] TEXT REFERENCES [tool_instances]([id])
);
```
I'm not 100% sure the ULID is necessary here - database combinations could be done without it, and |
Am I OK with spinning up a brand new

Yeah I think so. I could de-dupe the |
... but in that case I won't use ULID, it's just extra storage space for no reason. |
Had to perform heart surgery on my database! I updated the new migration and did:

```
llm logs backup /tmp/logs.db
sqlite-utils "$(llm logs path)" "delete from _llm_migrations where name = 'm018_tool_instances'"
sqlite-utils transform "$(llm logs path)" tool_results --drop instance_id
sqlite-utils drop-table "$(llm logs path)" tool_instances
```

Then this to re-run the migration:

```
llm logs -c
```

And now:

```
sqlite-utils schema "$(llm logs path)"
```

Returns:

```sql
CREATE TABLE [tool_instances] (
    [id] INTEGER PRIMARY KEY,
    [plugin] TEXT,
    [name] TEXT,
    [arguments] TEXT
);
CREATE TABLE "tool_results" (
    [id] INTEGER PRIMARY KEY,
    [response_id] TEXT REFERENCES [responses]([id]),
    [tool_id] INTEGER REFERENCES [tools]([id]),
    [name] TEXT,
    [output] TEXT,
    [tool_call_id] TEXT,
    [instance_id] INTEGER REFERENCES [tool_instances]([id])
);
```
|
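To sanity-check that this schema hangs together, a quick throwaway exercise against an in-memory SQLite database - the inserted values are made up, and the foreign keys to `responses` and `tools` are omitted since those tables aren't created here:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE [tool_instances] (
    [id] INTEGER PRIMARY KEY,
    [plugin] TEXT,
    [name] TEXT,
    [arguments] TEXT
);
CREATE TABLE "tool_results" (
    [id] INTEGER PRIMARY KEY,
    [response_id] TEXT,
    [tool_id] INTEGER,
    [name] TEXT,
    [output] TEXT,
    [tool_call_id] TEXT,
    [instance_id] INTEGER REFERENCES [tool_instances]([id])
);
""")
# One shared instance, two tool calls that both acted on it:
cur = db.execute(
    "INSERT INTO tool_instances (plugin, name, arguments) VALUES (?, ?, ?)",
    ("llm-tools-datasette", "Datasette", '{"url": "https://datasette.io/content"}'),
)
instance_id = cur.lastrowid
for call_id in ("call_1", "call_2"):
    db.execute(
        "INSERT INTO tool_results (response_id, name, output, tool_call_id, instance_id) "
        "VALUES (?, ?, ?, ?, ?)",
        ("resp_1", "query", "[]", call_id, instance_id),
    )
shared = db.execute(
    "SELECT count(*) FROM tool_results WHERE instance_id = ?", (instance_id,)
).fetchone()[0]
```

Both `tool_results` rows point at the same `tool_instances` row, which is exactly the "these calls shared one live object" fact I want the logs to capture.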
Now I need a mechanism whereby the

Maybe the |
Here's where Lines 907 to 926 in e23e13e
And here's where that happens for async models - that is in two places. This is for async methods: Lines 1067 to 1081 in e23e13e
And for sync methods: Lines 1100 to 1114 in e23e13e
|
All of those places have access to |
Might be as simple as adding an

```python
@dataclass
class ToolResult:
    name: str
    output: str
    tool_call_id: Optional[str] = None
    instance: Optional[llm.Toolbox] = None  # New
```

It's up to the things that create

Lines 836 to 845 in e23e13e
|
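One plausible shape for the logging side - a sketch, not the shipped implementation (the function name `log_tool_results` is mine; the column names match the schema above). It de-dupes live Toolbox instances by identity while writing `tool_results` rows, so every call that acted on the same object shares one `instance_id`:

```python
import json


def log_tool_results(db, response_id, tool_results, instance_ids=None):
    # instance_ids maps id(toolbox_instance) -> tool_instances row id,
    # so repeated calls against one live object share a single row.
    instance_ids = instance_ids if instance_ids is not None else {}
    for result in tool_results:
        instance_id = None
        if result.instance is not None:
            key = id(result.instance)
            if key not in instance_ids:
                cursor = db.execute(
                    "INSERT INTO tool_instances (plugin, name, arguments) "
                    "VALUES (?, ?, ?)",
                    (
                        None,  # plugin lookup omitted in this sketch
                        type(result.instance).__name__,
                        json.dumps(getattr(result.instance, "_config", None)),
                    ),
                )
                instance_ids[key] = cursor.lastrowid
            instance_id = instance_ids[key]
        db.execute(
            "INSERT INTO tool_results (response_id, name, output, tool_call_id, instance_id) "
            "VALUES (?, ?, ?, ?, ?)",
            (response_id, result.name, result.output, result.tool_call_id, instance_id),
        )
```

Passing the `instance_ids` dict across multiple `log_tool_results` calls within one conversation would let a long-lived instance keep its row across responses too.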
Got this from Claude Opus 4: https://claude.ai/share/d7b4b6a0-9904-4663-a3fc-fdbf9c4afd1b

```python
class Toolbox:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        original_init = cls.__init__

        def wrapped_init(self, *args, **kwargs):
            self._config = {
                'args': args,
                'kwargs': kwargs,
                'class': cls.__name__
            }
            original_init(self, *args, **kwargs)

        cls.__init__ = wrapped_init
```
|
That seems to work nicely:

```pycon
>>> from llm_tools_datasette import Datasette
>>> d = Datasette('https://datasette.io/content')
>>> d
<llm_tools_datasette.Datasette object at 0x107528a50>
>>> d._config
{'args': ('https://datasette.io/content',), 'kwargs': {}, 'class': 'Datasette'}
```
|
Got a better version: https://claude.ai/share/e5439534-28b3-46b8-9aed-d76ee5a1985b

Now it gives me back |
I'll finish this in a PR. |
Split from:
Also refs: