(fix): Condensation events to reconstruct contexts added to event stream #7353
Conversation
Thank you for this! It seems to do exactly what I hit a problem with in the other PR. I started at the other end of this issue (put condensation events in the stream) and meant to keep it minimal, just to see what it looks like, but such a refactoring also needs something like this PR in order to actually work. 😅
I totally missed that PR! Had my head down digging into some metrics. This approach turned out way less intensive than I expected. I've got the condensers doing the right thing, just now getting to the point of testing the solution with the agent controller in the mix. Also need to figure out how to test the failure case we were seeing... I haven't dug through your PR yet. Do you think our implementations can just be smashed together?
On the latter, maybe you want to see the 2 comments with debugging info. Idk if it's useful; they're literally random notes though. The thing is, I got a great event stream, totally by accident: it was exactly a few tokens away from ContextWindowExceededError 😅. So on restore, if the first message was around 10-ish tokens, it got the error, and if it was less, it didn't. I literally removed newer events from my stream and restored the first ones repeatedly. That said, I was looking at more things than condensation, actually (some strange LLM behavior on microagents).
I still have to check your latest changes. I think we may want to sync at least on what kind of Action we want here; it kinda matters for the rest IMHO. This kind of thing may help merge them (I meant to not change how the condensers work), idk.
I'll take a closer look, thanks. It'd be great if we could also get the condensation trigger behavior to be token-count aware, even if we're just approximating it. With enough of a safety factor we might be able to avoid context window errors entirely.
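A rough sketch of what such a token-count-aware trigger with a safety factor could look like (everything here, including `count_tokens` and `max_input_tokens`, is a hypothetical stand-in rather than the actual OpenHands API):

```python
# Hypothetical sketch only: a condensation trigger that fires once the
# approximate prompt size approaches the context window, keeping some headroom.
from typing import Callable, Sequence


def should_condense_by_tokens(
    event_texts: Sequence[str],
    count_tokens: Callable[[str], int],  # assumed tokenizer hook
    max_input_tokens: int,               # assumed context window size
    safety_factor: float = 0.8,
) -> bool:
    """Return True when the estimated token count exceeds the budget."""
    estimated = sum(count_tokens(text) for text in event_texts)
    return estimated >= int(max_input_tokens * safety_factor)
```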
My first pass had different events for each kind of condenser, because they wanted slightly different bits of information. The current approach just has a single event with the union of all those features. I'd love to have some kind of "payload polymorphism" so that even with a single event there's still some type-safety, but the current event implementation makes that pretty fiddly. Might revisit that after a little event clean-up. As it stands, I've got all existing tests working and have confirmed condensation happens appropriately without disrupting the agent flow. There's still some clean-up needed on documentation, and I want to add more tests, but I'm decently happy with where this PR is now.
1. By providing a list of event IDs.
2. By providing the start and end IDs of a range of events.

In the second case, we assume that event IDs are monotonically increasing, and that _all_ events between the start and end IDs are to be forgotten.
Ah, yes, I also noticed again while rebuilding lists that this is completely reasonable, except for the condense actions themselves. They will be interwoven in the stream/history with the last big list to be forgotten.
This isn't just an assumption, it's by definition. We can define the start id and end id to be the range between which most condensers (the exception being attention) have their forgotten events in that step.
In fact, whether we need the other condense actions is a question for the condenser implementation. Outside the condensers, if we want to send them to the agent LLM, we can pick them up. They're in the stream, so they're in history, so we don't, strictly speaking, need the condensers to also provide them; we can pick them up if we want.
Just to confirm so we can resolve this comment -- all event IDs are monotonically increasing as they get added to the event stream? So if `event_i` is added to the event stream, then `event_j` is added, it will always be the case that `event_i.id < event_j.id`?
Yes. The stream itself assigns IDs in ascending order, in the `add_event` method, which is the only way it's done, and the `_id` field is otherwise never set.
Sorry, I don't mean to imply it's a bad design pattern in general. In this PR I just kept running into issues where I'd have a huge set of fields and a single discriminator enum that would control which combinations of fields should be enabled and how they should be interpreted. I can add validators in
That's the approach I tried to go with. The current implementation has that working, but I couldn't avoid all instances of that flat polymorphism pattern without just relying on lists of integers for event identifiers.
The assumption is that events have monotonically increasing (wrt their creation time) IDs. That way I can define start/end ranges based on IDs and -- since the forgotten events act as a filter over the history -- I won't miss any events or forget one that I don't want to.
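To make that concrete, a minimal sketch (not the real OpenHands event classes) of how monotonically increasing IDs plus a forgotten range act as a filter over the history:

```python
# Minimal sketch, not the real OpenHands classes: IDs are assigned in ascending
# insertion order, so a start/end pair picks out an unambiguous span and the
# view is just the history with that span filtered out.
from dataclasses import dataclass, field


@dataclass
class Event:
    id: int
    content: str


@dataclass
class EventStream:
    _events: list[Event] = field(default_factory=list)
    _next_id: int = 0

    def add_event(self, content: str) -> Event:
        # The stream is the only place IDs get assigned, in ascending order.
        event = Event(id=self._next_id, content=content)
        self._next_id += 1
        self._events.append(event)
        return event

    def view(self, forgotten_start: int, forgotten_end: int) -> list[Event]:
        # Forgotten events act as a filter over the history.
        return [
            e
            for e in self._events
            if not (forgotten_start <= e.id <= forgotten_end)
        ]
```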
I see... was that an artefact of the combinations of fields in the work in progress or does it have to be the case always? I ask because I'm not sure the combinations are many... with the range, I mean.
Yes, this is reasonable. It is definitely the case in the stream, and it is the case in history save for the truncation in action. I can look into stopping that, but if this PR works as expected, that case should be extremely rare with Claude and other LLMs with decent-ish windows.
It was definitely an issue with an earlier approach that captured more info. The options are not numerous here, as you can see in
Sorry, I tried to test that it works as expected when restoring a session: it doesn't seem to work at the moment because it won't recognize the old condense action? It creates actions, but it recognizes obs.
I restored a ~450-event session made from the … It doesn't manage it: it falls back to the controller truncation, and then condensation happens at half (event_id=231). I think this is because condensers don't have error handling, and it gets the exception in the condenser LLM call.
@enyst: Hi! I'm going to also spend some time taking a look at this today and trying to help with testing and getting it merged, since this is high priority. Would you be able to share that trajectory either as-is (with quick directions about how to load it in) or curate it into a failing test?
I don't think we should expect that to work in this case. The old sessions won't have the
That's also expected here, and how it currently works on
That doesn't matter. We can't save that observation in the right spot in the history without violating the temporal ordering of events, so the |
There are two ways to specify the events to be forgotten:
1. By providing a list of event IDs.
2. By providing the start and end IDs of a range of events.
Reading this class again, do we actually need these two code paths? Couldn't we just do the first one and the functionality would be a superset of the second one? Sure it'd be a little bit less efficient memory-wise, but storing a list of ints is not really a big deal I think.
The first path is a functional superset of the second, yes. Engel expressed concerns about the size of the messages, so I added the suggested range representation as an optimization. I don't have numbers on how much it saves, but I think the overall memory impact is minor either way (maybe ~0.1% of the data in the whole trajectory?).
Yes, I'm sorry, it may be just me and my inefficiency hate: they're consecutive, so unnecessary, and they would be saved/read in every condensation action. 🥹 They would also be sent via websocket.

Our usual case is the summarizer, where they're consecutive. In fact, all of them, except for `llm_attention`, specify, at most, a range: from `start` to `end`.

IMHO a way to represent these would be with polymorphism and a discriminated union. But the event stream is "flat". When we represented this kind of case before, we made a flattened representation with an enum as discriminator for serialization and some validation, and an umbrella class for the necessary fields. Calvin doesn't like that, quite understandably - and here they would be more than two. 😅

This is an umbrella class, but with just two minimal options. It's a middle ground I'm happy with, but maybe it's bad?
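Something like the following purely illustrative shape (field names are made up, not the ones in this PR):

```python
# Illustrative only -- an "umbrella" action with exactly two representations of
# the forgotten events (explicit list or consecutive range), kept mutually
# exclusive by a small validator. Field names here are hypothetical.
from dataclasses import dataclass


@dataclass
class ForgottenEventsSpec:
    event_ids: list[int] | None = None
    start_id: int | None = None
    end_id: int | None = None

    def __post_init__(self) -> None:
        has_list = self.event_ids is not None
        has_range = self.start_id is not None and self.end_id is not None
        if has_list == has_range:
            raise ValueError("Provide either an explicit ID list or a range, not both.")

    def forgotten_ids(self) -> list[int]:
        # Resolve either representation to the same thing: the IDs to drop.
        if self.event_ids is not None:
            return self.event_ids
        return list(range(self.start_id, self.end_id + 1))
```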
Yeah, personally I think code complexity is worse than small amounts of inefficiency. But it's not a hill that I'll die on :)
@neubig Sure, but sorry, just to clarify: I agree with Calvin that a ContextWindowExceeded not handled on a previous session is expected. We don't have exception handling in condensers, so if ContextWindowExceededError happens during their LLM call, it falls to the controller. With this PR, those should be extremely rare, at least with LLMs with some decent context window size. Except for previous sessions, that is, sessions made on
5c3288e4f277426388257d13a3a6ab9a.zip
Edited: Sorry, I see it. It works just fine. Session on the PR branch, 2 condensation actions -> closed, restored.

Sorry to be an annoying nitpicker; I think it's great and good to go.
End-user friendly description of the problem this fixes or functionality that this introduces.
Give a summary of what the PR does, explaining any non-trivial design decisions.
This PR updates the condenser interface to support a stateless implementation of a rolling condenser.
Instead of returning just a condensed history, this new interface allows a condenser to return a `View` of the history or a `Condensation` event. The condensation event can be added to the event stream, which is then one step closer to being the only source of state needed for the agent.

To implement this strategy, rolling condensers now define three abstract functions:

1. `get_view`, which converts the sequence of events in the state to a view object by utilizing the information in the condensation events,
2. `get_condensation`, which takes a view and produces a condensation event defining how future views should be constructed, and
3. `should_condense`, a trigger which determines when the condensation should happen.

In addition to the condenser changes, this PR updates the CodeAct agent implementation to use the new interface and updates all condenser tests accordingly.
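As a rough, non-authoritative sketch of the shape described above (the `RollingCondenser`, `View`, and `Condensation` stand-ins and all signatures are approximations, not the code in this PR):

```python
# Approximate shape only; the real classes live in the OpenHands codebase and
# differ in details (state handling, event types, etc.).
from abc import ABC, abstractmethod
from typing import Any


class View:
    """Stand-in for the condensed view of the history handed to the agent."""


class Condensation:
    """Stand-in for the condensation event appended to the event stream."""


class RollingCondenser(ABC):
    @abstractmethod
    def get_view(self, events: list[Any]) -> View:
        """Rebuild the current view from the raw events plus any
        condensation events already present in the stream."""

    @abstractmethod
    def should_condense(self, view: View) -> bool:
        """Trigger: decide whether this view needs condensing."""

    @abstractmethod
    def get_condensation(self, view: View) -> Condensation:
        """Produce a condensation event describing how future views
        should be constructed."""

    def condense(self, events: list[Any]) -> View | Condensation:
        # Either return a usable view, or emit a condensation event that the
        # controller can add to the stream before asking for a view again.
        view = self.get_view(events)
        if self.should_condense(view):
            return self.get_condensation(view)
        return view
```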
Link of any specific issues this addresses.
This directly solves the problem in #6706, but should also resolve #7175.