AI Chat: refactor prompt building and engine calling to EngineConsumer implementations #19988
Conversation
There are two occurrences of both:

```cpp
    api_request_helper::APIRequestResult result);
void OnEngineCompletionDataReceived(int64_t for_navigation_id,
                                    std::string result);
void OnEngineCompletionComplete(int64_t for_navigation_id,
```
Although `Completion` is technically correct, this could also be `OnEngineGenerationComplete` to avoid the repetition and to align with the suggested question generation. Same for above.
`OnEngineGenerationComplete` - is it generating engines? 😄

In thinking about it, we want this to be clear it's the callbacks for `EngineConsumer::SubmitHumanInput` specifically. So I guess my question for you is your opinion on these function names in `EngineConsumer`:

- `GenerateQuestionSuggestions`
- `SubmitHumanInput`

Perhaps the latter should be `GenerateAssistantResponse`? In which case we can name these `OnAssistantResponseData` and `OnAssistantResponseComplete`?
The active verb sounds a bit off, as the client is merely getting the response rather than generating it. But I don't have a strong opinion about this. `GenerateAssistantResponse` would follow the suggested question function name, and I agree it would sound clearer. Another option could be using `Chat()` (and `OnChatDataReceived/Complete`) and `SuggestedQuestion()` (nit from `QuestionSuggestions` 😅). PS: I was reading `OnEngineGenerationComplete` as "engine's generation".
The consumer doesn't know that the class is working over a remote API; it could be local for all it cares. But something is generating the questions and the responses, and the consumer is asking for that to happen, so I think it makes sense to go with that new version.
Actually I went for a hybrid approach, since these callbacks are actually used by all API calls (inside `RemoteCompletionClient`):

- `GenerationResult`
- `GenerationDataCallback`
- `GenerationCompletedCallback`

It was also the more concise option.
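For reference, a minimal sketch of plausible shapes for those aliases; the payload types are assumptions inferred from the includes quoted later in this thread (`base/types/expected.h`, base callbacks), not the merged code:

```cpp
#include <string>

#include "base/functional/callback.h"
#include "base/types/expected.h"

// A finished generation yields either the full response text or an error
// string from the remote API.
using GenerationResult = base::expected<std::string, std::string>;

// Fired for each streamed chunk of a partial response.
using GenerationDataCallback = base::RepeatingCallback<void(std::string)>;

// Fired once when the generation completes (successfully or not).
using GenerationCompletedCallback = base::OnceCallback<void(GenerationResult)>;
```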
```cpp
// Prevent indirect prompt injections being sent to the AI model.
// Include break-out strings contained in prompts, as well as the base
// model command separators.
```
nit: could moving this comment to the virtual definition give us better IntelliSense support?
```cpp
// conversation prompts.
constexpr char kHumanPrompt[] = "Human:";
```

```cpp
class RemoteCompletionClient {
```
"RemoteCompletionClient" assumes that we're always going to use this API as a completion endpoint. This might not be true. I believe there will be future use cases where we can expand this interface to ask the API for embeddings or audio transcribing. Perhaps, just "RemoteClient"?
`RemoteEngineClient`?
...but I don't know that those extra items would come from the same class. We can't make those assumptions either. It's all unknown. Right now it's only the completion endpoint.
`RemoteEngineClient` sounds good to me.
If this is just a nit then I'd rather leave it until a future potential refactor since this only currently deals with the completion endpoint.
```cpp
#include "base/functional/callback_forward.h"
#include "base/types/expected.h"
#include "brave/components/ai_chat/common/mojom/ai_chat.mojom.h"
```
You could use the forward variant here: `ai_chat.mojom-forward.h`, which is lightweight.
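For background: Chromium's mojo tooling generates a lightweight `*.mojom-forward.h` next to each `*.mojom.h` that only forward-declares the generated types, so a header that never needs the full definitions can include it instead. A sketch of the swap (the file split here is illustrative):

```cpp
// In the header (illustrative): the forward header is enough when the
// mojom types only appear by pointer, reference, or in callback signatures.
#include "brave/components/ai_chat/common/mojom/ai_chat.mojom-forward.h"

// In the .cc file (illustrative): pull in the full generated definitions
// only where the types are actually used.
// #include "brave/components/ai_chat/common/mojom/ai_chat.mojom.h"
```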
```diff
@@ -3,5 +3,6 @@ include_rules = [
   "+services/data_decoder/public",
   "+services/network/public",
   "+services/service_manager/public",
+  "+absl/types/optional.h",
```
Where are you using optional?
`engine_consumer_llama.cc` uses `absl::optional`.
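For illustration, a hypothetical use in the style of `engine_consumer_llama.cc` (not a quote from the file); the include path matches the DEPS rule above:

```cpp
#include <string>

#include "absl/types/optional.h"

// absl::nullopt distinguishes "no page content" from an empty string.
absl::optional<std::string> MaybePageContent(bool has_page,
                                             const std::string& page_text) {
  if (!has_page) {
    return absl::nullopt;
  }
  return page_text;
}
```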
```diff
-AIChatAPI::AIChatAPI(
+// static
+std::string RemoteCompletionClient::GetHumanPromptSegment() {
+  return base::StrCat({"\n\n", kHumanPrompt, " "});
```
@LorenzoMinto @nvonpentz is this meant to apply as a stop sequence to both llama2 and claude? I thought it wasn't used for Llama2, but we're still adding it as a stop sequence for all llama2 calls, in addition to the extras each call provides. I'm guessing it doesn't do anything because llama2 doesn't generate " Human:"?
Yes, this shouldn't be added as a stop sequence for llama, as it doesn't use it. It may not affect the generation much, as that sequence ("\n\nHuman:") is generally unlikely, but we should probably keep this Claude-specific.
done
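A minimal sketch of the Claude-only gating agreed above; the flag and surrounding names are assumptions, not the merged diff:

```cpp
#include <string>
#include <vector>

// Sketch: only Claude's prompt format uses the "\n\nHuman:" turn marker,
// so only the Claude engine registers it as a stop sequence.
std::vector<std::string> BuildStopSequences(
    std::vector<std::string> extra_stop_sequences,
    bool is_claude /* hypothetical flag */) {
  if (is_claude) {
    // GetHumanPromptSegment() returns "\n\nHuman: " (see the quoted diff).
    extra_stop_sequences.push_back(
        RemoteCompletionClient::GetHumanPromptSegment());
  }
  return extra_stop_sequences;
}
```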
```cpp
  std::string prompt = BuildLlama2Prompt(conversation_history, page_content,
                                         is_video, human_input);
  DCHECK(api_);
  api_->QueryPrompt(prompt, {"</response>"}, std::move(completed_callback),
```
@LorenzoMinto @nvonpentz should we be adding `kLlama2Eos` as a stop sequence here too, like we do with the question suggestions prompt?
We should, and remove `</response>`, as we don't use it in llama. It might have gone overlooked because it's not explicit what it is. Could we maybe extract it to a named variable? A free string could be many different things without remembering the `QueryPrompt` signature.
done
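A sketch of the suggested extraction; `kLlama2Eos` is named in the comments above (Llama 2's end-of-sequence token, commonly `</s>`), the call shape mirrors the quoted diff, and the final callback argument is an assumption:

```cpp
// Llama 2's end-of-sequence token. Naming it makes the QueryPrompt call
// self-explanatory, unlike a bare string literal.
constexpr char kLlama2Eos[] = "</s>";

// "</response>" is dropped: llama prompts never ask the model for that tag.
api_->QueryPrompt(prompt, {kLlama2Eos}, std::move(completed_callback),
                  std::move(data_received_callback));
```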
DEPS lgtm
```cpp
base::ReplaceSubstringsAfterOffset(&input, 0, kHumanPrompt, "");
base::ReplaceSubstringsAfterOffset(&input, 0, kAIPrompt, "");
```
@LorenzoMinto do you think it's ok that we're stripping `Human:` from any input (article or human-entered) and not waiting for the full `\n\nHuman:`? I think the reason to do it is in case something does have the smaller string, without the perfect line breaks, and Claude still interprets it as a change of character. The downside is that it's over-eager and could produce unexpected results when we're silently ignoring a substring.
++
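To make that trade-off concrete, a contrived example (not a test from the PR):

```cpp
#include <string>

#include "base/strings/string_util.h"

// kHumanPrompt is the bare "Human:" marker (no "\n\n" prefix).
std::string input = "See the row labeled Human: it was left blank.";
base::ReplaceSubstringsAfterOffset(&input, 0, kHumanPrompt, "");
// input == "See the row labeled  it was left blank."
// Matching only the full "\n\nHuman:" would keep this sentence intact, but
// could miss injections that omit the exact line breaks.
```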
Too fast on the clicking - I did a merge commit instead of a squash 😞
Moves all prompt building from the single `AIChatTabHelper` class to either `EngineConsumerLlama` or `EngineConsumerClaude`, with shared functionality in `RemoteCompletionClient` (was `AIChatAPI`). The drivers for this are:

- Making `AIChatTabHelper` more readable - adding llama prompt building added a lot of noise to it.

Resolves brave/brave-browser#31821
Some of this will be further refactored when prompt generation is moved to an API.
Also fixes a bug where input sanitization was not happening when using the llama models.
Resolves brave/brave-browser#32787
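For orientation, a rough sketch of the layering this PR lands; class names come from the description above and method names from the review discussion, while the signatures are simplified assumptions:

```cpp
#include <string>

// Sketch of the refactor's shape, not the merged interface.
class EngineConsumer {
 public:
  virtual ~EngineConsumer() = default;
  // Names discussed in this review; parameters simplified.
  virtual void GenerateQuestionSuggestions(const std::string& page_content) = 0;
  virtual void GenerateAssistantResponse(const std::string& human_input) = 0;
};

// Each implementation owns its model-specific prompt building and shares
// the network layer in RemoteCompletionClient (formerly AIChatAPI).
class EngineConsumerClaude : public EngineConsumer { /* Claude prompts */ };
class EngineConsumerLlama : public EngineConsumer { /* Llama 2 prompts */ };
```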
Submitter Checklist:

- Added appropriate labels (`QA/Yes` or `QA/No`; `release-notes/include` or `release-notes/exclude`; `OS/...`) to the associated issue
- Checked the PR locally: `npm run test -- brave_browser_tests`, `npm run test -- brave_unit_tests` (wiki), `npm run lint`, `npm run presubmit` (wiki), `npm run gn_check`, `npm run tslint`
- Ran `git rebase master` (if needed)

Reviewer Checklist:

- Changes in component dependencies are properly reflected in `gn`

After-merge Checklist:

- The associated issue milestone is set to the smallest version that the changes have landed on
Test Plan:
For both llama and claude models: