AI Chat: refactor prompt building and engine calling to EngineConsumer implementations #19988

Merged
petemill merged 6 commits into master from ai-chat-prompt-abstraction on Sep 14, 2023

Conversation

@petemill (Member) commented Sep 5, 2023:

Moves all prompt building from the single AIChatTabHelper class to either EngineConsumerLlama or EngineConsumerClaude, with shared functionality in RemoteCompletionClient (previously AIChatAPI). A sketch of the resulting interface is included after the list below.

The drivers for this are:

  • Make AIChatTabHelper more readable: adding Llama prompt building had introduced a lot of noise.
  • Allow prompt building and API calling to be implemented once and shared with iOS, which can't use TabHelpers.
  • Make prompt building and API use more testable.
  • Lay some groundwork for per-conversation model/engine choice.
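
As a rough sketch of the resulting split (only the class and method names below are taken from this PR and its review thread; the signatures are illustrative assumptions, not the actual header):

#include <string>
#include <vector>

#include "base/functional/callback.h"

class EngineConsumer {
 public:
  virtual ~EngineConsumer() = default;

  // Ask the engine to suggest questions about the current page content.
  virtual void GenerateQuestionSuggestions(
      bool is_video,
      const std::string& page_content,
      base::OnceCallback<void(std::vector<std::string>)> callback) = 0;

  // Ask the engine to respond to the conversation so far.
  virtual void GenerateAssistantResponse(
      bool is_video,
      const std::string& page_content,
      const std::string& human_input,
      base::OnceCallback<void(std::string)> completed_callback) = 0;
};

Each implementation builds prompts in its own model's format and delegates the network call to the shared RemoteCompletionClient (formerly AIChatAPI): EngineConsumerLlama produces Llama 2 prompts, EngineConsumerClaude produces Claude Human:/Assistant: prompts.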

Resolves brave/brave-browser#31821

Some of this will be further refactored when prompt generation is moved to an API.

Also fixes a bug where input sanitization was not applied when using the Llama models.
Resolves brave/brave-browser#32787

Submitter Checklist:

  • I confirm that no security/privacy review is needed and no other type of reviews are needed, or that I have requested them
  • There is a ticket for my issue
  • Used Github auto-closing keywords in the PR description above
  • Wrote a good PR/commit description
  • Squashed any review feedback or "fixup" commits before merge, so that history is a record of what happened in the repo, not your PR
  • Added appropriate labels (QA/Yes or QA/No; release-notes/include or release-notes/exclude; OS/...) to the associated issue
  • Checked the PR locally:
    • npm run test -- brave_browser_tests, npm run test -- brave_unit_tests
    • npm run lint, npm run presubmit, npm run gn_check, npm run tslint
  • Ran git rebase master (if needed)

Reviewer Checklist:

  • A security review is not needed, or a link to one is included in the PR description
  • New files have MPL-2.0 license header
  • Adequate test coverage exists to prevent regressions
  • Major classes, functions and non-trivial code blocks are well-commented
  • Changes in component dependencies are properly reflected in gn
  • Code follows the style guide
  • Test plan is specified in PR before merging

After-merge Checklist:

Test Plan:

For both Llama and Claude models:

  • Page-connected conversations still work
  • Responses are streamed
  • Disconnected conversations still work
  • Suggested questions still work

petemill self-assigned this on Sep 5, 2023
petemill force-pushed the ai-chat-prompt-abstraction branch from a28144c to a804591 on September 6, 2023 04:20
petemill marked this pull request as ready for review on September 6, 2023 04:20
petemill force-pushed the ai-chat-prompt-abstraction branch 2 times, most recently from 63c8654 to 52a62c1, on September 6, 2023 16:12
petemill requested a review from a team as a code owner on September 6, 2023 16:12
petemill force-pushed the ai-chat-prompt-abstraction branch from 52a62c1 to 71735e9 on September 6, 2023 16:13
@LorenzoMinto (Contributor) commented:

There are two occurrences each of "Summarize this video" and "Summarize this page" that could be brought under a single constant. I can't seem to comment on them directly.
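
A minimal sketch of that suggestion (the constant names here are hypothetical; the PR leaves the literals in place):

// Hypothetical shared constants so both call sites stay in sync.
constexpr char kSummarizePagePrompt[] = "Summarize this page";
constexpr char kSummarizeVideoPrompt[] = "Summarize this video";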

    api_request_helper::APIRequestResult result);
void OnEngineCompletionDataReceived(int64_t for_navigation_id,
                                    std::string result);
void OnEngineCompletionComplete(int64_t for_navigation_id,
Contributor commented:

Although Completion is technically correct, this could also be OnEngineGenerationComplete, to avoid the repetition and to align with the suggested-question generation. Same for the declaration above.

@petemill (Member, Author) commented Sep 8, 2023:

OnEngineGenerationComplete - is it generating engines? 😄

Thinking about it, we want to make clear these are the callbacks for EngineConsumer::SubmitHumanInput specifically. So I guess my question for you is your opinion on these function names in EngineConsumer:

  • GenerateQuestionSuggestions
  • SubmitHumanInput

Perhaps the latter should be GenerateAssistantResponse? In which case we can name these OnAssistantResponseData and OnAssistantResponseComplete?

@LorenzoMinto (Contributor) commented Sep 8, 2023:

It's the active verb that sounds a bit off, since the client is merely getting the response rather than generating it. But I don't have a strong opinion about this. GenerateAssistantResponse would follow the suggested-question function name, and I agree it would sound clearer.

Another option could be using Chat() (and OnChatDataReceived/OnChatComplete) and SuggestedQuestion() (nit: from QuestionSuggestions 😅). PS: I was reading OnEngineGenerationComplete as "engine's generation".

@petemill (Member, Author) commented:

The consumer doesn't know that the class is working over a remote API; it could be local for all it cares. But something is generating the questions and the responses, and the consumer is asking for that to happen, so I think it makes sense to go with the new version.

@petemill (Member, Author) commented:

Actually I went with a hybrid approach, since these callbacks are used by all API calls (inside RemoteCompletionClient):

GenerationResult
GenerationDataCallback
GenerationCompletedCallback

It was also the more concise option.
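
A plausible shape for these aliases, given the base/types/expected.h and callback includes visible elsewhere in this PR (the template arguments, especially the error type, are assumptions):

#include <string>

#include "base/functional/callback.h"
#include "base/types/expected.h"

// Success payload or an error description (error type assumed here).
using GenerationResult = base::expected<std::string, std::string>;
// Streams partial completion data as it arrives.
using GenerationDataCallback = base::RepeatingCallback<void(std::string)>;
// Fires once with the full result (or an error) when generation ends.
using GenerationCompletedCallback = base::OnceCallback<void(GenerationResult)>;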

Comment on lines 177 to 179
// Prevent indirect prompt injections being sent to the AI model.
// Include break-out strings contained in prompts, as well as the base
// model command separators.
Contributor commented:

nit: would moving this comment to the virtual declaration give us better IntelliSense support?

// conversation prompts.
constexpr char kHumanPrompt[] = "Human:";

class RemoteCompletionClient {
@nullhook (Contributor) commented Sep 7, 2023:

"RemoteCompletionClient" assumes that we're always going to use this API as a completion endpoint. This might not be true. I believe there will be future use cases where we can expand this interface to ask the API for embeddings or audio transcribing. Perhaps, just "RemoteClient"?

@petemill (Member, Author) commented:

RemoteEngineClient?

@petemill (Member, Author) commented:

...but I don't know that those extra items would come from the same class. We can't make those assumptions either. It's all unknown. Right now it's only the completion endpoint.

Contributor commented:

RemoteEngineClient sounds good to me.

@petemill (Member, Author) commented:

If this is just a nit then I'd rather leave it until a potential future refactor, since this currently deals only with the completion endpoint.


#include "base/functional/callback_forward.h"
#include "base/types/expected.h"
#include "brave/components/ai_chat/common/mojom/ai_chat.mojom.h"
@nullhook (Contributor) commented Sep 7, 2023:

You could use the forward variant here, ai_chat.mojom-forward.h, which is lightweight.
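
For context, the generated mojom -forward.h headers contain only forward declarations, so a header that merely names the mojom types can avoid pulling in the full generated definitions:

// In the header, the lightweight forward declarations are enough:
#include "brave/components/ai_chat/common/mojom/ai_chat.mojom-forward.h"

// The full include then belongs in the .cc file that actually uses the types:
// #include "brave/components/ai_chat/common/mojom/ai_chat.mojom.h"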

@@ -3,5 +3,6 @@ include_rules = [
"+services/data_decoder/public",
"+services/network/public",
"+services/service_manager/public",
"+absl/types/optional.h",
Contributor commented:

Where are you using optional?

@petemill (Member, Author) commented:

engine_consumer_llama.cc uses absl::optional

// static
std::string RemoteCompletionClient::GetHumanPromptSegment() {
  return base::StrCat({"\n\n", kHumanPrompt, " "});
}
@petemill (Member, Author) commented Sep 8, 2023:

@LorenzoMinto @nvonpentz is this meant to apply as a stop sequence to both Llama 2 and Claude? I thought it wasn't used for Llama 2, but we're still adding it as a stop sequence for all Llama 2 calls, in addition to the extras each call provides. I'm guessing it doesn't do anything because Llama 2 doesn't generate " Human:"?

@LorenzoMinto (Contributor) commented Sep 8, 2023:

Yes, this shouldn't be added as a stop sequence for Llama, as it doesn't use it. It may not affect generation much, since that sequence ("\n\nHuman:") is generally unlikely, but we should keep it Claude-specific. A sketch of the resulting split follows.
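
A hedged sketch (the QueryPrompt call shape follows the excerpt later in this review; which file each call lives in, and the final callback parameter, are assumptions):

// engine_consumer_claude.cc (assumed): Claude ends a turn before it would
// write a new "\n\nHuman:" segment, so pass that as a stop sequence.
api_->QueryPrompt(prompt,
                  {RemoteCompletionClient::GetHumanPromptSegment()},
                  std::move(completed_callback), data_received_callback);

// engine_consumer_llama.cc (assumed): Llama 2 doesn't use the
// Human:/Assistant: convention, so no Claude stop sequence is added.
api_->QueryPrompt(prompt, {kLlama2Eos}, std::move(completed_callback),
                  data_received_callback);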

@petemill (Member, Author) commented:

done

std::string prompt = BuildLlama2Prompt(conversation_history, page_content,
                                       is_video, human_input);
DCHECK(api_);
api_->QueryPrompt(prompt, {"</response>"}, std::move(completed_callback),
@petemill (Member, Author) commented:

@LorenzoMinto @nvonpentz should we be adding kLlama2Eos as a stop sequence here too, like we do with the question suggestions prompt?

@LorenzoMinto (Contributor) commented Sep 8, 2023:

We should, and we should remove </response> since we don't use it with Llama. It might have been overlooked because it's not explicit what it is. Could we maybe extract it to a named constant? A bare string could mean many different things to a reader who doesn't remember the QueryPrompt signature; see the sketch below.
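
A minimal sketch of that suggestion (kLlama2Eos is named in this thread; the "</s>" value is the standard Llama 2 end-of-sequence token and is an assumption about this codebase):

// Assumed value: the standard Llama 2 end-of-sequence token.
constexpr char kLlama2Eos[] = "</s>";

// A named constant documents intent at the call site, unlike a bare string.
api_->QueryPrompt(prompt, {kLlama2Eos}, std::move(completed_callback),
                  data_received_callback);  // final parameter assumed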

@petemill (Member, Author) commented:

done

@iefremov (Contributor) left a comment:

DEPS lgtm

petemill force-pushed the ai-chat-prompt-abstraction branch from 71735e9 to 34ac136 on September 11, 2023 23:46
Comment on lines +194 to +195
base::ReplaceSubstringsAfterOffset(&input, 0, kHumanPrompt, "");
base::ReplaceSubstringsAfterOffset(&input, 0, kAIPrompt, "");
@petemill (Member, Author) commented:

@LorenzoMinto do you think it's OK that we're stripping Human: from any input (article or human-entered), rather than waiting for the full \n\nHuman:? I think the reason to do it is in case something contains the shorter string without the exact line breaks and Claude still interprets it as a change of speaker. The downside is that it's over-eager and could produce unexpected results, since we're silently removing a substring.
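
To illustrate the trade-off with base::ReplaceSubstringsAfterOffset (the input text here is made up):

#include "base/strings/string_util.h"

// Hypothetical page text containing "Human:" as ordinary prose.
std::string input = "The UN Human: Rights report says...";
base::ReplaceSubstringsAfterOffset(&input, 0, "Human:", "");
// input is now "The UN  Rights report says..." even though the match was not
// a real role marker. Matching only the full "\n\nHuman:" would leave such
// text intact but miss near-miss injection attempts.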

@nullhook (Contributor) left a comment:

++

petemill merged commit 72fbe8a into master on Sep 14, 2023
petemill deleted the ai-chat-prompt-abstraction branch on September 14, 2023 04:12
github-actions bot added this to the 1.60.x - Nightly milestone on Sep 14, 2023
@petemill (Member, Author) commented:

Too fast on the clicking - I did a merge commit instead of a squash 😞
