AI Chat: refactor prompt building and engine calling to EngineConsumer implementations #19988

Merged
petemill merged 6 commits into master from ai-chat-prompt-abstraction on Sep 14, 2023

Conversation

@petemill (Member) commented Sep 5, 2023:

Moves all prompt building from the single AIChatTabHelper class to either EngineConsumerLlama or EngineConsumerClaude, with shared functionality in RemoteCompletionClient (previously AIChatAPI). A sketch of the resulting interface is included after the list below.

The drivers for this are:

  • Make AIChatTabHelper more readable: adding Llama prompt building had introduced a lot of noise.
  • Allow prompt building and API calling to be implemented once and shared with iOS, which can't use TabHelpers.
  • Make prompt building and API use more testable.
  • Lay some groundwork for per-conversation model/engine choice.
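
As a rough sketch of the resulting split (only the class and method names below are taken from this PR and its review thread; the signatures are illustrative assumptions, not the actual header):

#include <string>
#include <vector>

#include "base/functional/callback.h"

class EngineConsumer {
 public:
  virtual ~EngineConsumer() = default;

  // Ask the engine to suggest questions about the current page content.
  virtual void GenerateQuestionSuggestions(
      bool is_video,
      const std::string& page_content,
      base::OnceCallback<void(std::vector<std::string>)> callback) = 0;

  // Ask the engine to respond to the conversation so far.
  virtual void GenerateAssistantResponse(
      bool is_video,
      const std::string& page_content,
      const std::string& human_input,
      base::OnceCallback<void(std::string)> completed_callback) = 0;
};

Each implementation builds prompts in its own model's format and delegates the network call to the shared RemoteCompletionClient (formerly AIChatAPI): EngineConsumerLlama produces Llama 2 prompts, EngineConsumerClaude produces Claude Human:/Assistant: prompts.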

Resolves brave/brave-browser#31821

Some of this will be further refactored when prompt generation is moved to an API.

Also fixes a bug where input sanitization was not applied when using the Llama models.
Resolves brave/brave-browser#32787

Submitter Checklist:

  • I confirm that no security/privacy review is needed and no other type of reviews are needed, or that I have requested them
  • There is a ticket for my issue
  • Used Github auto-closing keywords in the PR description above
  • Wrote a good PR/commit description
  • Squashed any review feedback or "fixup" commits before merge, so that history is a record of what happened in the repo, not your PR
  • Added appropriate labels (QA/Yes or QA/No; release-notes/include or release-notes/exclude; OS/...) to the associated issue
  • Checked the PR locally:
    • npm run test -- brave_browser_tests, npm run test -- brave_unit_tests
    • npm run lint, npm run presubmit, npm run gn_check, npm run tslint
  • Ran git rebase master (if needed)

Reviewer Checklist:

  • A security review is not needed, or a link to one is included in the PR description
  • New files have MPL-2.0 license header
  • Adequate test coverage exists to prevent regressions
  • Major classes, functions and non-trivial code blocks are well-commented
  • Changes in component dependencies are properly reflected in gn
  • Code follows the style guide
  • Test plan is specified in PR before merging

After-merge Checklist:

Test Plan:

For both Llama and Claude models:

  • Page-connected conversations still work
  • Responses are streamed
  • Disconnected conversations still work
  • Suggested questions still work

petemill self-assigned this on Sep 5, 2023
petemill force-pushed the ai-chat-prompt-abstraction branch from a28144c to a804591 on September 6, 2023 04:20
petemill marked this pull request as ready for review on September 6, 2023 04:20
petemill force-pushed the ai-chat-prompt-abstraction branch 2 times, most recently from 63c8654 to 52a62c1, on September 6, 2023 16:12
petemill requested a review from a team as a code owner on September 6, 2023 16:12
petemill force-pushed the ai-chat-prompt-abstraction branch from 52a62c1 to 71735e9 on September 6, 2023 16:13
@LorenzoMinto (Contributor) commented:

There are two occurrences each of "Summarize this video" and "Summarize this page" that could be brought under a single constant. I can't seem to comment on them directly.
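
A minimal sketch of that suggestion (the constant names here are hypothetical; the PR leaves the literals in place):

// Hypothetical shared constants so both call sites stay in sync.
constexpr char kSummarizePagePrompt[] = "Summarize this page";
constexpr char kSummarizeVideoPrompt[] = "Summarize this video";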

    api_request_helper::APIRequestResult result);
void OnEngineCompletionDataReceived(int64_t for_navigation_id,
                                    std::string result);
void OnEngineCompletionComplete(int64_t for_navigation_id,
Contributor commented:

Although Completion is technically correct, this could also be OnEngineGenerationComplete, to avoid the repetition and to align with the suggested-question generation. Same for the declaration above.

@petemill (Member, Author) commented Sep 8, 2023:

OnEngineGenerationComplete - is it generating engines? 😄

Thinking about it, we want to make clear these are the callbacks for EngineConsumer::SubmitHumanInput specifically. So I guess my question for you is your opinion on these function names in EngineConsumer:

  • GenerateQuestionSuggestions
  • SubmitHumanInput

Perhaps the latter should be GenerateAssistantResponse? In which case we can name these OnAssistantResponseData and OnAssistantResponseComplete?

@LorenzoMinto (Contributor) commented Sep 8, 2023:

It's the active verb that sounds a bit off, since the client is merely getting the response rather than generating it. But I don't have a strong opinion about this. GenerateAssistantResponse would follow the suggested-question function name, and I agree it would sound clearer.

Another option could be using Chat() (and OnChatDataReceived/OnChatComplete) and SuggestedQuestion() (nit: from QuestionSuggestions 😅). PS: I was reading OnEngineGenerationComplete as "engine's generation".

@petemill (Member, Author) commented:

The consumer doesn't know that the class is working over a remote API; it could be local for all it cares. But something is generating the questions and the responses, and the consumer is asking for that to happen, so I think it makes sense to go with the new version.

@petemill (Member, Author) commented:

Actually I went with a hybrid approach, since these callbacks are used by all API calls (inside RemoteCompletionClient):

GenerationResult
GenerationDataCallback
GenerationCompletedCallback

It was also the more concise option.
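
A plausible shape for these aliases, given the base/types/expected.h and callback includes visible elsewhere in this PR (the template arguments, especially the error type, are assumptions):

#include <string>

#include "base/functional/callback.h"
#include "base/types/expected.h"

// Success payload or an error description (error type assumed here).
using GenerationResult = base::expected<std::string, std::string>;
// Streams partial completion data as it arrives.
using GenerationDataCallback = base::RepeatingCallback<void(std::string)>;
// Fires once with the full result (or an error) when generation ends.
using GenerationCompletedCallback = base::OnceCallback<void(GenerationResult)>;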

Comment on lines 177 to 179
// Prevent indirect prompt injections being sent to the AI model.
// Include break-out strings contained in prompts, as well as the base
// model command separators.
Contributor commented:

nit: would moving this comment to the virtual declaration give us better IntelliSense support?

// conversation prompts.
constexpr char kHumanPrompt[] = "Human:";

class RemoteCompletionClient {
@nullhook (Contributor) commented Sep 7, 2023:

"RemoteCompletionClient" assumes that we're always going to use this API as a completion endpoint. This might not be true. I believe there will be future use cases where we can expand this interface to ask the API for embeddings or audio transcribing. Perhaps, just "RemoteClient"?

@petemill (Member, Author) commented:

RemoteEngineClient?

@petemill (Member, Author) commented:

...but I don't know that those extra items would come from the same class. We can't make those assumptions either. It's all unknown. Right now it's only the completion endpoint.

Contributor commented:

RemoteEngineClient sounds good to me.

@petemill (Member, Author) commented:

If this is just a nit then I'd rather leave it until a potential future refactor, since this currently deals only with the completion endpoint.


#include "base/functional/callback_forward.h"
#include "base/types/expected.h"
#include "brave/components/ai_chat/common/mojom/ai_chat.mojom.h"
@nullhook (Contributor) commented Sep 7, 2023:

You could use the forward variant here, ai_chat.mojom-forward.h, which is lightweight.
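
For context, the generated mojom -forward.h headers contain only forward declarations, so a header that merely names the mojom types can avoid pulling in the full generated definitions:

// In the header, the lightweight forward declarations are enough:
#include "brave/components/ai_chat/common/mojom/ai_chat.mojom-forward.h"

// The full include then belongs in the .cc file that actually uses the types:
// #include "brave/components/ai_chat/common/mojom/ai_chat.mojom.h"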

@@ -3,5 +3,6 @@ include_rules = [
"+services/data_decoder/public",
"+services/network/public",
"+services/service_manager/public",
"+absl/types/optional.h",
Contributor commented:

Where are you using optional?

@petemill (Member, Author) commented:

engine_consumer_llama.cc uses absl::optional

// static
std::string RemoteCompletionClient::GetHumanPromptSegment() {
  return base::StrCat({"\n\n", kHumanPrompt, " "});
}
@petemill (Member, Author) commented Sep 8, 2023:

@LorenzoMinto @nvonpentz is this meant to apply as a stop sequence to both Llama 2 and Claude? I thought it wasn't used for Llama 2, but we're still adding it as a stop sequence for all Llama 2 calls, in addition to the extras each call provides. I'm guessing it doesn't do anything because Llama 2 doesn't generate " Human:"?

@LorenzoMinto (Contributor) commented Sep 8, 2023:

Yes, this shouldn't be added as a stop sequence for Llama, as it doesn't use it. It may not affect generation much, since that sequence ("\n\nHuman:") is generally unlikely, but we should keep it Claude-specific. A sketch of the resulting split follows.
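
A hedged sketch (the QueryPrompt call shape follows the excerpt later in this review; which file each call lives in, and the final callback parameter, are assumptions):

// engine_consumer_claude.cc (assumed): Claude ends a turn before it would
// write a new "\n\nHuman:" segment, so pass that as a stop sequence.
api_->QueryPrompt(prompt,
                  {RemoteCompletionClient::GetHumanPromptSegment()},
                  std::move(completed_callback), data_received_callback);

// engine_consumer_llama.cc (assumed): Llama 2 doesn't use the
// Human:/Assistant: convention, so no Claude stop sequence is added.
api_->QueryPrompt(prompt, {kLlama2Eos}, std::move(completed_callback),
                  data_received_callback);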

@petemill (Member, Author) commented:

done

std::string prompt = BuildLlama2Prompt(conversation_history, page_content,
                                       is_video, human_input);
DCHECK(api_);
api_->QueryPrompt(prompt, {"</response>"}, std::move(completed_callback),
@petemill (Member, Author) commented:

@LorenzoMinto @nvonpentz should we be adding kLlama2Eos as a stop sequence here too, like we do with the question suggestions prompt?

@LorenzoMinto (Contributor) commented Sep 8, 2023:

We should, and we should remove </response> since we don't use it with Llama. It might have been overlooked because it's not explicit what it is. Could we maybe extract it to a named constant? A bare string could mean many different things to a reader who doesn't remember the QueryPrompt signature; see the sketch below.
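
A minimal sketch of that suggestion (kLlama2Eos is named in this thread; the "</s>" value is the standard Llama 2 end-of-sequence token and is an assumption about this codebase):

// Assumed value: the standard Llama 2 end-of-sequence token.
constexpr char kLlama2Eos[] = "</s>";

// A named constant documents intent at the call site, unlike a bare string.
api_->QueryPrompt(prompt, {kLlama2Eos}, std::move(completed_callback),
                  data_received_callback);  // final parameter assumed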

@petemill (Member, Author) commented:

done

@iefremov (Contributor) left a comment:

DEPS lgtm

petemill force-pushed the ai-chat-prompt-abstraction branch from 71735e9 to 34ac136 on September 11, 2023 23:46
Comment on lines +194 to +195
base::ReplaceSubstringsAfterOffset(&input, 0, kHumanPrompt, "");
base::ReplaceSubstringsAfterOffset(&input, 0, kAIPrompt, "");
@petemill (Member, Author) commented:

@LorenzoMinto do you think it's OK that we're stripping Human: from any input (article or human-entered), rather than waiting for the full \n\nHuman:? I think the reason to do it is in case something contains the shorter string without the exact line breaks and Claude still interprets it as a change of speaker. The downside is that it's over-eager and could produce unexpected results, since we're silently removing a substring.
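
To illustrate the trade-off with base::ReplaceSubstringsAfterOffset (the input text here is made up):

#include "base/strings/string_util.h"

// Hypothetical page text containing "Human:" as ordinary prose.
std::string input = "The UN Human: Rights report says...";
base::ReplaceSubstringsAfterOffset(&input, 0, "Human:", "");
// input is now "The UN  Rights report says..." even though the match was not
// a real role marker. Matching only the full "\n\nHuman:" would leave such
// text intact but miss near-miss injection attempts.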

@nullhook (Contributor) left a comment:

++

petemill merged commit 72fbe8a into master on Sep 14, 2023
petemill deleted the ai-chat-prompt-abstraction branch on September 14, 2023 04:12
github-actions bot added this to the 1.60.x - Nightly milestone on Sep 14, 2023
@petemill (Member, Author) commented:

Too fast on the clicking - I did a merge commit instead of a squash 😞
