FEATURE: Add Multi Modal Capabilities to Flowise #1419

Merged 76 commits on Feb 27, 2024
Changes from 27 commits
Commits
c96572e
GPT Vision - OpenAIVisionChain
vinodkiran Nov 25, 2023
73f7046
GPT Vision: Initial implementation of the OpenAI Vision API
vinodkiran Dec 6, 2023
dc265eb
Merge branch 'main' into FEATURE/Vision
vinodkiran Dec 6, 2023
b492153
GPT Vision: Storing filenames only in chat message
vinodkiran Dec 7, 2023
68fbe0e
GPT Vision: Vision Chain Node update along with addition of chatid fo…
vinodkiran Dec 7, 2023
3257582
GPT Vision: Converting vision into Multi Modal. Base Changes.
vinodkiran Dec 8, 2023
1b308a8
making the chain multi-modal. now we accept audio and image uploads a…
vinodkiran Dec 9, 2023
1bd1fd5
MultiModal: Minor adjustments to layout and categorization of node
vinodkiran Dec 13, 2023
c609c63
MultiModal: start integration of audio input (live recording) for Mul…
vinodkiran Dec 13, 2023
826de70
MultiModal: addition of live recording...
vinodkiran Dec 15, 2023
60800db
Merge branch 'main' into FEATURE/Vision
vinodkiran Dec 15, 2023
c6ae3be
Merge branch 'main' into FEATURE/Vision
vinodkiran Dec 20, 2023
d3ce6f8
Merge branch 'main' into FEATURE/Vision
vinodkiran Dec 21, 2023
7f15494
Merge branch 'main' into FEATURE/Vision
HenryHengZJ Jan 8, 2024
f57daea
Merge branch 'main' into FEATURE/Vision
HenryHengZJ Jan 15, 2024
398a31f
UI touchup
HenryHengZJ Jan 17, 2024
8a14a52
GPT Vision: Renaming to OpenAIMultiModalChain and merging the functio…
vinodkiran Jan 18, 2024
1883111
GPT Vision: Fix for error when only speech input is sent.
vinodkiran Jan 18, 2024
9222aaf
GPT Vision: Updated behaviour to submit voice recording directly with…
vinodkiran Jan 18, 2024
f87d849
GPT Vision: lint fixes
vinodkiran Jan 18, 2024
e774bd3
GPT Vision: Added multi model capabilities to ChatOpenAI and Conversa…
vinodkiran Jan 19, 2024
7e5d8e7
Fix image uploads appear on top of chat messages. Now image uploads w…
0xi4o Jan 22, 2024
59643b6
Fix the flickering issue when dragging files over the chat window
0xi4o Jan 22, 2024
7d0ae52
Fix chat popup styles and remove console statements
0xi4o Jan 22, 2024
f384ad9
Update audio recording ui in internal chat
0xi4o Jan 22, 2024
318686e
Fix issue where audio recording is not sent on stopping recording
0xi4o Jan 23, 2024
3ce22d0
MultiModal : Adding functionality to base OpenAI Chat Model
vinodkiran Jan 24, 2024
d61e3d5
SpeechToText: Adding SpeechToText at the Chatflow level.
vinodkiran Jan 27, 2024
517c2f2
Fix error message when audio recording is not available
0xi4o Jan 30, 2024
1d12208
Fix auto scroll on audio messages
0xi4o Jan 30, 2024
4604594
SpeechToText: Adding SpeechToText at the Chatflow level.
vinodkiran Jan 31, 2024
e81927e
SpeechToText: Adding SpeechToText at the Chatflow level.
vinodkiran Jan 31, 2024
5c8f48c
Multimodal: Image Uploads.
vinodkiran Feb 1, 2024
aa5d141
Multimodal: deleting uploads on delete of all chatmessages
vinodkiran Feb 1, 2024
eab8c19
Multimodal: deleting uploads on delete of all chatmessages or chatflow
vinodkiran Feb 1, 2024
9cd0362
Merge branch 'main' into FEATURE/Vision
HenryHengZJ Feb 2, 2024
a219efc
Rename MultiModalUtils.ts to multiModalUtils.ts
HenryHengZJ Feb 2, 2024
c5bd4d4
address configuration fix and add BLOB_STORAGE_PATH env variable
HenryHengZJ Feb 2, 2024
a4131dc
add fixes for chaining
HenryHengZJ Feb 2, 2024
041bfea
add more params
HenryHengZJ Feb 2, 2024
c504f91
Multimodal: guard to check for nodeData before image message insertion.
vinodkiran Feb 2, 2024
8c494cf
Fix UI issues - chat window height, image & audio styling, and image …
0xi4o Feb 6, 2024
9072e69
Return uploads config in public chatbot config endpoint
0xi4o Feb 12, 2024
0a54db7
Update how uploads config is sent
0xi4o Feb 12, 2024
11219c6
Fix audio recording not sending when recording stops
0xi4o Feb 13, 2024
2056703
Check if uploads are enabled/changed on chatflow save and update chat…
0xi4o Feb 14, 2024
56b2186
Send uploads config if available, even when chatbot config is not ava…
0xi4o Feb 14, 2024
dcb1ad1
Merge branch 'main' into FEATURE/Vision
HenryHengZJ Feb 14, 2024
86da67f
add missing human text when image presents
HenryHengZJ Feb 14, 2024
44c1f54
Showing image/audio files in the View Messages Dialog
vinodkiran Feb 14, 2024
a71c5a1
fix for concurrent requests for media handling
vinodkiran Feb 14, 2024
85809a9
fix for concurrency
HenryHengZJ Feb 14, 2024
6acc921
ViewMessages->Export Messages. Add Fullpath of the image/audio file.
vinodkiran Feb 14, 2024
9c874bb
Concurrency fixes - correcting wrong id
vinodkiran Feb 15, 2024
52ffa17
Multimodal Fixes...removing all static methods/variables.
vinodkiran Feb 15, 2024
10fc1bf
Multimodal Fixes for cyclic (circular) dependencies during langsmith …
vinodkiran Feb 16, 2024
81c07dc
Update UI of speech to text dialog
0xi4o Feb 19, 2024
5aa991a
Update how uploads are shown in view messages dialog
0xi4o Feb 19, 2024
46c4701
Merge branch 'main' into FEATURE/Vision
HenryHengZJ Feb 19, 2024
d313dc6
Show transcribed audio inputs as message along with audio clip in int…
0xi4o Feb 19, 2024
8bad360
Remove status indicator in speech to text configuration
0xi4o Feb 19, 2024
b31e871
reverting all image upload logic to individual chains/agents
vinodkiran Feb 19, 2024
97a376d
Fix local state sync issue, STT auth issue, and add none option for s…
0xi4o Feb 20, 2024
51c2a93
Merge remote-tracking branch 'origin/FEATURE/Vision' into FEATURE/Vision
vinodkiran Feb 20, 2024
0bc8559
Merge branch 'main' into FEATURE/Vision
vinodkiran Feb 20, 2024
4cee518
image uploads for mrkl agent
vinodkiran Feb 20, 2024
d172802
Merge branch 'main' into feature/Vision
HenryHengZJ Feb 21, 2024
a48edcd
touchup fixes
HenryHengZJ Feb 21, 2024
4071fe5
add default none option
HenryHengZJ Feb 21, 2024
35d3b93
Merge branch 'main' into feature/Vision
HenryHengZJ Feb 21, 2024
e86550a
update marketplace templates
HenryHengZJ Feb 22, 2024
7e84268
Add content-disposition package for handling content disposition resp…
0xi4o Feb 23, 2024
e55975e
Revert useEffect in async dropdown and input components
0xi4o Feb 23, 2024
b884e93
fix speech to text dialog credential, fix url changed when clicked se…
HenryHengZJ Feb 24, 2024
bca7e82
Merge branch 'main' into FEATURE/Vision
HenryHengZJ Feb 26, 2024
68ac61c
fix speech to dialog state
HenryHengZJ Feb 26, 2024

@@ -1,4 +1,4 @@
-import { FlowiseMemory, ICommonObject, IMessage, INode, INodeData, INodeParams } from '../../../src/Interface'
+import { FlowiseMemory, ICommonObject, INode, INodeData, INodeParams } from '../../../src/Interface'
 import { ConversationChain } from 'langchain/chains'
 import { getBaseClasses } from '../../../src/utils'
 import { ChatPromptTemplate, HumanMessagePromptTemplate, MessagesPlaceholder, SystemMessagePromptTemplate } from 'langchain/prompts'
@@ -8,6 +8,7 @@ import { flatten } from 'lodash'
 import { Document } from 'langchain/document'
 import { RunnableSequence } from 'langchain/schema/runnable'
 import { StringOutputParser } from 'langchain/schema/output_parser'
+import { injectChainNodeData } from '../../../src/MultiModalUtils'

 let systemMessage = `The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.`
 const inputKey = 'input'
@@ -67,13 +68,15 @@ class ConversationChain_Chains implements INode {
     }

     async init(nodeData: INodeData, _: string, options: ICommonObject): Promise<any> {
-        const chain = prepareChain(nodeData, this.sessionId, options.chatHistory)
+        const chain = prepareChain(nodeData, options, this.sessionId)
         return chain
     }

     async run(nodeData: INodeData, input: string, options: ICommonObject): Promise<string> {
         const memory = nodeData.inputs?.memory
-        const chain = prepareChain(nodeData, this.sessionId, options.chatHistory)
+        injectChainNodeData(nodeData, options)
+
+        const chain = prepareChain(nodeData, options, this.sessionId)

         const loggerHandler = new ConsoleCallbackHandler(options.logger)
         const callbacks = await additionalCallbacks(nodeData, options)
@@ -105,7 +108,7 @@ class ConversationChain_Chains implements INode {
     }
 }

-const prepareChatPrompt = (nodeData: INodeData) => {
+const prepareChatPrompt = (nodeData: INodeData, options: ICommonObject) => {
     const memory = nodeData.inputs?.memory as FlowiseMemory
     const prompt = nodeData.inputs?.systemMessagePrompt as string
     const docs = nodeData.inputs?.document as Document[]
@@ -128,16 +131,19 @@ const prepareChatPrompt = (nodeData: INodeData) => {

     if (finalText) systemMessage = `${systemMessage}\nThe AI has the following context:\n${finalText}`

-    const chatPrompt = ChatPromptTemplate.fromMessages([
+    //TODO, this should not be any[], what interface should it be?
Contributor

interface should be BaseMessage[]?

+    let promptMessages: any[] = [
         SystemMessagePromptTemplate.fromTemplate(prompt ? `${prompt}\n${systemMessage}` : systemMessage),
         new MessagesPlaceholder(memory.memoryKey ?? 'chat_history'),
         HumanMessagePromptTemplate.fromTemplate(`{${inputKey}}`)
-    ])
+    ]
+    const chatPrompt = ChatPromptTemplate.fromMessages(promptMessages)

     return chatPrompt
 }

-const prepareChain = (nodeData: INodeData, sessionId?: string, chatHistory: IMessage[] = []) => {
+const prepareChain = (nodeData: INodeData, options: ICommonObject, sessionId?: string) => {
+    const chatHistory = options.chatHistory
     const model = nodeData.inputs?.model as BaseChatModel
     const memory = nodeData.inputs?.memory as FlowiseMemory
     const memoryKey = memory.memoryKey ?? 'chat_history'
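
The TODO and the review comment in this hunk both ask which element type should replace `any[]`, and the diff does not settle it. As a rough sketch of the general idea only, the array can be typed against a shared interface that every element implements. `PromptPart`, `TemplatePart` and `PlaceholderPart` below are hypothetical stand-ins written for this illustration; they are not LangChain classes, and the right LangChain type would need to be checked against the library version in use.

```typescript
// Stand-in types for illustration only; these are NOT LangChain's classes.
interface PromptPart {
    readonly kind: 'system' | 'placeholder' | 'human'
    render(values: Record<string, string>): string
}

class TemplatePart implements PromptPart {
    constructor(readonly kind: 'system' | 'human', private readonly template: string) {}

    render(values: Record<string, string>): string {
        // Replace each {name} with its value; leave unknown names untouched.
        return this.template.replace(/\{(\w+)\}/g, (match: string, name: string) => {
            const value = values[name]
            return value !== undefined ? value : match
        })
    }
}

class PlaceholderPart implements PromptPart {
    readonly kind = 'placeholder' as const

    constructor(private readonly variableName: string) {}

    render(values: Record<string, string>): string {
        const value = values[this.variableName]
        return value !== undefined ? value : ''
    }
}

// Typed against the shared interface rather than any[], so a stray
// string or number in the array is rejected at compile time.
const promptMessages: PromptPart[] = [
    new TemplatePart('system', 'You are a helpful assistant.'),
    new PlaceholderPart('chat_history'),
    new TemplatePart('human', '{input}')
]

const rendered: string[] = promptMessages.map((part) =>
    part.render({ chat_history: '(no history)', input: 'Hello' })
)
```

With a concrete element type, later code that appends to `promptMessages` (for example an image message) gets type checking instead of silently accepting any value.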
@@ -150,7 +156,7 @@ const prepareChain = (nodeData: INodeData, sessionId?: string, chatHistory: IMes
                 return history
             }
         },
-        prepareChatPrompt(nodeData),
+        prepareChatPrompt(nodeData, options),
         model,
         new StringOutputParser()
     ])
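
The hunks in this file change `prepareChain` from taking `chatHistory` as a trailing positional argument (defaulting to `[]`) to reading it from `options`. A minimal sketch of that refactor follows, using hypothetical stand-in types and simplified function bodies that only mirror the names in the diff. One detail worth noting: the new code reads `options.chatHistory` with no default, so the `?? []` fallback below is an assumption added for this sketch, not something the diff contains.

```typescript
// Hypothetical stand-ins; the names mirror the diff, the bodies are illustrative only.
interface IMessage {
    type: 'userMessage' | 'apiMessage'
    message: string
}

interface ICommonObject {
    [key: string]: any
}

// Before: chatHistory is a trailing positional parameter with a default value.
const prepareChainBefore = (sessionId?: string, chatHistory: IMessage[] = []) => ({
    sessionId,
    historyLength: chatHistory.length
})

// After: the whole options bag is passed in and chatHistory is read from it.
// The `?? []` fallback is an assumption of this sketch; the diff has no default.
const prepareChainAfter = (options: ICommonObject, sessionId?: string) => {
    const chatHistory: IMessage[] = options.chatHistory ?? []
    return { sessionId, historyLength: chatHistory.length }
}

const options: ICommonObject = {
    chatHistory: [{ type: 'userMessage', message: 'hi' }]
}

const before = prepareChainBefore('session-1', options.chatHistory)
const after = prepareChainAfter(options, 'session-1')
const withoutHistory = prepareChainAfter({}, 'session-2')
```

Passing the options bag keeps the call sites stable when further fields (such as upload data) need to reach the prompt builder, at the cost of losing the explicit default that the old signature provided.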
2 changes: 2 additions & 0 deletions packages/components/nodes/chains/LLMChain/LLMChain.ts
@@ -8,6 +8,7 @@ import { formatResponse, injectOutputParser } from '../../outputparsers/OutputPa
 import { BaseLLMOutputParser } from 'langchain/schema/output_parser'
 import { OutputFixingParser } from 'langchain/output_parsers'
 import { checkInputs, Moderation, streamResponse } from '../../moderation/Moderation'
+import { injectChainNodeData } from '../../../src/MultiModalUtils'

 class LLMChain_Chains implements INode {
     label: string
@@ -129,6 +130,7 @@ class LLMChain_Chains implements INode {
         if (!this.outputParser && outputParser) {
             this.outputParser = outputParser
         }
+        injectChainNodeData(nodeData, options)
Contributor Author

@HenryHengZJ review critically

         promptValues = injectOutputParser(this.outputParser, chain, promptValues)
         const res = await runPrediction(inputVariables, chain, input, promptValues, options, nodeData)
         // eslint-disable-next-line no-console