hemslo/chat-search

chat-search


Chat with documents, search via natural language.

chat-search adds chat capabilities to a website using hybrid search and language models. The RAG (retrieval-augmented generation) pipeline is built with LangChain and Redis, and supports various model providers (OpenAI, Ollama, vLLM, Hugging Face).

Demo: Chat about my blog

Usage

Setup .env

cp .env.example .env

Populate the .env file with the required environment variables.

| Name | Value | Default |
| --- | --- | --- |
| AUTH_TOKEN | auth token used for ingest | |
| CHAT_PROVIDER | model provider, openai or ollama | openai |
| DEBUG | enable debug mode, 1 or 0 | 0 |
| DIGEST_PREFIX | prefix for digest in Redis | digest |
| DOCUMENT_CONTENT_DESCRIPTION | document content description | Document content |
| EMBEDDING_DIM | embedding dimensions | 1536 |
| EMBEDDING_PROVIDER | embedding provider, openai, ollama, or huggingface | openai |
| ENABLE_FEEDBACK_ENDPOINT | enable feedback endpoint, 1 or 0 | 1 |
| ENABLE_PUBLIC_TRACE_LINK_ENDPOINT | enable public trace link endpoint, 1 or 0 | 1 |
| FULLTEXT_RETRIEVER_SEARCH_K | number of fulltext retriever search results | 4 |
| FULLTEXT_RETRIEVER_SELF_QUERY | whether to enable fulltext retriever self query, 1 or 0 | 1 |
| FULLTEXT_RETRIEVER_WEIGHT | fulltext retriever weight | 0.5 |
| HEADERS_TO_SPLIT_ON | HTML headers to split text on | h1,h2,h3 |
| HF_HUB_EMBEDDING_MODEL | Hugging Face Hub embedding model or Text Embeddings Inference URL | http://localhost:8080 |
| INDEX_NAME | index name | document |
| INDEX_SCHEMA_PATH | index schema path | (will use app/schema.yaml) |
| LANGCHAIN_API_KEY | LangChain API key for LangSmith | |
| LANGCHAIN_ENDPOINT | LangChain endpoint for LangSmith | https://api.smith.langchain.com |
| LANGCHAIN_PROJECT | LangChain project for LangSmith | default |
| LANGCHAIN_TRACING_V2 | enable LangChain tracing v2 | true |
| LLM_TEMPERATURE | temperature for LLM | 0 |
| MERGE_SYSTEM_PROMPT | merge system prompt with user input, for models that do not support the system role, 1 or 0 | 0 |
| OLLAMA_CHAT_MODEL | Ollama chat model | llama3 |
| OLLAMA_EMBEDDING_MODEL | Ollama embedding model | nomic-embed-text |
| OLLAMA_URL | Ollama URL | http://localhost:11434 |
| OPENAI_API_BASE | OpenAI-compatible API base URL | https://api.openai.com/v1 |
| OPENAI_API_KEY | OpenAI API key | EMPTY |
| OPENAI_CHAT_MODEL | OpenAI chat model | gpt-4o-mini |
| OPENAI_EMBEDDING_MODEL | OpenAI embedding model | text-embedding-3-small |
| OTEL_SDK_DISABLED | disable OpenTelemetry, true or false | false |
| OTEL_SERVICE_NAME | OpenTelemetry service name, also used as the Pyroscope application name | chat-search |
| PYROSCOPE_BASIC_AUTH_PASSWORD | Pyroscope basic auth password | |
| PYROSCOPE_BASIC_AUTH_USERNAME | Pyroscope basic auth username | |
| PYROSCOPE_ENABLED | enable Pyroscope, 1 or 0 | 1 |
| PYROSCOPE_SERVER_ADDRESS | Pyroscope server address | http://localhost:4040 |
| REDIS_URL | Redis URL | redis://localhost:6379/ |
| REPHRASE_PROMPT | prompt for rephrasing | check config.py |
| RETRIEVAL_QA_CHAT_SYSTEM_PROMPT | prompt for retrieval | check config.py |
| RETRIEVER_SEARCH_K | number of retriever search results | 4 |
| RETRIEVER_SELF_QUERY_EXAMPLES | retriever self-query examples as JSON | check config.py |
| TEXT_SPLIT_CHUNK_OVERLAP | chunk overlap for text splitting | 200 |
| TEXT_SPLIT_CHUNK_SIZE | chunk size for text splitting | 4000 |
| VECTORSTORE_RETRIEVER_SEARCH_KWARGS | search kwargs for Redis vectorstore retriever as JSON | check config.py |
| VECTORSTORE_RETRIEVER_SEARCH_TYPE | search type for Redis vectorstore retriever | mmr |
| VECTORSTORE_RETRIEVER_SELF_QUERY | whether to enable vectorstore retriever self query, 1 or 0 | 1 |
| VECTORSTORE_RETRIEVER_WEIGHT | vectorstore retriever weight | 0.5 |
| VERBOSE | enable verbose output, 1 or 0 | 0 |
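
For example, a minimal .env for a fully local setup with Ollama and Redis might look like this (values are illustrative; adjust to your environment):

```shell
AUTH_TOKEN=change-me
CHAT_PROVIDER=ollama
EMBEDDING_PROVIDER=ollama
# nomic-embed-text produces 768-dimensional embeddings
EMBEDDING_DIM=768
OLLAMA_URL=http://localhost:11434
REDIS_URL=redis://localhost:6379/
# disable LangSmith tracing if you have no LANGCHAIN_API_KEY
LANGCHAIN_TRACING_V2=false
```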

Start Ollama (Optional)

Follow the Ollama instructions:

ollama serve
ollama pull llama3
ollama pull nomic-embed-text

Run on host

Install dependencies

pip install poetry==1.7.1
poetry shell
poetry install

Start dependencies

Start redis

docker compose -f compose.redis.yaml up

Launch LangServe

langchain serve

Visit http://localhost:8000/

Run in Docker

There is a compose.yml file for running the app and all dependencies in containers, suitable for local end-to-end testing.

docker compose up --build

Visit http://localhost:8000/

Run in Kubernetes

There is a Helm chart for deploying the app in Kubernetes.

Config Helm values

cp values.example.yaml values.yaml

Then update values.yaml accordingly.

Using Helm

Add helm repos:

helm repo add chat-search https://hemslo.github.io/chat-search/
helm repo add redis-stack https://redis-stack.github.io/helm-redis-stack/
helm repo add ollama-helm https://otwld.github.io/ollama-helm/

Install/Upgrade chat-search

helm upgrade -i --wait my-chat-search chat-search/chat-search -f values.yaml

Using Skaffold for local development

skaffold run --port-forward

Ingest data

crawl --sitemap-url $SITEMAP_URL --auth-token $AUTH_TOKEN

Check crawl.yml for web crawling.

For an example of auto-ingesting after a GitHub Pages deploy, see jekyll.yml.
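
As a rough sketch, a post-deploy ingest step in a GitHub Actions workflow could look like the following. The job and secret names here are assumptions for illustration; see jekyll.yml for the actual setup:

```yaml
# Hypothetical post-deploy ingest job; adapt from jekyll.yml.
ingest:
  needs: deploy
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
      with:
        python-version: "3.12"
    - run: pip install poetry==1.7.1 && poetry install
    - run: poetry run crawl --sitemap-url "$SITEMAP_URL" --auth-token "$AUTH_TOKEN"
      env:
        SITEMAP_URL: ${{ vars.SITEMAP_URL }}
        AUTH_TOKEN: ${{ secrets.AUTH_TOKEN }}
```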

Architecture

Ingest

flowchart LR
  A(Crawl) --> |doc| B(/ingest)
  B --> |metadata| C(Redis)
  B --> |doc| D(Text Splitter)
  D --> |docs| E(Embedding Model)
  E --> |docs with embeddings| F(Redis)
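
The Text Splitter step above is controlled by TEXT_SPLIT_CHUNK_SIZE and TEXT_SPLIT_CHUNK_OVERLAP. As a minimal sketch of fixed-size chunking with overlap (illustrative only; the app itself uses LangChain's text splitters):

```python
def split_text(text: str, chunk_size: int = 4000, chunk_overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters,
    where consecutive chunks share chunk_overlap characters."""
    if chunk_size <= chunk_overlap:
        raise ValueError("chunk_size must be greater than chunk_overlap")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(str(i % 10) for i in range(10_000))
chunks = split_text(text)
print(len(chunks))                          # → 3
print(chunks[0][-200:] == chunks[1][:200])  # → True (consecutive chunks overlap)
```

Each chunk is then embedded and stored in Redis alongside the document metadata.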

Query

flowchart LR
  A((Request)) --> |messages| B(/chat)
  B --> |messages| C(LLM)
  C --> |question| D(Embedding Model)
  D --> |embeddings| E(Redis)
  E --> |relevant docs| F(LLM)
  B --> |messages|F
  F --> |answer| G((Response))
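
Retrieval blends the fulltext and vectorstore retrievers using FULLTEXT_RETRIEVER_WEIGHT and VECTORSTORE_RETRIEVER_WEIGHT, likely via something like LangChain's EnsembleRetriever, which fuses rankings with weighted reciprocal rank fusion. A sketch of that fusion (function and document names are illustrative, not the app's actual API):

```python
def weighted_rrf(rankings: dict[str, list[str]],
                 weights: dict[str, float],
                 k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc ids via weighted reciprocal rank fusion:
    each retriever contributes weight / (k + rank + 1) per document."""
    scores: dict[str, float] = {}
    for name, ranked_ids in rankings.items():
        for rank, doc_id in enumerate(ranked_ids):
            scores[doc_id] = scores.get(doc_id, 0.0) + weights[name] / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

merged = weighted_rrf(
    {"fulltext": ["a", "b", "c"], "vector": ["b", "d", "a"]},
    {"fulltext": 0.5, "vector": 0.5},
)
print(merged)  # → ['b', 'a', 'd', 'c']
```

Documents ranked highly by both retrievers (like "b" above) rise to the top, which is the point of the hybrid setup.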

Deployment

Check cicd.yml for Google Cloud Run deployment; see also deploy-to-cloud-run.