This repo demonstrates a deterministic solution for searching documents and data, leveraging Gen AI technologies to assist end users with their domain-specific searches. It also examines some of the ground realities and common assumptions in this space, e.g. that Large Language Models (LLMs) are absolutely required for RAG solutions. In RAG solutions where the business domain is well scoped to a localized search, the actual search is provided by the vector database, and LLMs are only used to convert the response into more natural-sounding language (worldly knowledge is not much required here). LLMs often hallucinate and produce inaccurate results when used in straight-through processing, which makes them unsuitable for matters of consequence; Small Language Models (SLMs) could potentially be a better alternative here. This repo enables a configurable combination of vector DB and S/LLMs to evaluate the optimal solution for the desired outcomes.
- Search pre-indexed documents using the vector DB and, optionally, respond in natural language using S/LLM models.
- Search databases (e.g. Influx, SQL) using predefined queries stored in the vector DB or synthesized by an S/LLM model, and, optionally, respond in natural language using S/LLM models.
There are two primary use cases handled by this solution; they are described below along with their respective flows.
Users may want to search existing documents; these can be machine manuals in the industrial domain or regulatory/compliance policies in the financial domain.
On the database search side, this solution addresses the challenge faced by non-IT users who would benefit from data exploration beyond the well-thought-out, predefined queries written by IT upfront. For illustration, a predefined question-to-query pair might look like the sketch below.
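The shape below is a hypothetical illustration only: the field names, bucket, and measurement are invented for this sketch, and the actual storage format is defined by the solution itself. It shows the idea of pairing a natural-language question with a ready-made InfluxDB Flux query in the vector DB, so that a similarity match on the question retrieves a safe, predefined query to run.

```json
{
  "question": "What was the average temperature over the last 24 hours?",
  "query": "from(bucket: \"sensors\") |> range(start: -24h) |> filter(fn: (r) => r._measurement == \"temperature\") |> mean()"
}
```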
This section describes the steps to deploy this solution in your environment.
To run in GitHub Codespaces:

- Open this codespace in your browser or in your local Visual Studio Code.
- Install dependent services:

  ```shell
  make setup
  ```

- Run Document API:
  - Run the document search service:

    ```shell
    make run_doc
    ```

  - Open the Swagger link `http://localhost:5152/swagger/index.html` if you are on VS Code, or, if you are on Codespaces in a browser, open the Swagger link by appending `/swagger/index.html` to the hostname from the Ports tab. An example request is sketched after these steps.
- Run Data API:
  - Configure Influx DB as described here.
  - Run the data search service:

    ```shell
    make run_db
    ```

  - Open the Swagger link `http://localhost:5155/swagger/index.html` if you are on VS Code, or, if you are on Codespaces in a browser, open the Swagger link by appending `/swagger/index.html` to the hostname from the Ports tab.
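Whichever way you run it, once the Document API is up you can also exercise it from the command line. The route and payload below are illustrative assumptions, not the repo's actual contract; consult the Swagger UI for the real endpoints.

```shell
# Illustrative request; the route and body shape are assumptions, check Swagger for the actual API.
curl -X POST "http://localhost:5152/search" \
  -H "Content-Type: application/json" \
  -d '{"query": "How do I reset the machine to factory settings?"}'
```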
Alternatively, to run locally:

- Clone the repo:

  ```shell
  git clone [email protected]:suneetnangia/rag-doc-data-search.git && cd rag-doc-data-search
  ```

- Optionally, open the repo in a pre-configured Dev Container.
- Install dependent services:

  ```shell
  make setup
  ```

- Run Document API:
  - Run the document search service:

    ```shell
    make run_doc
    ```

  - Open the Swagger link `http://localhost:5152/swagger/index.html` to try the APIs.
- Run Data API:
  - Configure Influx DB as described here.
  - Run the data search service:

    ```shell
    make run_db
    ```

  - Open the Swagger link `http://localhost:5155/swagger/index.html` to try the APIs (an example request is sketched below).
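Similarly, once the Data API is running, a hypothetical request might look like the following; again, the route and payload are assumptions, so check the Swagger UI for the actual contract.

```shell
# Illustrative request; the route and body shape are assumptions, check Swagger for the actual API.
curl -X POST "http://localhost:5155/search" \
  -H "Content-Type: application/json" \
  -d '{"query": "What was the average temperature over the last 24 hours?"}'
```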
This repo makes use of Ollama to host both the embeddings models and the S/LLM models. Ollama provides various options for hosting and managing models; we surface some of those options, along with vector DB options, in this solution, and they can be configured via appsettings.
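As a rough sketch, such a configuration might look like the following. The key names, model names, and endpoints are illustrative assumptions, not the repo's actual schema; the Ollama endpoint shown is its default port, and .NET's JSON configuration provider permits comments.

```json
// Illustrative appsettings sketch; key names and values are assumptions, not the repo's actual schema.
{
  "Ollama": {
    "Endpoint": "http://localhost:11434",
    "EmbeddingModel": "all-minilm",
    "LanguageModel": "phi3"
  },
  "VectorDb": {
    "Endpoint": "http://localhost:6333",
    "CollectionName": "documents"
  }
}
```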
The following potential extensions can provide layers on top of this solution, offering an on-ramp for various use cases.
- CLI Repo: Provides access to the solution via a CLI for scripting and automation.
- Bootstrapping Repo: Loads sample data into the solution.
- K8s Repo: Deploys the solution in a Kubernetes (K8s) setting using the sidecar pattern.