Updated RankLLM integration #81

Open · wants to merge 15 commits into main

Conversation

@clides commented May 23, 2025

Updated the existing RankLLM integration for langchain-community with the latest params, function names, etc.

Example with updated usage for RankLLM:

Install all the packages:

pip install langchain-community faiss-gpu torch transformers sentence-transformers huggingface-hub rank_llm
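Note: faiss-gpu assumes a CUDA-capable machine; on CPU-only setups, faiss-cpu is the drop-in substitute (the device below would then need to be "cpu").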

Download the example document:
https://github.com/hwchase17/chat-your-data/blob/master/state_of_the_union.txt
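To fetch it programmatically instead of downloading it by hand, a minimal sketch (the raw.githubusercontent.com URL is an assumption derived from the linked repo path):

import urllib.request

# Assumed raw-file mirror of the linked document.
url = "https://raw.githubusercontent.com/hwchase17/chat-your-data/master/state_of_the_union.txt"
urllib.request.urlretrieve(url, "state_of_the_union.txt")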

Set up the base vector store retriever:

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
import torch  # used below to clear the CUDA cache before reranking

device = "cuda"

# Load the document and split it into overlapping 500-character chunks.
documents = TextLoader("state_of_the_union.txt").load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
texts = text_splitter.split_documents(documents)
for idx, text in enumerate(texts):
    text.metadata["id"] = idx  # tag each chunk so reranked results are traceable

embedding = HuggingFaceEmbeddings(
    model_name="BAAI/bge-small-en",  # or any embedding model of your choice
    model_kwargs={"device": device},
    encode_kwargs={"normalize_embeddings": True},
)

# Index the chunks in FAISS and retrieve the top 20 candidates per query.
retriever = FAISS.from_documents(texts, embedding).as_retriever(search_kwargs={"k": 20})
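The snippets below call a pretty_print_docs helper that is not part of this PR; a minimal sketch of one:

def pretty_print_docs(docs):
    # Print each document's content, separated by a divider line.
    print(
        f"\n{'-' * 100}\n".join(
            f"Document {i + 1}:\n\n{doc.page_content}" for i, doc in enumerate(docs)
        )
    )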

Retrieval without reranking:

query = "What was done to Russia?"
docs = retriever.invoke(query)
pretty_print_docs(docs)

All the field arguments to RankLLMRerank:

model_path: str = Field(default="rank_zephyr")          # reranker model name or path
top_n: int = Field(default=3)                           # documents returned after reranking
window_size: int = Field(default=20)                    # passages scored per sliding window
context_size: int = Field(default=4096)                 # model context length in tokens
prompt_mode: str = Field(default="rank_GPT")            # prompt template used for listwise ranking
num_gpus: int = Field(default=1)                        # GPUs used for inference
num_few_shot_examples: int = Field(default=0)           # few-shot examples added to the prompt
few_shot_file: Optional[str] = Field(default=None)      # path to a file of few-shot examples
use_logits: bool = Field(default=False)                 # score passages from output logits
use_alpha: bool = Field(default=False)                  # use alphabetical passage identifiers (A, B, ...)
variable_passages: bool = Field(default=False)          # allow a variable number of passages per window
stride: int = Field(default=10)                         # how far the sliding window advances
use_azure_openai: bool = Field(default=False)           # route GPT rerankers through Azure OpenAI
model_coordinator: Any = Field(default=None, exclude=True)  # internal RankLLM coordinator, set at runtime
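For instance, a non-default configuration might look like this (a sketch; the parameter values are illustrative, not tuned recommendations):

from langchain_community.document_compressors import RankLLMRerank

compressor = RankLLMRerank(
    model_path="rank_zephyr",  # listwise reranker model
    top_n=5,                   # keep the 5 best passages
    window_size=20,            # passages scored per sliding window
    stride=10,                 # advance the window 10 passages at a time
    variable_passages=True,    # tolerate windows smaller than window_size
)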

Retrieval with reranking (default RankLLM model is rank_zephyr):

from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
from langchain_community.document_compressors import RankLLMRerank

torch.cuda.empty_cache()  # free cached GPU memory before loading the reranker

compressor = RankLLMRerank(top_n=3, model_path="rank_zephyr")
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)

del compressor  # drops only the local name; the retriever still holds the compressor

compressed_docs = compression_retriever.invoke(query)
pretty_print_docs(compressed_docs)
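The reranking retriever can also feed a QA chain. A minimal sketch, assuming langchain-openai is installed and OPENAI_API_KEY is set; the chain type and model name are illustrative and not part of this PR:

from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4o-mini", temperature=0),
    retriever=compression_retriever,  # reranked retriever from above
)

print(chain.invoke({"query": query})["result"])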

@eyurtsev (Contributor) left a comment

Hi @clides, thanks for the contribution. At the moment, this code isn't tested or documented at a level where it can be merged into langchain-community.

As a rule of thumb, the amount of documentation should be similar to the amount of code, and there should likely be some examples of usage.

It looks like there are some breaking changes in this code? Is that correct?

@clides (Author) commented Jun 3, 2025

> It looks like there are some breaking changes in this code? Is that correct?

Yep, there are some changes in RankLLM (i.e. new params, changed function/class names, etc.).

@clides (Author) commented Jun 12, 2025

> As a rule of thumb, the amount of documentation should be similar to the amount of code, and there should likely be some examples of usage.

Added more comments as requested.
