Updated RankLLM integration #81

Open · wants to merge 15 commits into main

Conversation

@clides commented May 23, 2025

Updated the existing RankLLM integration for langchain-community with the latest params, function names, etc.

Example with updated usage for RankLLM:

Install all the packages:

pip install langchain-community faiss-gpu torch transformers sentence-transformers huggingface-hub rank_llm
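Note: faiss-gpu assumes a CUDA-capable machine; on CPU-only setups, faiss-cpu is the drop-in substitute (the device below would then need to be "cpu").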

Download the example document:
https://github.com/hwchase17/chat-your-data/blob/master/state_of_the_union.txt
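To fetch it programmatically instead of downloading it by hand, a minimal sketch (the raw.githubusercontent.com URL is an assumption derived from the linked repo path):

import urllib.request

# Assumed raw-file mirror of the linked document.
url = "https://raw.githubusercontent.com/hwchase17/chat-your-data/master/state_of_the_union.txt"
urllib.request.urlretrieve(url, "state_of_the_union.txt")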

Set up the base vector store retriever:

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
import torch  # used below to clear the CUDA cache before reranking

device = "cuda"

# Load the document and split it into overlapping 500-character chunks.
documents = TextLoader("state_of_the_union.txt").load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
texts = text_splitter.split_documents(documents)
for idx, text in enumerate(texts):
    text.metadata["id"] = idx  # tag each chunk so reranked results are traceable

embedding = HuggingFaceEmbeddings(
    model_name="BAAI/bge-small-en",  # or any embedding model of your choice
    model_kwargs={"device": device},
    encode_kwargs={"normalize_embeddings": True},
)

# Index the chunks in FAISS and retrieve the top 20 candidates per query.
retriever = FAISS.from_documents(texts, embedding).as_retriever(search_kwargs={"k": 20})
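The snippets below call a pretty_print_docs helper that is not part of this PR; a minimal sketch of one:

def pretty_print_docs(docs):
    # Print each document's content, separated by a divider line.
    print(
        f"\n{'-' * 100}\n".join(
            f"Document {i + 1}:\n\n{doc.page_content}" for i, doc in enumerate(docs)
        )
    )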

Retrieval without reranking:

query = "What was done to Russia?"
docs = retriever.invoke(query)
pretty_print_docs(docs)

All the field arguments to RankLLMRerank:

model_path: str = Field(default="rank_zephyr")          # reranker model name or path
top_n: int = Field(default=3)                           # documents returned after reranking
window_size: int = Field(default=20)                    # passages scored per sliding window
context_size: int = Field(default=4096)                 # model context length in tokens
prompt_mode: str = Field(default="rank_GPT")            # prompt template used for listwise ranking
num_gpus: int = Field(default=1)                        # GPUs used for inference
num_few_shot_examples: int = Field(default=0)           # few-shot examples added to the prompt
few_shot_file: Optional[str] = Field(default=None)      # path to a file of few-shot examples
use_logits: bool = Field(default=False)                 # score passages from output logits
use_alpha: bool = Field(default=False)                  # use alphabetical passage identifiers (A, B, ...)
variable_passages: bool = Field(default=False)          # allow a variable number of passages per window
stride: int = Field(default=10)                         # how far the sliding window advances
use_azure_openai: bool = Field(default=False)           # route GPT rerankers through Azure OpenAI
model_coordinator: Any = Field(default=None, exclude=True)  # internal RankLLM coordinator, set at runtime
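For instance, a non-default configuration might look like this (a sketch; the parameter values are illustrative, not tuned recommendations):

from langchain_community.document_compressors import RankLLMRerank

compressor = RankLLMRerank(
    model_path="rank_zephyr",  # listwise reranker model
    top_n=5,                   # keep the 5 best passages
    window_size=20,            # passages scored per sliding window
    stride=10,                 # advance the window 10 passages at a time
    variable_passages=True,    # tolerate windows smaller than window_size
)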

Retrieval with reranking (default RankLLM model is rank_zephyr):

from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
from langchain_community.document_compressors import RankLLMRerank

torch.cuda.empty_cache()  # free cached GPU memory before loading the reranker

compressor = RankLLMRerank(top_n=3, model_path="rank_zephyr")
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)

del compressor  # drops only the local name; the retriever still holds the compressor

compressed_docs = compression_retriever.invoke(query)
pretty_print_docs(compressed_docs)
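The reranking retriever can also feed a QA chain. A minimal sketch, assuming langchain-openai is installed and OPENAI_API_KEY is set; the chain type and model name are illustrative and not part of this PR:

from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4o-mini", temperature=0),
    retriever=compression_retriever,  # reranked retriever from above
)

print(chain.invoke({"query": query})["result"])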

@eyurtsev (Contributor) left a comment

Hi @clides, thanks for the contribution. At the moment, this code isn't tested or documented at a level where it can be merged into langchain-community.

As a rule of thumb, the amount of documentation should be similar to the amount of code, and there should likely be some examples of usage.

It looks like there are some breaking changes in this code? Is that correct?

@clides (Author) commented Jun 3, 2025

> It looks like there are some breaking changes in this code? Is that correct?

Yep, there are some changes in RankLLM (i.e. new params, changed function/class names, etc.).

@clides (Author) commented Jun 12, 2025

> As a rule of thumb, the amount of documentation should be similar to the amount of code, and there should likely be some examples of usage.

Added more comments as requested.
