docs: add MTEB evaluation guide and update usage.rst #3477

Open
wants to merge 4 commits into
base: master

sahibpreetsingh12
Contributor

This PR resolves #3332.

Summary

Adds a new documentation page for evaluating SentenceTransformer models using the Massive Text Embedding Benchmark (MTEB), along with relevant task examples and best practices.

Changes

  • mteb_evaluation.md in docs/sentence_transformer/usage/:

    • Installation steps
    • Minimal working example
    • Task-type breakdown (STS, Classification, Retrieval, etc.)
    • Notes on output handling
    • Warnings about not using MTEB during training
    • Leaderboard + export instructions
  • Linked from usage.rst to include in sidebar navigation

Notes

Following the guidance in the discussion, MTEB is documented as a post-training evaluation tool rather than integrated as an evaluator, to avoid benchmark overfitting.

Let me know if you'd like any section adjusted. Thank you!

@sahibpreetsingh12
Contributor Author

@tomaarsen Please share your feedback, and let me know if there is anything else I should change.

Contributor

@Samoed left a comment

@KennethEnevoldsen can you look too?

@sahibpreetsingh12
Contributor Author

sahibpreetsingh12 commented Jul 31, 2025

@Samoed and @tomaarsen
If anything else is required from my side, please do share.
Since I am new to this, what can I do in the future to make the unit tests run successfully? I committed these changes from the UI.
If this merges, I will pull the changes later.

@sahibpreetsingh12
Contributor Author

@Samoed What is required for this PR to merge?
I am happy to contribute.

Install MTEB and its dependencies:

```bash
pip install mteb
```

Contributor

We're now working on a v2 release that will be breaking.

Suggested change
- pip install mteb
+ pip install "mteb<2"
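
To illustrate the pin suggested above, a script could check at runtime whether the installed mteb is still below v2. This is only a minimal sketch using the standard library. The helper names `is_pre_v2` and `check_installed_mteb` are hypothetical, and reading the leading integer of the version string is a simplification, not full PEP 440 parsing.

```python
from importlib.metadata import PackageNotFoundError, version


def is_pre_v2(version_string: str) -> bool:
    """Return True when the major version is below 2 (e.g. '1.38.0').

    Simplification: only the leading integer is inspected, so '2.0.0a1'
    counts as v2, which matches how pip treats the "mteb<2" pin.
    """
    major = int(version_string.split(".")[0])
    return major < 2


def check_installed_mteb() -> str:
    """Describe whether the installed mteb (if any) satisfies the pin."""
    try:
        installed = version("mteb")
    except PackageNotFoundError:
        # Hypothetical message; nothing is installed or changed here.
        return "mteb is not installed"
    status = "satisfies" if is_pre_v2(installed) else "does not satisfy"
    return f"mteb {installed} {status} the mteb<2 pin"
```

Calling `print(check_installed_mteb())` at the top of an evaluation script would surface an accidental upgrade to v2 before any benchmark is run.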

from sentence_transformers import SentenceTransformer
from mteb import MTEB

model = SentenceTransformer("all-MiniLM-L6-v2")
Contributor

I'm not sure what the best approach is here, because this documentation is for SentenceTransformer, not MTEB. However, this method might be better since some models could have custom instructions or implementations coming from MTEB.

Suggested change
- model = SentenceTransformer("all-MiniLM-L6-v2")
+ model = mteb.get_model("sentence-transformers/all-MiniLM-L6-v2")
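
For context, the suggested `mteb.get_model` call could fit into a complete evaluation run roughly as follows. This is a hedged sketch that assumes `mteb<2` is installed. The wrapper name `evaluate_with_mteb` and the `"results"` output folder are hypothetical choices, not part of the PR, and the import is deferred so the function can be defined without mteb present.

```python
from typing import Any, Sequence


def evaluate_with_mteb(
    model_name: str,
    task_names: Sequence[str],
    output_folder: str = "results",
) -> Any:
    """Run the named MTEB tasks on a model loaded via mteb.get_model.

    Assumes the mteb 1.x API (mteb<2). Calling this downloads the model
    and the task datasets, so it needs network access.
    """
    import mteb  # deferred import: only required when actually evaluating

    # get_model picks up any MTEB-specific instructions or implementation
    # registered for the model, as noted in the review comment.
    model = mteb.get_model(model_name)
    tasks = mteb.get_tasks(tasks=list(task_names))
    evaluation = mteb.MTEB(tasks=tasks)
    # Result files are written under output_folder.
    return evaluation.run(model, output_folder=output_folder)
```

A call such as `evaluate_with_mteb("sentence-transformers/all-MiniLM-L6-v2", ["Banking77Classification"])` would evaluate the model from the diff on a single classification task. The task name is just one example from the benchmark.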

Development

Successfully merging this pull request may close these issues.

Add documentation for evaluating using MTEB
2 participants