docs: add MTEB evaluation guide and update usage.rst #3477
base: master
@tomaarsen Please share your feedback, and let me know if there is anything else I should change.
@KennethEnevoldsen can you look too?
Co-authored-by: Roman Solomatin <[email protected]>
@Samoed and @tomaarsen
@Samoed what is required for this PR to be merged?
Install MTEB and its dependencies:

```bash
pip install mteb
```
We're now working on the v2 release, which will be breaking.
Suggested change:
-pip install mteb
+pip install "mteb<2"
from sentence_transformers import SentenceTransformer
from mteb import MTEB

model = SentenceTransformer("all-MiniLM-L6-v2")
I'm not sure what the best approach is here, because this documentation is for `SentenceTransformer`, not `MTEB`. However, this method might be better since some models could have custom instructions or implementations coming from `MTEB`.
Suggested change:
-model = SentenceTransformer("all-MiniLM-L6-v2")
+model = mteb.get_model("sentence-transformers/all-MiniLM-L6-v2")
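For reference, a minimal sketch of how the suggested loader could fit into a complete evaluation call. It assumes the pre-v2 API (installed with `pip install "mteb<2"`) and network access to download the model. The helper name `evaluate_with_mteb`, the task `Banking77Classification`, and the `results` output folder are illustrative choices, not taken from this PR.

```python
def evaluate_with_mteb(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    task_names=("Banking77Classification",),  # assumed example task, not from the PR
    output_folder="results",  # assumed output location
):
    """Run the given MTEB tasks on a model and return the task results.

    Sketch only: written against the mteb<2 API, which is expected to change in v2.
    """
    # Imported lazily so that defining this helper does not require mteb.
    import mteb

    # mteb.get_model picks up any custom prompts or wrappers that MTEB
    # registers for the model, as discussed in the review comment above.
    model = mteb.get_model(model_name)

    # Resolve task names into task objects, then run the benchmark.
    tasks = mteb.get_tasks(tasks=list(task_names))
    evaluation = mteb.MTEB(tasks=tasks)
    return evaluation.run(model, output_folder=output_folder)
```

Calling `evaluate_with_mteb()` downloads the model and the task data on first use. If the guide keeps the plain `SentenceTransformer(...)` loader instead, the model object can be passed to `evaluation.run` in the same way.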
This PR resolves #3332.
Summary
Adds a new documentation page for evaluating SentenceTransformer models using the Massive Text Embedding Benchmark (MTEB), along with relevant task examples and best practices.
Changes
- `mteb_evaluation.md` in `docs/sentence_transformer/usage/`
- Linked from `usage.rst` to include it in the sidebar navigation

Notes
Following the guidance in the discussion, MTEB is documented as a post-training evaluation tool rather than integrated as an evaluator, to avoid benchmark overfitting.
Let me know if you'd like any section adjusted. Thank you!