
Add ONNX export optimization support for ModernBERT #2177


Open
amas0 opened this issue Feb 3, 2025 · 5 comments · May be fixed by #2208

amas0 commented Feb 3, 2025

Feature request

Release v1.24.0 successfully supports exporting a ModernBERT model to ONNX; however, this support does not extend to enabling optimizations via the --optimize flag in optimum-cli.
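
For reference, a minimal sketch of how the same limitation surfaces through the Python optimization API (the checkpoint name and save directory here are illustrative):

```python
from optimum.onnxruntime import ORTModelForFeatureExtraction, ORTOptimizer
from optimum.onnxruntime.configuration import OptimizationConfig

# Plain ONNX export works on v1.24.0...
model = ORTModelForFeatureExtraction.from_pretrained(
    "answerdotai/ModernBERT-base", export=True
)

# ...but requesting graph optimizations fails with an unsupported-model-type
# error, because "modernbert" is missing from ORTConfigManager._conf.
optimizer = ORTOptimizer.from_pretrained(model)
optimizer.optimize(
    save_dir="modernbert_onnx_o2",
    optimization_config=OptimizationConfig(optimization_level=2),
)
```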

I'm not sure how much needs to go into enabling this in a more formal capacity, but in a very brief local attempt I simply added modernbert to the following two mappings:

```python
class ORTConfigManager:
    """
    A class that contains all the information needed by ONNX Runtime optimization for a given model type.

    Attributes:
        _conf (`Dict[str]`):
            A dictionary mapping each supported model type to the corresponding ONNX Runtime model type.
    """

    # Contribution note: Please add new models in alphabetical order
    # TODO: for encoder-decoder models, validate if bert or gpt2 optimization is better
    _conf = {
        "albert": "bert",
        "bart": "bart",
        "bert": "bert",
        "big-bird": "bert",
        "bigbird-pegasus": "bart",
        "blenderbot": "bert",
        "bloom": "gpt2",
        "camembert": "bert",
        "codegen": "gpt2",
        "deberta": "bert",
        "deberta-v2": "bert",
        "distilbert": "bert",
        "electra": "bert",
        "gpt2": "gpt2",
        "gpt-bigcode": "gpt2",
        "gpt-neo": "gpt2",
        "gpt-neox": "gpt2",
        "gptj": "gpt2",
        "granite": "gpt2",
        "longt5": "bert",
        "llama": "gpt2",
        "marian": "bart",
        "mbart": "bart",
        "mistral": "gpt2",
        "mpnet": "bert",
        "mt5": "bart",
        "m2m-100": "bart",
        "nystromformer": "bert",
        "pegasus": "bert",
        "roberta": "bert",
        "segformer": "vit",
        "t5": "bert",
        "vit": "vit",
        "whisper": "bart",
        "xlm-roberta": "bert",
        "pix2struct": "vit",
    }
```

and

```python
_conf = {
    "albert": NormalizedTextConfig,
    "bart": BartLikeNormalizedTextConfig,
    "bert": NormalizedTextConfig,
    "big-bird": NormalizedTextConfig,
    "bigbird-pegasus": BartLikeNormalizedTextConfig,
    "blenderbot": BartLikeNormalizedTextConfig,
    "blenderbot-small": BartLikeNormalizedTextConfig,
    "bloom": NormalizedTextConfig.with_args(num_layers="n_layer"),
    "falcon": NormalizedTextConfig,
    "camembert": NormalizedTextConfig,
    "codegen": GPT2LikeNormalizedTextConfig,
    "cvt": NormalizedVisionConfig,
    "deberta": NormalizedTextConfig,
    "deberta-v2": NormalizedTextConfig,
    "deit": NormalizedVisionConfig,
    "distilbert": NormalizedTextConfig.with_args(num_attention_heads="n_heads", hidden_size="dim"),
    "donut-swin": NormalizedVisionConfig,
    "electra": NormalizedTextConfig,
    "encoder-decoder": NormalizedEncoderDecoderConfig,
    "gemma": NormalizedTextConfigWithGQA,
    "gpt2": GPT2LikeNormalizedTextConfig,
    "gpt-bigcode": GPTBigCodeNormalizedTextConfig,
    "gpt-neo": NormalizedTextConfig.with_args(num_attention_heads="num_heads"),
    "gpt-neox": NormalizedTextConfig,
    "gptj": GPT2LikeNormalizedTextConfig,
    "imagegpt": GPT2LikeNormalizedTextConfig,
    "llama": NormalizedTextConfigWithGQA,
    "longt5": T5LikeNormalizedTextConfig,
    "marian": BartLikeNormalizedTextConfig,
    "markuplm": NormalizedTextConfig,
    "mbart": BartLikeNormalizedTextConfig,
    "mistral": NormalizedTextConfigWithGQA,
    "mixtral": NormalizedTextConfigWithGQA,
    "mpnet": NormalizedTextConfig,
    "mpt": MPTNormalizedTextConfig,
    "mt5": T5LikeNormalizedTextConfig,
    "m2m-100": BartLikeNormalizedTextConfig,
    "nystromformer": NormalizedTextConfig,
    "opt": NormalizedTextConfig,
    "pegasus": BartLikeNormalizedTextConfig,
    "pix2struct": Pix2StructNormalizedTextConfig,
    "phi": NormalizedTextConfig,
    "phi3": NormalizedTextConfigWithGQA,
    "phi3small": NormalizedTextConfigWithGQA,
    "poolformer": NormalizedVisionConfig,
    "regnet": NormalizedVisionConfig,
    "resnet": NormalizedVisionConfig,
    "roberta": NormalizedTextConfig,
    "segformer": NormalizedSegformerConfig,
    "speech-to-text": SpeechToTextLikeNormalizedTextConfig,
    "splinter": NormalizedTextConfig,
    "t5": T5LikeNormalizedTextConfig,
    "trocr": TrOCRLikeNormalizedTextConfig,
    "vision-encoder-decoder": NormalizedEncoderDecoderConfig,
    "vit": NormalizedVisionConfig,
    "whisper": WhisperLikeNormalizedTextConfig,
    "xlm-roberta": NormalizedTextConfig,
    "yolos": NormalizedVisionConfig,
    "qwen2": NormalizedTextConfig,
    "granite": NormalizedTextConfigWithGQA,
}
```

With "modernbert" mapped to "bert" in the former and to NormalizedTextConfig in the latter, this seemed to allow me to export the model with optimizations. In my brief testing after that, I didn't notice any glaring issues with the output and observed some of the expected speedups.
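
A minimal sketch of this kind of output comparison (assuming a patched build and an already-exported optimized model; the checkpoint name and the modernbert_onnx_o2 directory are illustrative):

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from optimum.onnxruntime import ORTModelForFeatureExtraction

model_id = "answerdotai/ModernBERT-base"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("ONNX optimization sanity check", return_tensors="pt")

# Reference hidden states from the PyTorch model
with torch.no_grad():
    ref = AutoModel.from_pretrained(model_id)(**inputs).last_hidden_state.numpy()

# Hidden states from the optimized ONNX export
ort_model = ORTModelForFeatureExtraction.from_pretrained("modernbert_onnx_o2")
opt = ort_model(**inputs).last_hidden_state.numpy()

# The two should agree up to small numerical differences from the fused kernels
print("max abs diff:", np.abs(ref - opt).max())
```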

Motivation

I would like to export an optimized ONNX version of my ModernBERT model.

Your contribution

I'd be happy to submit a PR if given more information on how this support is typically added.


xenova commented Mar 5, 2025

Great! Please open a PR :)

amas0 linked a pull request Mar 6, 2025 that will close this issue

github-actions bot commented Apr 5, 2025

This issue has been marked as stale because it has been open for 30 days with no activity. This thread will be automatically closed in 5 days if no further activity occurs.

github-actions bot added the Stale label Apr 5, 2025

amas0 commented Apr 5, 2025

Pending PR review -- commenting to keep this open

github-actions bot removed the Stale label Apr 6, 2025

github-actions bot commented May 6, 2025

This issue has been marked as stale because it has been open for 30 days with no activity. This thread will be automatically closed in 5 days if no further activity occurs.

github-actions bot added the Stale label May 6, 2025

amas0 commented May 6, 2025

Commenting to keep open -- awaiting PR review after updates.

github-actions bot removed the Stale label May 7, 2025