
Commit d8424cb

Add support for image classification (#226)
- image-classification pipeline
- CV models can be used for quantization/optimization
1 parent d501248 commit d8424cb
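
The commit's headline feature, sketched end to end: running image classification through an ONNX Runtime-backed pipeline. The checkpoint `google/vit-base-patch16-224`, the image URL, and the `optimum.pipelines.pipeline` entry point are illustrative assumptions, not taken from this diff.

```python
from transformers import AutoFeatureExtractor
from optimum.onnxruntime import ORTModelForImageClassification
from optimum.pipelines import pipeline  # assumed entry point for the docs' `pipeline` function

model_id = "google/vit-base-patch16-224"  # illustrative vision checkpoint

# Export the vanilla Transformers model to ONNX on the fly, as the docs describe for text models.
model = ORTModelForImageClassification.from_pretrained(model_id, from_transformers=True)
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)

# The new `image-classification` task accepts image URLs, file paths, or PIL images.
onnx_classifier = pipeline("image-classification", model=model, feature_extractor=feature_extractor)
print(onnx_classifier("http://images.cocodataset.org/val2017/000000039769.jpg"))  # illustrative image URL
```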

12 files changed: +325 −130 lines


docs/source/onnxruntime/modeling_ort.mdx

Lines changed: 9 additions & 5 deletions
@@ -12,13 +12,13 @@ specific language governing permissions and limitations under the License.
 
 # Optimum Inference with ONNX Runtime
 
-Optimum is a utility package for building and running inference with accelerated runtime like ONNX Runtime.
-Optimum can be used to load optimized models from the [Hugging Face Hub](hf.co/models) and create pipelines
+Optimum is a utility package for building and running inference with accelerated runtime like ONNX Runtime.
+Optimum can be used to load optimized models from the [Hugging Face Hub](hf.co/models) and create pipelines
 to run accelerated inference without rewriting your APIs.
 
 ## Switching from Transformers to Optimum Inference
 
-The Optimum Inference models are API compatible with Hugging Face Transformers models. This means you can just replace your `AutoModelForXxx` class with the corresponding `ORTModelForXxx` class in `optimum`. For example, this is how you can use a question answering model in `optimum`:
+The Optimum Inference models are API compatible with Hugging Face Transformers models. This means you can just replace your `AutoModelForXxx` class with the corresponding `ORTModelForXxx` class in `optimum`. For example, this is how you can use a question answering model in `optimum`:
 
 ```diff
 from transformers import AutoTokenizer, pipeline
@@ -57,8 +57,8 @@ You can find a complete walkhrough Optimum Inference for ONNX Runtime in this [n
 
 ### Working with the Hugging Face Model Hub
 
-The Optimum model classes like [`~onnxruntime.ORTModelForSequenceClassification`] are integrated with the [Hugging Face Model Hub](https://hf.co/models), which means you can not only
-load model from the Hub, but also push your models to the Hub with `push_to_hub()` method. Below is an example which downloads a vanilla Transformers model
+The Optimum model classes like [`~onnxruntime.ORTModelForSequenceClassification`] are integrated with the [Hugging Face Model Hub](https://hf.co/models), which means you can not only
+load model from the Hub, but also push your models to the Hub with `push_to_hub()` method. Below is an example which downloads a vanilla Transformers model
 from the Hub and converts it to an optimum onnxruntime model and pushes it back into a new repository.
 
 <!-- TODO: Add Quantizer into example when UX improved -->
@@ -105,3 +105,7 @@ from the Hub and converts it to an optimum onnxruntime model and pushes it back
 
 [[autodoc]] onnxruntime.modeling_ort.ORTModelForCausalLM
 
+## ORTModelForImageClassification
+
+[[autodoc]] onnxruntime.modeling_ort.ORTModelForImageClassification
+
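
Building on the Hub workflow quoted above and the new `ORTModelForImageClassification` entry, a hedged sketch of converting a vanilla Transformers vision checkpoint to ONNX and saving it; the model id and output directory are illustrative, not from the diff.

```python
from transformers import AutoFeatureExtractor
from optimum.onnxruntime import ORTModelForImageClassification

model_id = "google/vit-base-patch16-224"  # illustrative checkpoint
save_directory = "vit-base-onnx"          # illustrative output directory

# Download the vanilla Transformers model and convert it to ONNX.
model = ORTModelForImageClassification.from_pretrained(model_id, from_transformers=True)
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)

# Save the ONNX model and preprocessor locally; push_to_hub() could then
# publish them to a new repository, as the docs above describe for text models.
model.save_pretrained(save_directory)
feature_extractor.save_pretrained(save_directory)
```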

docs/source/pipelines.mdx

Lines changed: 16 additions & 16 deletions
@@ -12,8 +12,7 @@ specific language governing permissions and limitations under the License.
 
 # Optimum pipelines for inference
 
-The [`~pipelines.pipeline`] function makes it simple to use models from the [Model Hub](https://huggingface.co/models) for accelerated inference on a variety of tasks such as text classification.
-Even if you don't have experience with a specific modality or understand the code powering the models, you can still use them with the [`~pipelines.pipeline`] function!
+The [`~pipelines.pipeline`] function makes it simple to use models from the [Model Hub](https://huggingface.co/models) for accelerated inference on a variety of tasks such as text classification, question answering and image classification.
 
 <Tip>
 
@@ -31,11 +30,12 @@ Currenlty supported tasks are:
 * `question-answering`
 * `zero-shot-classification`
 * `text-generation`
+* `image-classification`
 
 ## Optimum pipeline usage
 
-While each task has an associated pipeline class, it is simpler to use the general [`~pipelines.pipeline`] function which wraps all the task-specific pipelines in one object.
-The [`~pipelines.pipeline`] function automatically loads a default model and tokenizer capable of inference for your task.
+While each task has an associated pipeline class, it is simpler to use the general [`~pipelines.pipeline`] function which wraps all the task-specific pipelines in one object.
+The [`~pipelines.pipeline`] function automatically loads a default model and tokenizer/feature-extractor capable of inference for your task.
 
 1. Start by creating a pipeline by specifying an inference task:
 
@@ -46,7 +46,7 @@ The [`~pipelines.pipeline`] function automatically loads a default model and tok
 
 ```
 
-2. Pass your input text to the [`~pipelines.pipeline`] function:
+2. Pass your input text/image to the [`~pipelines.pipeline`] function:
 
 ```python
 >>> classifier("I like you. I love you.")
@@ -57,9 +57,9 @@ _Note: The default models used in the [`~pipelines.pipeline`] function are not o
 
 ### Using vanilla Transformers model and converting to ONNX
 
-The [`~pipelines.pipeline`] function accepts any supported model from the [Model Hub](https://huggingface.co/models).
-There are tags on the Model Hub that allow you to filter for a model you'd like to use for your task.
-Once you've picked an appropriate model, load it with the `from_pretrained("{model_id}",from_transformers=True)` method associated with the `ORTModelFor*`
+The [`~pipelines.pipeline`] function accepts any supported model from the [Model Hub](https://huggingface.co/models).
+There are tags on the Model Hub that allow you to filter for a model you'd like to use for your task.
+Once you've picked an appropriate model, load it with the `from_pretrained("{model_id}",from_transformers=True)` method associated with the `ORTModelFor*`
 `AutoTokenizer' class. For example, here's how you can load the [`~onnxruntime.ORTModelForQuestionAnswering`] class for question answering:
 
 ```python
@@ -80,10 +80,10 @@ Once you've picked an appropriate model, load it with the `from_pretrained("{mod
 
 ### Using Optimum models
 
-The [`~pipelines.pipeline`] function is tightly integrated with [Model Hub](https://huggingface.co/models) and can load optimized models directly, e.g. those created with ONNX Runtime.
-There are tags on the Model Hub that allow you to filter for a model you'd like to use for your task.
+The [`~pipelines.pipeline`] function is tightly integrated with [Model Hub](https://huggingface.co/models) and can load optimized models directly, e.g. those created with ONNX Runtime.
+There are tags on the Model Hub that allow you to filter for a model you'd like to use for your task.
 Once you've picked an appropriate model, load it with the `from_pretrained()` method associated with the corresponding `ORTModelFor*`
-and `AutoTokenizer' class. For example, here's how you can load an optimized model for question answering:
+and `AutoTokenizer'/`AutoFeatureExtractor` class. For example, here's how you can load an optimized model for question answering:
 
 ```python
 >>> from transformers import AutoTokenizer
@@ -132,7 +132,7 @@ Below you can find two examples on how you could [`~onnxruntime.ORTOptimizer`] a
 onnx_quantized_model_output_path=save_path / "model-quantized.onnx",
 quantization_config=qconfig,
 )
->>> quantizer.model.config.save_pretrained(save_path) # saves config.json
+>>> quantizer.model.config.save_pretrained(save_path) # saves config.json
 
 # load optimized model from local path or repository
 >>> model = ORTModelForSequenceClassification.from_pretrained(save_path,file_name="model-quantized.onnx")
@@ -176,7 +176,7 @@ Below you can find two examples on how you could [`~onnxruntime.ORTOptimizer`] a
 onnx_optimized_model_output_path=save_path / "model-optimized.onnx",
 optimization_config=optimization_config,
 )
->>> optimizer.model.config.save_pretrained(save_path) # saves config.json
+>>> optimizer.model.config.save_pretrained(save_path) # saves config.json
 
 # load optimized model from local path or repository
 >>> model = ORTModelForSequenceClassification.from_pretrained(save_path,file_name="model-optimized.onnx")
@@ -198,16 +198,16 @@ Below you can find two examples on how you could [`~onnxruntime.ORTOptimizer`] a
 ## Transformers pipeline usage
 
 The [`~pipelines.pipeline`] function is just a light wrapper around the `transformers.pipeline` function to enable checks for supported tasks and additional features
-, like quantization and optimization. This being said you can use the `transformers.pipeline` and just replace your `AutoFor*` with the optimum
-`ORTModelFor*` class.
+, like quantization and optimization. This being said you can use the `transformers.pipeline` and just replace your `AutoModelFor*` with the optimum
+`ORTModelFor*` class.
 
 ```diff
 from transformers import AutoTokenizer, pipeline
 -from transformers import AutoModelForQuestionAnswering
 +from optimum.onnxruntime import ORTModelForQuestionAnswering
 
 -model = AutoModelForQuestionAnswering.from_pretrained("deepset/roberta-base-squad2")
-+model = ORTModelForQuestionAnswering.from_transformers("optimum/roberta-base-squad2")
++model = ORTModelForQuestionAnswering.from_pretrained("optimum/roberta-base-squad2")
 tokenizer = AutoTokenizer.from_pretrained("deepset/roberta-base-squad2")
 
 onnx_qa = pipeline("question-answering",model=model,tokenizer=tokenizer)
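
The `AutoModelFor*` → `ORTModelFor*` substitution shown in the question-answering diff should carry over to the new vision task; a hedged sketch using `transformers.pipeline` directly, with an illustrative checkpoint and image URL.

```python
from transformers import AutoFeatureExtractor, pipeline
from optimum.onnxruntime import ORTModelForImageClassification

model_id = "google/vit-base-patch16-224"  # illustrative checkpoint

# ORTModelForImageClassification stands in for AutoModelForImageClassification,
# mirroring the AutoModelFor* -> ORTModelFor* swap in the diff above.
model = ORTModelForImageClassification.from_pretrained(model_id, from_transformers=True)
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)

onnx_classifier = pipeline("image-classification", model=model, feature_extractor=feature_extractor)
print(onnx_classifier("http://images.cocodataset.org/val2017/000000039769.jpg"))  # illustrative image URL
```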

optimum/onnxruntime/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -53,6 +53,7 @@ class ORTQuantizableOperator(Enum):
 from .modeling_ort import (
     ORTModelForCausalLM,
     ORTModelForFeatureExtraction,
+    ORTModelForImageClassification,
     ORTModelForQuestionAnswering,
     ORTModelForSequenceClassification,
     ORTModelForTokenClassification,
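
With this export in place, the new class is reachable from the package's top-level `onnxruntime` namespace; a minimal import check:

```python
# Confirms the class added by this commit is re-exported at the package level.
from optimum.onnxruntime import ORTModelForImageClassification

print(ORTModelForImageClassification)
```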

optimum/onnxruntime/configuration.py

Lines changed: 0 additions & 1 deletion
@@ -20,7 +20,6 @@
 from datasets import Dataset
 from packaging.version import Version, parse
 
-from onnxruntime import GraphOptimizationLevel
 from onnxruntime import __version__ as ort_version
 from onnxruntime.quantization import CalibraterBase, CalibrationMethod, QuantFormat, QuantizationMode, QuantType
 from onnxruntime.quantization.calibrate import create_calibrator
