# Quantization

### ORTQuantizer

#### class optimum.onnxruntime.ORTQuantizer

[\<source>](https://github.com/huggingface/optimum/blob/main/optimum/onnxruntime/quantization.py#L83)

( onnx\_model\_path: Path, config: typing.Optional\[ForwardRef('PretrainedConfig')] = None )

Handles the ONNX Runtime quantization process for models shared on boincai.com/models.

**compute\_ranges**

[\<source>](https://github.com/huggingface/optimum/blob/main/optimum/onnxruntime/quantization.py#L261)

( )

Computes the quantization ranges.

**fit**

[\<source>](https://github.com/huggingface/optimum/blob/main/optimum/onnxruntime/quantization.py#L157)

( dataset: Dataset, calibration\_config: CalibrationConfig, onnx\_augmented\_model\_name: typing.Union\[str, pathlib.Path] = 'augmented\_model.onnx', operators\_to\_quantize: typing.Optional\[typing.List\[str]] = None, batch\_size: int = 1, use\_external\_data\_format: bool = False, use\_gpu: bool = False, force\_symmetric\_range: bool = False )

Parameters

* **dataset** (`Dataset`) — The dataset to use when performing the calibration step.
* **calibration\_config** (`CalibrationConfig`) — The configuration containing the parameters related to the calibration step.
* **onnx\_augmented\_model\_name** (`Union[str, Path]`, defaults to `"augmented_model.onnx"`) — The path used to save the augmented model used to collect the quantization ranges.
* **operators\_to\_quantize** (`Optional[List[str]]`, defaults to `None`) — List of the operator types to quantize.
* **batch\_size** (`int`, defaults to 1) — The batch size to use when collecting the quantization ranges values.
* **use\_external\_data\_format** (`bool`, defaults to `False`) — Whether to use the external data format to store a model whose size is >= 2 GB.
* **use\_gpu** (`bool`, defaults to `False`) — Whether to use the GPU when collecting the quantization ranges values.
* **force\_symmetric\_range** (`bool`, defaults to `False`) — Whether to make the quantization ranges symmetric.

Performs the calibration step and computes the quantization ranges.
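To make "quantization ranges" concrete, here is a minimal pure-Python sketch of min-max calibration followed by an optional symmetric widening. It is an illustration under stated assumptions, not Optimum's implementation: the helper names `minmax_ranges` and `make_symmetric` and the toy `logits` tensor are hypothetical, and real calibrators may use other statistics (for example entropy or percentiles).

```python
from typing import Dict, Iterable, List, Tuple

Range = Tuple[float, float]


def minmax_ranges(batches: Iterable[Dict[str, List[float]]]) -> Dict[str, Range]:
    """Track the smallest and largest value seen for each tensor name."""
    ranges: Dict[str, Range] = {}
    for batch in batches:
        for name, values in batch.items():
            lo, hi = min(values), max(values)
            if name in ranges:
                # Merge with the range collected from earlier batches.
                lo, hi = min(lo, ranges[name][0]), max(hi, ranges[name][1])
            ranges[name] = (lo, hi)
    return ranges


def make_symmetric(ranges: Dict[str, Range]) -> Dict[str, Range]:
    """Widen each range so that it is centered on zero (assumed meaning of a symmetric range)."""
    symmetric: Dict[str, Range] = {}
    for name, (lo, hi) in ranges.items():
        bound = max(abs(lo), abs(hi))
        symmetric[name] = (-bound, bound)
    return symmetric


# Toy activations standing in for values observed while running the model.
batches = [
    {"logits": [-0.5, 2.0, 1.0]},
    {"logits": [-1.5, 0.25]},
]
ranges = minmax_ranges(batches)
print(ranges)                  # {'logits': (-1.5, 2.0)}
print(make_symmetric(ranges))  # {'logits': (-2.0, 2.0)}
```

The resulting dictionary has the same shape as the `Dict[str, Tuple[float, float]]` that `quantize` accepts as `calibration_tensors_range`.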

**from\_pretrained**

[\<source>](https://github.com/huggingface/optimum/blob/main/optimum/onnxruntime/quantization.py#L110)

( model\_or\_path: typing.Union\[ForwardRef('ORTModel'), str, pathlib.Path], file\_name: typing.Optional\[str] = None )

Parameters

* **model\_or\_path** (`Union[ORTModel, str, Path]`) — Can be either:
  * A path to a saved exported ONNX Intermediate Representation (IR) model, e.g., `./my_model_directory/`.
  * Or an `ORTModelForXX` class, e.g., `ORTModelForQuestionAnswering`.
* **file\_name** (`Optional[str]`, defaults to `None`) — Overwrites the default model file name from `"model.onnx"` to `file_name`. This allows you to load different model files from the same repository or directory.

Instantiates an `ORTQuantizer` from an ONNX model file or an `ORTModel`.
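As a rough sketch of how `file_name` relates to the default, the snippet below pairs a hypothetical helper, `resolve_model_file`, with a thin wrapper around `ORTQuantizer.from_pretrained`. Both function names are illustrative and not part of Optimum. The helper only mirrors the documented default of `"model.onnx"`; it does not reproduce the library's actual file lookup.

```python
from pathlib import Path
from typing import Optional


def resolve_model_file(model_dir: str, file_name: Optional[str] = None) -> Path:
    """Hypothetical helper: the file the documented default would point at."""
    return Path(model_dir) / (file_name or "model.onnx")


def load_quantizer(model_dir: str, file_name: Optional[str] = None):
    """Build an ORTQuantizer from a directory holding an exported ONNX model."""
    # Imported lazily so this module loads even where Optimum is not installed.
    from optimum.onnxruntime import ORTQuantizer

    return ORTQuantizer.from_pretrained(model_dir, file_name=file_name)


print(resolve_model_file("./my_model_directory").name)  # model.onnx
```

Calling `load_quantizer("./my_model_directory", file_name="decoder_model.onnx")` would select a specific file when a directory holds several ONNX files; the file name here is only an example.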

**get\_calibration\_dataset**

[\<source>](https://github.com/huggingface/optimum/blob/main/optimum/onnxruntime/quantization.py#L435)

( dataset\_name: str, num\_samples: int = 100, dataset\_config\_name: typing.Optional\[str] = None, dataset\_split: typing.Optional\[str] = None, preprocess\_function: typing.Optional\[typing.Callable] = None, preprocess\_batch: bool = True, seed: int = 2016, use\_auth\_token: bool = False )

Parameters

* **dataset\_name** (`str`) — The dataset repository name on the BOINC AI Hub, or the path to a local directory containing the data files to load for the calibration step.
* **num\_samples** (`int`, defaults to 100) — The maximum number of samples composing the calibration dataset.
* **dataset\_config\_name** (`Optional[str]`, defaults to `None`) — The name of the dataset configuration.
* **dataset\_split** (`Optional[str]`, defaults to `None`) — Which split of the dataset to use to perform the calibration step.
* **preprocess\_function** (`Optional[Callable]`, defaults to `None`) — Processing function to apply to each example after loading the dataset.
* **preprocess\_batch** (`bool`, defaults to `True`) — Whether the `preprocess_function` should be batched.
* **seed** (`int`, defaults to 2016) — The random seed to use when shuffling the calibration dataset.
* **use\_auth\_token** (`bool`, defaults to `False`) — Whether to use the token generated when running `transformers-cli login` (necessary for some datasets like ImageNet).

Creates the calibration `datasets.Dataset` to use for the post-training static quantization calibration step.
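The interplay of `num_samples`, `seed`, and `preprocess_function` can be sketched in plain Python. This is an assumption-laden illustration of the parameter semantics described above, not the `datasets`-based implementation: `sample_calibration_examples` and `add_length` are hypothetical names, and preprocessing is applied one example at a time (the analogue of `preprocess_batch=False`).

```python
import random
from typing import Callable, List, Optional


def sample_calibration_examples(
    examples: List[dict],
    num_samples: int = 100,
    preprocess_function: Optional[Callable[[dict], dict]] = None,
    seed: int = 2016,
) -> List[dict]:
    """Shuffle with a fixed seed, keep at most num_samples, then preprocess."""
    shuffled = list(examples)              # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # same seed gives the same order
    selected = shuffled[:num_samples]      # num_samples is an upper bound
    if preprocess_function is not None:
        selected = [preprocess_function(example) for example in selected]
    return selected


def add_length(example: dict) -> dict:
    """Toy preprocessing step standing in for tokenization."""
    return {**example, "length": len(example["text"])}


corpus = [{"text": f"sentence {i}"} for i in range(1000)]
first = sample_calibration_examples(corpus, num_samples=50, preprocess_function=add_length)
second = sample_calibration_examples(corpus, num_samples=50, preprocess_function=add_length)
print(len(first), first == second)  # 50 True
```

Fixing the seed keeps the calibration subset reproducible across runs, which makes the collected ranges comparable between experiments.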

**partial\_fit**

[\<source>](https://github.com/huggingface/optimum/blob/main/optimum/onnxruntime/quantization.py#L211)

( dataset: Dataset, calibration\_config: CalibrationConfig, onnx\_augmented\_model\_name: typing.Union\[str, pathlib.Path] = 'augmented\_model.onnx', operators\_to\_quantize: typing.Optional\[typing.List\[str]] = None, batch\_size: int = 1, use\_external\_data\_format: bool = False, use\_gpu: bool = False, force\_symmetric\_range: bool = False )

Parameters

* **dataset** (`Dataset`) — The dataset to use when performing the calibration step.
* **calibration\_config** (`CalibrationConfig`) — The configuration containing the parameters related to the calibration step.
* **onnx\_augmented\_model\_name** (`Union[str, Path]`, defaults to `"augmented_model.onnx"`) — The path used to save the augmented model used to collect the quantization ranges.
* **operators\_to\_quantize** (`Optional[List[str]]`, defaults to `None`) — List of the operator types to quantize.
* **batch\_size** (`int`, defaults to 1) — The batch size to use when collecting the quantization ranges values.
* **use\_external\_data\_format** (`bool`, defaults to `False`) — Whether to use the external data format to store a model whose size is >= 2 GB.
* **use\_gpu** (`bool`, defaults to `False`) — Whether to use the GPU when collecting the quantization ranges values.
* **force\_symmetric\_range** (`bool`, defaults to `False`) — Whether to make the quantization ranges symmetric.

Performs the calibration step and collects the quantization ranges without computing them.
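The split between collecting and computing can be pictured with a small stand-in class. This is a sketch under assumptions rather than the library's calibrator: `RangeCollector` is a hypothetical name, it keeps raw values in memory, and it derives plain min-max ranges, whereas a real calibrator may store histograms or other statistics.

```python
from typing import Dict, List, Tuple


class RangeCollector:
    """Hypothetical stand-in that separates data collection from range computation."""

    def __init__(self) -> None:
        self._observed: Dict[str, List[float]] = {}

    def partial_fit(self, batch: Dict[str, List[float]]) -> None:
        """Record values from one shard; no range is produced yet."""
        for name, values in batch.items():
            self._observed.setdefault(name, []).extend(values)

    def compute_ranges(self) -> Dict[str, Tuple[float, float]]:
        """Turn everything collected so far into one (min, max) pair per tensor."""
        return {name: (min(v), max(v)) for name, v in self._observed.items()}


collector = RangeCollector()
shards = [{"hidden": [0.1, 0.9]}, {"hidden": [-0.4, 0.3]}]
for shard in shards:
    collector.partial_fit(shard)

print(collector.compute_ranges())  # {'hidden': (-0.4, 0.9)}
```

The analogous flow with `ORTQuantizer` is to call `partial_fit` once per dataset shard and then call `compute_ranges` a single time, which keeps memory use bounded when the calibration set is large.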

**quantize**

[\<source>](https://github.com/huggingface/optimum/blob/main/optimum/onnxruntime/quantization.py#L280)

( quantization\_config: QuantizationConfig, save\_dir: typing.Union\[str, pathlib.Path], file\_suffix: typing.Optional\[str] = 'quantized', calibration\_tensors\_range: typing.Union\[typing.Dict\[str, typing.Tuple\[float, float]], NoneType] = None, use\_external\_data\_format: bool = False, preprocessor: typing.Optional\[optimum.onnxruntime.preprocessors.quantization.QuantizationPreprocessor] = None )

Parameters

* **quantization\_config** (`QuantizationConfig`) — The configuration containing the parameters related to quantization.
* **save\_dir** (`Union[str, Path]`) — The directory where the quantized model should be saved.
* **file\_suffix** (`Optional[str]`, defaults to `"quantized"`) — The suffix used in the file name of the saved quantized model.
* **calibration\_tensors\_range** (`Optional[Dict[str, Tuple[float, float]]]`, defaults to `None`) — The dictionary mapping node names to their quantization ranges, used and required only when applying static quantization.
* **use\_external\_data\_format** (`bool`, defaults to `False`) — Whether to use the external data format to store a model whose size is >= 2 GB.
* **preprocessor** (`Optional[QuantizationPreprocessor]`, defaults to `None`) — The preprocessor to use to collect the nodes to include or exclude from quantization.

Quantizes a model given the optimization specifications defined in `quantization_config`.
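The rule that `calibration_tensors_range` is required only for static quantization can be sketched as a small wrapper. The function names `check_quantize_arguments` and `quantize_to` are hypothetical, and the choice of `AutoQuantizationConfig.avx512_vnni` with `per_channel=False` is an arbitrary example target, not a recommendation; the check shown here is written for illustration and is not claimed to match the library's own validation.

```python
from typing import Dict, Optional, Tuple

Ranges = Dict[str, Tuple[float, float]]


def check_quantize_arguments(is_static: bool, ranges: Optional[Ranges]) -> None:
    """Static quantization needs calibration ranges; dynamic quantization does not."""
    if is_static and ranges is None:
        raise ValueError("static quantization requires calibration_tensors_range")


def quantize_to(quantizer, save_dir: str, is_static: bool = False, ranges: Optional[Ranges] = None):
    """Quantize with an example AVX512-VNNI configuration and return quantize()'s result."""
    check_quantize_arguments(is_static, ranges)
    # Imported lazily so this module loads even where Optimum is not installed.
    from optimum.onnxruntime.configuration import AutoQuantizationConfig

    qconfig = AutoQuantizationConfig.avx512_vnni(is_static=is_static, per_channel=False)
    return quantizer.quantize(
        quantization_config=qconfig,
        save_dir=save_dir,
        calibration_tensors_range=ranges,
    )


# Dynamic quantization passes the check without ranges.
check_quantize_arguments(is_static=False, ranges=None)
```

For dynamic quantization, `quantize_to(quantizer, "./quantized_model")` is enough; for static quantization, pass the ranges returned by `fit` or `compute_ranges`.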

