# Image Processor

An image processor prepares input features for vision models and postprocesses their outputs. This includes transformations such as resizing, normalization, and conversion to PyTorch, TensorFlow, Flax, and NumPy tensors. It may also include model-specific postprocessing, such as converting logits to segmentation masks.

### ImageProcessingMixin

#### class transformers.ImageProcessingMixin

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/image_processing_utils.py#L68)

( \*\*kwargs )

This is an image processor mixin used to provide saving/loading functionality for sequential and image feature extractors.

**from\_pretrained**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/image_processing_utils.py#L92)

( pretrained\_model\_name\_or\_path: typing.Union\[str, os.PathLike], cache\_dir: typing.Union\[str, os.PathLike, NoneType] = None, force\_download: bool = False, local\_files\_only: bool = False, token: typing.Union\[bool, str, NoneType] = None, revision: str = 'main', \*\*kwargs )

Parameters

* **pretrained\_model\_name\_or\_path** (`str` or `os.PathLike`) — This can be either:
  * a string, the *model id* of a pretrained image processor hosted inside a model repo on boincai.com. Valid model ids can be located at the root level, like `bert-base-uncased`, or namespaced under a user or organization name, like `dbmdz/bert-base-german-cased`.
  * a path to a *directory* containing an image processor file saved using the [save\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/image_processor#transformers.ImageProcessingMixin.save_pretrained) method, e.g., `./my_model_directory/`.
  * a path or URL to a saved image processor JSON *file*, e.g., `./my_model_directory/preprocessor_config.json`.
* **cache\_dir** (`str` or `os.PathLike`, *optional*) — Path to a directory in which a downloaded pretrained model image processor should be cached if the standard cache should not be used.
* **force\_download** (`bool`, *optional*, defaults to `False`) — Whether or not to force a (re-)download of the image processor files, overriding the cached versions if they exist.
* **resume\_download** (`bool`, *optional*, defaults to `False`) — Whether or not to resume an incomplete download instead of deleting the partially received file. If `True`, the download is resumed when such a file exists.
* **proxies** (`Dict[str, str]`, *optional*) — A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
* **token** (`str` or `bool`, *optional*) — The token to use as HTTP bearer authorization for remote files. If `True`, or not specified, will use the token generated when running `boincai-cli login` (stored in `~/.boincai`).
* **revision** (`str`, *optional*, defaults to `"main"`) — The specific model version to use. Because models and other artifacts on boincai.com are stored in a git-based system, `revision` can be any identifier allowed by git: a branch name, a tag name, or a commit id.

Instantiate a type of [ImageProcessingMixin](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/image_processor#transformers.ImageProcessingMixin) from an image processor.

Examples:

```
from transformers import CLIPImageProcessor

# The base class *ImageProcessingMixin* cannot be instantiated directly, so the examples use a
# derived class: *CLIPImageProcessor*.

# Download the image processor config from boincai.com and cache it.
image_processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Load from a directory, e.g. one written by *save_pretrained('./test/saved_model/')*.
image_processor = CLIPImageProcessor.from_pretrained("./test/saved_model/")

# Load from the JSON config file itself.
image_processor = CLIPImageProcessor.from_pretrained("./test/saved_model/preprocessor_config.json")

# Keyword arguments that match processor attributes override the loaded values.
image_processor = CLIPImageProcessor.from_pretrained(
    "openai/clip-vit-base-patch32", do_normalize=False, foo=False
)
assert image_processor.do_normalize is False

# With return_unused_kwargs=True, keyword arguments that are not processor attributes are returned.
image_processor, unused_kwargs = CLIPImageProcessor.from_pretrained(
    "openai/clip-vit-base-patch32", do_normalize=False, foo=False, return_unused_kwargs=True
)
assert image_processor.do_normalize is False
assert unused_kwargs == {"foo": False}
```

**save\_pretrained**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/image_processing_utils.py#L206)

( save\_directory: typing.Union\[str, os.PathLike], push\_to\_hub: bool = False, \*\*kwargs )

Parameters

* **save\_directory** (`str` or `os.PathLike`) — Directory where the image processor JSON file will be saved (will be created if it does not exist).
* **push\_to\_hub** (`bool`, *optional*, defaults to `False`) — Whether or not to push your model to the BOINC AI model hub after saving it. You can specify the repository you want to push to with `repo_id` (will default to the name of `save_directory` in your namespace).
* **kwargs** (`Dict[str, Any]`, *optional*) — Additional keyword arguments passed along to the [push\_to\_hub()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/processors#transformers.ProcessorMixin.push_to_hub) method.

Save an image processor object to the directory `save_directory`, so that it can be re-loaded using the [from\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/image_processor#transformers.ImageProcessingMixin.from_pretrained) class method.
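The round trip between `save_pretrained` and `from_pretrained` can be sketched as follows. This is a minimal illustration, assuming `transformers` is installed with its vision dependencies; the processor is built from constructor defaults rather than downloaded, and the temporary directory is only a stand-in for a real `save_directory`.

```python
import os
import tempfile

from transformers import CLIPImageProcessor

# Build a processor from constructor defaults (no download), overriding one setting.
image_processor = CLIPImageProcessor(do_normalize=False)

with tempfile.TemporaryDirectory() as save_directory:
    # Writes the processor configuration as a JSON file into the directory.
    image_processor.save_pretrained(save_directory)
    saved_files = sorted(os.listdir(save_directory))

    # Reload from the same directory; the overridden setting should survive.
    reloaded = CLIPImageProcessor.from_pretrained(save_directory)

print(saved_files)
print(reloaded.do_normalize)
```

Passing `push_to_hub=True` would additionally upload the saved files, which requires authentication and is left out of this sketch.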

### BatchFeature

#### class transformers.BatchFeature

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/feature_extraction_utils.py#L61)

( data: typing.Union\[typing.Dict\[str, typing.Any], NoneType] = None, tensor\_type: typing.Union\[NoneType, str, transformers.utils.generic.TensorType] = None )

Parameters

* **data** (`dict`) — Dictionary of lists/arrays/tensors returned by the `__call__`/`pad` methods (`input_values`, `attention_mask`, etc.).
* **tensor\_type** (`Union[None, str, TensorType]`, *optional*) — You can give a `tensor_type` here to convert the lists of integers to PyTorch/TensorFlow/NumPy tensors at initialization.

Holds the output of the [pad()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/feature_extractor#transformers.SequenceFeatureExtractor.pad) and feature extractor specific `__call__` methods.

This class is derived from a Python dictionary and can be used as a dictionary.
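The dictionary behaviour can be illustrated with a short sketch. The key name and values below are made up for the example; with no `tensor_type`, the data is stored as given.

```python
from transformers import BatchFeature

# Hypothetical toy data; real processors return keys such as "pixel_values".
features = BatchFeature(data={"pixel_values": [[0.0, 0.5], [1.0, 0.25]]})

# Standard dictionary operations work on the instance.
keys = list(features.keys())
first_row = features["pixel_values"][0]
has_pixels = "pixel_values" in features

print(keys, first_row, has_pixels)
```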

**convert\_to\_tensors**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/feature_extraction_utils.py#L115)

( tensor\_type: typing.Union\[str, transformers.utils.generic.TensorType, NoneType] = None )

Parameters

* **tensor\_type** (`str` or [TensorType](https://huggingface.co/docs/transformers/v4.34.1/en/internal/file_utils#transformers.TensorType), *optional*) — The type of tensors to use. If `str`, should be one of the values of the enum [TensorType](https://huggingface.co/docs/transformers/v4.34.1/en/internal/file_utils#transformers.TensorType). If `None`, no modification is done.

Convert the inner content to tensors.
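As a small sketch of the conversion, the snippet below turns nested Python lists into NumPy arrays after construction. It assumes `"np"` is an accepted string value of the `TensorType` enum; the data itself is made up.

```python
import numpy as np
from transformers import BatchFeature

# Start from plain Python lists (no tensor_type given at construction).
features = BatchFeature(data={"pixel_values": [[0.0, 0.5], [1.0, 0.25]]})

# Convert the stored values in place; the call returns the same BatchFeature.
features = features.convert_to_tensors(tensor_type="np")

pixel_values = features["pixel_values"]
print(type(pixel_values), pixel_values.shape)
```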

**to**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/feature_extraction_utils.py#L188)

( \*args, \*\*kwargs ) → [BatchFeature](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/image_processor#transformers.BatchFeature)

Parameters

* **args** (`Tuple`) — Will be passed to the `to(...)` function of the tensors.
* **kwargs** (`Dict`, *optional*) — Will be passed to the `to(...)` function of the tensors.

Returns

[BatchFeature](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/image_processor#transformers.BatchFeature)

The same instance after modification.

Send all values to a device by calling `v.to(*args, **kwargs)` (PyTorch only). This should support casting to different `dtypes` and sending the `BatchFeature` to a different `device`.
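A minimal sketch of both uses, assuming PyTorch is installed. The data is made up, and `"cpu"` is used as the target device so the example runs without a GPU.

```python
import torch
from transformers import BatchFeature

# Build a batch of PyTorch tensors from toy data.
features = BatchFeature(
    data={"pixel_values": [[0.0, 0.5], [1.0, 0.25]]}, tensor_type="pt"
)

# Cast the floating-point values to half precision.
features = features.to(torch.float16)
dtype_after_cast = features["pixel_values"].dtype

# Move the values to a device; replace "cpu" with e.g. "cuda" when available.
features = features.to("cpu")
device_type = features["pixel_values"].device.type

print(dtype_after_cast, device_type)
```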

### BaseImageProcessor

#### class transformers.image\_processing\_utils.BaseImageProcessor

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/image_processing_utils.py#L540)

( \*\*kwargs )

**center\_crop**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/image_processing_utils.py#L620)

( image: ndarray, size: typing.Dict\[str, int], data\_format: typing.Union\[transformers.image\_utils.ChannelDimension, str, NoneType] = None, input\_data\_format: typing.Union\[transformers.image\_utils.ChannelDimension, str, NoneType] = None, \*\*kwargs )

Parameters

* **image** (`np.ndarray`) — Image to center crop.
* **size** (`Dict[str, int]`) — Size of the output image.
* **data\_format** (`str` or `ChannelDimension`, *optional*) — The channel dimension format for the output image. If unset, the channel dimension format of the input image is used. Can be one of:
  * `"channels_first"` or `ChannelDimension.FIRST`: image in (num\_channels, height, width) format.
  * `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num\_channels) format.
* **input\_data\_format** (`ChannelDimension` or `str`, *optional*) — The channel dimension format for the input image. If unset, the channel dimension format is inferred from the input image. Can be one of:
  * `"channels_first"` or `ChannelDimension.FIRST`: image in (num\_channels, height, width) format.
  * `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num\_channels) format.

Center crop an image to `(size["height"], size["width"])`. If the input is smaller than the requested `size` along any edge, the image is padded with zeros and then center cropped.
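The pad-then-crop behaviour described above can be sketched in plain NumPy. This is an illustration, not the library's implementation: `center_crop_sketch` is a hypothetical helper, it handles only channels-last `(height, width, num_channels)` input, and the way odd offsets are rounded is an assumption.

```python
import numpy as np


def center_crop_sketch(image: np.ndarray, size: dict) -> np.ndarray:
    """Center crop a channels-last image, zero-padding it first if it is too small."""
    crop_h, crop_w = size["height"], size["width"]
    height, width = image.shape[:2]

    # Pad with zeros so that the image is at least as large as the crop.
    pad_h = max(crop_h - height, 0)
    pad_w = max(crop_w - width, 0)
    padded = np.pad(
        image,
        ((pad_h // 2, pad_h - pad_h // 2), (pad_w // 2, pad_w - pad_w // 2), (0, 0)),
    )

    # Take the central window of the requested size.
    top = (padded.shape[0] - crop_h) // 2
    left = (padded.shape[1] - crop_w) // 2
    return padded[top : top + crop_h, left : left + crop_w]


# A 4x4 image cropped to its central 2x2 window.
large = np.arange(16).reshape(4, 4, 1)
cropped = center_crop_sketch(large, {"height": 2, "width": 2})

# A 2x2 image "cropped" to 4x4: zeros are added around the original pixels.
small = np.ones((2, 2, 1))
padded_crop = center_crop_sketch(small, {"height": 4, "width": 4})
```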

**normalize**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/image_processing_utils.py#L583)

( image: ndarray, mean: typing.Union\[float, typing.Iterable\[float]], std: typing.Union\[float, typing.Iterable\[float]], data\_format: typing.Union\[transformers.image\_utils.ChannelDimension, str, NoneType] = None, input\_data\_format: typing.Union\[transformers.image\_utils.ChannelDimension, str, NoneType] = None, \*\*kwargs ) → `np.ndarray`

Parameters

* **image** (`np.ndarray`) — Image to normalize.
* **mean** (`float` or `Iterable[float]`) — Image mean to use for normalization.
* **std** (`float` or `Iterable[float]`) — Image standard deviation to use for normalization.
* **data\_format** (`str` or `ChannelDimension`, *optional*) — The channel dimension format for the output image. If unset, the channel dimension format of the input image is used. Can be one of:
  * `"channels_first"` or `ChannelDimension.FIRST`: image in (num\_channels, height, width) format.
  * `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num\_channels) format.
* **input\_data\_format** (`ChannelDimension` or `str`, *optional*) — The channel dimension format for the input image. If unset, the channel dimension format is inferred from the input image. Can be one of:
  * `"channels_first"` or `ChannelDimension.FIRST`: image in (num\_channels, height, width) format.
  * `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num\_channels) format.

Returns

`np.ndarray`

The normalized image.

Normalize an image: `image = (image - image_mean) / image_std`.
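To show how per-channel statistics interact with the two channel layouts, here is a NumPy sketch of the formula. `normalize_sketch` and its `channels_last` flag are hypothetical names for illustration only; they are not part of the library.

```python
import numpy as np


def normalize_sketch(image, mean, std, channels_last=True):
    """Apply (image - mean) / std with per-channel (or scalar) statistics."""
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    if not channels_last:
        # Reshape to (num_channels, 1, 1) so the statistics broadcast over height and width.
        mean = mean.reshape(-1, 1, 1)
        std = std.reshape(-1, 1, 1)
    return (image.astype(np.float32) - mean) / std


mean = [0.5, 0.5, 0.5]
std = [0.5, 0.5, 0.5]

# (height, width, num_channels): statistics broadcast along the last axis.
normalized_last = normalize_sketch(np.ones((2, 2, 3)), mean, std)

# (num_channels, height, width): statistics broadcast along the first axis.
normalized_first = normalize_sketch(np.ones((3, 2, 2)), mean, std, channels_last=False)
```

With pixel values of 1.0, a mean of 0.5, and a standard deviation of 0.5, every output value is (1.0 - 0.5) / 0.5 = 1.0 in both layouts.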

**rescale**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/image_processing_utils.py#L551)

( image: ndarray, scale: float, data\_format: typing.Union\[transformers.image\_utils.ChannelDimension, str, NoneType] = None, input\_data\_format: typing.Union\[transformers.image\_utils.ChannelDimension, str, NoneType] = None, \*\*kwargs ) → `np.ndarray`

Parameters

* **image** (`np.ndarray`) — Image to rescale.
* **scale** (`float`) — The scaling factor to rescale pixel values by.
* **data\_format** (`str` or `ChannelDimension`, *optional*) — The channel dimension format for the output image. If unset, the channel dimension format of the input image is used. Can be one of:
  * `"channels_first"` or `ChannelDimension.FIRST`: image in (num\_channels, height, width) format.
  * `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num\_channels) format.
* **input\_data\_format** (`ChannelDimension` or `str`, *optional*) — The channel dimension format for the input image. If unset, the channel dimension format is inferred from the input image. Can be one of:
  * `"channels_first"` or `ChannelDimension.FIRST`: image in (num\_channels, height, width) format.
  * `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num\_channels) format.

Returns

`np.ndarray`

The rescaled image.

Rescale an image by a scale factor: `image = image * scale`.
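A common use of rescaling is mapping 8-bit pixel values into the range [0, 1] with `scale = 1 / 255`. The NumPy sketch below illustrates the arithmetic only; `rescale_sketch` is a hypothetical helper, and returning `float32` is an assumption of this example.

```python
import numpy as np


def rescale_sketch(image: np.ndarray, scale: float) -> np.ndarray:
    """Multiply every pixel value by `scale` and return float32 values."""
    return (image * scale).astype(np.float32)


# 8-bit pixel values covering the full range.
pixels = np.array([[0, 128, 255]], dtype=np.uint8)
rescaled = rescale_sketch(pixels, scale=1 / 255)
```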

