Image Processor
An image processor is in charge of preparing input features for vision models and post-processing their outputs. This includes transformations such as resizing, normalization, and conversion to PyTorch, TensorFlow, Flax, and NumPy tensors. It may also include model-specific post-processing, such as converting logits to segmentation masks.
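For instance, a typical preprocessing call looks like the following sketch (the checkpoint id is illustrative; it assumes PyTorch and hub access are available):

```python
import numpy as np
from PIL import Image
from transformers import AutoImageProcessor

# Illustrative checkpoint id; any vision checkpoint with a preprocessor config works.
processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")

# A dummy RGB image stands in for a real photo here.
image = Image.fromarray(np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8))

# Resize, rescale, normalize and convert to a batched PyTorch tensor.
inputs = processor(images=image, return_tensors="pt")
print(inputs["pixel_values"].shape)  # e.g. torch.Size([1, 3, 224, 224])
```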
class transformers.ImageProcessingMixin
( **kwargs )
This is an image processor mixin used to provide saving/loading functionality for sequential and image feature extractors.
from_pretrained
( pretrained_model_name_or_path: typing.Union[str, os.PathLike], cache_dir: typing.Union[str, os.PathLike, NoneType] = None, force_download: bool = False, local_files_only: bool = False, token: typing.Union[bool, str, NoneType] = None, revision: str = 'main', **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — This can be either:
a string, the model id of a pretrained image_processor hosted inside a model repo on boincai.com. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
a path to a directory containing an image processor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
a path or url to a saved image processor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model image processor should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the image processor files and override the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete an incompletely received file. Attempts to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use the token generated when running boincai-cli login (stored in ~/.boincai).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on boincai.com, so revision can be any identifier allowed by git.
Instantiate a type of ImageProcessingMixin from an image processor.
Examples:
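A minimal sketch of the loading paths listed above (the CLIPImageProcessor class and model id are illustrative; any image processor subclass behaves the same way):

```python
from transformers import CLIPImageProcessor

# Download the preprocessor config from a hub repo and cache it (illustrative model id).
image_processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Load from a local directory previously populated by save_pretrained().
image_processor = CLIPImageProcessor.from_pretrained("./my_model_directory/")

# Load directly from a saved JSON file.
image_processor = CLIPImageProcessor.from_pretrained("./my_model_directory/preprocessor_config.json")

# Preprocessing attributes can be overridden at load time via **kwargs.
image_processor = CLIPImageProcessor.from_pretrained(
    "openai/clip-vit-base-patch32", do_normalize=False
)
```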
save_pretrained
( save_directory: typing.Union[str, os.PathLike], push_to_hub: bool = False, **kwargs )
Parameters
save_directory (str or os.PathLike) — Directory where the image processor JSON file will be saved (will be created if it does not exist).
push_to_hub (bool, optional, defaults to False) — Whether or not to push your model to the BOINC AI model hub after saving it. You can specify the repository you want to push to with repo_id (will default to the name of save_directory in your namespace).
kwargs (Dict[str, Any], optional) — Additional key word arguments passed along to the push_to_hub() method.
Save an image processor object to the directory save_directory, so that it can be re-loaded using the from_pretrained() class method.
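A short sketch of the save and reload round trip (directory, repo, and model ids are illustrative):

```python
from transformers import CLIPImageProcessor

image_processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Write preprocessor_config.json into the directory (created if it does not exist).
image_processor.save_pretrained("./my_model_directory/")

# The processor can then be reloaded from that directory.
reloaded = CLIPImageProcessor.from_pretrained("./my_model_directory/")

# push_to_hub=True would also upload it; repo_id (illustrative) selects the target repo.
# image_processor.save_pretrained("./my_model_directory/", push_to_hub=True, repo_id="my-user/my-image-processor")
```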
class transformers.BatchFeature
( data: typing.Union[typing.Dict[str, typing.Any], NoneType] = None, tensor_type: typing.Union[NoneType, str, transformers.utils.generic.TensorType] = None )
Parameters
data (dict) — Dictionary of lists/arrays/tensors returned by the __call__/pad methods ('input_values', 'attention_mask', etc.).
tensor_type (Union[None, str, TensorType], optional) — You can give a tensor_type here to convert the lists of integers into PyTorch/TensorFlow/NumPy tensors at initialization.
Holds the output of the pad() and feature extractor specific __call__ methods.
This class is derived from a Python dictionary and can be used as a dictionary.
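For example, a BatchFeature built from made-up data can be indexed like an ordinary dict (the pixel_values entry below is only illustrative):

```python
import numpy as np
from transformers import BatchFeature

features = BatchFeature(data={"pixel_values": [np.zeros((3, 224, 224), dtype=np.float32)]})

# Dictionary-style access and iteration work directly.
print(list(features.keys()))              # ['pixel_values']
print(features["pixel_values"][0].shape)  # (3, 224, 224)
```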
convert_to_tensors
( tensor_type: typing.Union[str, transformers.utils.generic.TensorType, NoneType] = None )
Parameters
tensor_type (str or TensorType, optional) — The type of tensors to use. If str, should be one of the values of the enum TensorType. If None, no modification is done.
Convert the inner content to tensors.
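A minimal sketch, assuming PyTorch is installed and using "pt" as the TensorType value:

```python
import numpy as np
from transformers import BatchFeature

features = BatchFeature(data={"pixel_values": [np.zeros((3, 224, 224), dtype=np.float32)]})

# Converts the inner lists/arrays in place and returns the same instance.
features = features.convert_to_tensors(tensor_type="pt")
print(features["pixel_values"].shape)  # torch.Size([1, 3, 224, 224])
```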
to
( *args, **kwargs ) → BatchFeature
Parameters
args (Tuple) — Will be passed to the to(...) function of the tensors.
kwargs (Dict, optional) — Will be passed to the to(...) function of the tensors.
Returns
BatchFeature
The same instance after modification.
Send all values to device by calling v.to(*args, **kwargs) (PyTorch only). This should support casting in different dtypes and sending the BatchFeature to a different device.
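A hedged sketch of dtype casting and device placement (the CUDA device is only used if one is available):

```python
import torch
from transformers import BatchFeature

features = BatchFeature(data={"pixel_values": [[0.0, 1.0, 2.0]]}, tensor_type="pt")

# Cast floating-point tensors to half precision; arguments are forwarded to each tensor's .to(...).
features = features.to(torch.float16)

# Move everything to a GPU when present, otherwise stay on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
features = features.to(device)
print(features["pixel_values"].dtype, features["pixel_values"].device)
```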
class transformers.BaseImageProcessor
( **kwargs )
center_crop
( image: ndarray, size: typing.Dict[str, int], data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, input_data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, **kwargs )
Parameters
image (np.ndarray) — Image to center crop.
size (Dict[str, int]) — Size of the output image.
data_format (str or ChannelDimension, optional) — The channel dimension format for the output image. If unset, the channel dimension format of the input image is used. Can be one of:
"channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
"channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
input_data_format (ChannelDimension or str, optional) — The channel dimension format for the input image. If unset, the channel dimension format is inferred from the input image. Can be one of:
"channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
"channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
Center crop an image to (size["height"], size["width"]). If the input size is smaller than crop_size along any edge, the image is padded with 0's and then center cropped.
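A minimal sketch using an illustrative concrete subclass (ViTImageProcessor), since these helpers are inherited from the base class:

```python
import numpy as np
from transformers import ViTImageProcessor  # illustrative subclass; center_crop comes from the base class

processor = ViTImageProcessor()

# A dummy 300x400 RGB image in channels-last format.
image = np.random.randint(0, 256, (300, 400, 3), dtype=np.uint8)

cropped = processor.center_crop(image, size={"height": 224, "width": 224})
print(cropped.shape)  # (224, 224, 3), keeping the input's channels-last format
```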
normalize
( image: ndarray, mean: typing.Union[float, typing.Iterable[float]], std: typing.Union[float, typing.Iterable[float]], data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, input_data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, **kwargs ) → np.ndarray
Parameters
image (np.ndarray) — Image to normalize.
mean (float or Iterable[float]) — Image mean to use for normalization.
std (float or Iterable[float]) — Image standard deviation to use for normalization.
data_format (str or ChannelDimension, optional) — The channel dimension format for the output image. If unset, the channel dimension format of the input image is used. Can be one of:
"channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
"channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
input_data_format (ChannelDimension or str, optional) — The channel dimension format for the input image. If unset, the channel dimension format is inferred from the input image. Can be one of:
"channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
"channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
Returns
np.ndarray
The normalized image.
Normalize an image. image = (image - image_mean) / image_std.
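A minimal sketch, again using ViTImageProcessor as an illustrative subclass, with per-channel mean/std values chosen for the example:

```python
import numpy as np
from transformers import ViTImageProcessor  # illustrative subclass; normalize is inherited from the base class

processor = ViTImageProcessor()

# A dummy image already rescaled to the [0, 1] range, channels-last.
image = np.random.rand(224, 224, 3).astype(np.float32)

normalized = processor.normalize(image, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
print(normalized.min(), normalized.max())  # roughly within [-1, 1]
```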
rescale
( image: ndarray, scale: float, data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, input_data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, **kwargs ) → np.ndarray
Parameters
image (np.ndarray) — Image to rescale.
scale (float) — The scaling factor to rescale pixel values by.
data_format (str or ChannelDimension, optional) — The channel dimension format for the output image. If unset, the channel dimension format of the input image is used. Can be one of:
"channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
"channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
input_data_format (ChannelDimension or str, optional) — The channel dimension format for the input image. If unset, the channel dimension format is inferred from the input image. Can be one of:
"channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
"channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
Returns
np.ndarray
The rescaled image.
Rescale an image by a scale factor. image = image * scale.
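A minimal sketch, using ViTImageProcessor as an illustrative subclass and 1/255 as the scale factor:

```python
import numpy as np
from transformers import ViTImageProcessor  # illustrative subclass; rescale is inherited from the base class

processor = ViTImageProcessor()

# uint8 pixel values in [0, 255].
image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)

# Multiply every pixel by 1/255 to map the values into [0, 1].
rescaled = processor.rescale(image, scale=1 / 255)
print(rescaled.dtype, rescaled.max())  # float32, values <= 1.0
```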