Image Processor
An image processor prepares input features for vision models and post-processes their outputs. This includes transformations such as resizing, normalization, and conversion to PyTorch, TensorFlow, Flax, and NumPy tensors. It may also include model-specific post-processing, such as converting logits to segmentation masks.
ImageProcessingMixin
class transformers.ImageProcessingMixin
( **kwargs )
This is an image processor mixin used to provide saving/loading functionality for sequential and image feature extractors.
from_pretrained
( pretrained_model_name_or_path: typing.Union[str, os.PathLike], cache_dir: typing.Union[str, os.PathLike, NoneType] = None, force_download: bool = False, local_files_only: bool = False, token: typing.Union[bool, str, NoneType] = None, revision: str = 'main', **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike): This can be either:
  - a string, the model id of a pretrained image_processor hosted inside a model repo on boincai.com. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - a path to a directory containing an image processor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - a path or url to a saved image processor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
cache_dir (str or os.PathLike, optional): Path to a directory in which a downloaded pretrained model image processor should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False): Whether or not to force a (re-)download of the image processor files and override the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Attempts to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional): The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use the token generated when running boincai-cli login (stored in ~/.boincai).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on boincai.com, so revision can be any identifier allowed by git.
Instantiate a type of ImageProcessingMixin from an image processor.
save_pretrained
( save_directory: typing.Union[str, os.PathLike], push_to_hub: bool = False, **kwargs )
Parameters
save_directory (str or os.PathLike): Directory where the image processor JSON file will be saved (will be created if it does not exist).
push_to_hub (bool, optional, defaults to False): Whether or not to push your model to the BOINC AI model hub after saving it. You can specify the repository you want to push to with repo_id (will default to the name of save_directory in your namespace).
kwargs (Dict[str, Any], optional): Additional keyword arguments passed along to the push_to_hub() method.
Save an image processor object to the directory save_directory, so that it can be re-loaded using the from_pretrained() class method.
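The save and re-load round trip described above can be illustrated with a small stand-in. This is only a sketch under stated assumptions: ToyImageProcessor is a hypothetical class and not part of the library, the attributes do_resize and size are made up for the example, and the configuration is assumed to be a flat set of JSON-serializable attributes written to a file named preprocessor_config.json (the file name that appears in the from_pretrained path example). It handles local paths only.

```python
import json
import os
import tempfile


class ToyImageProcessor:
    """Hypothetical stand-in that mimics the save/load pattern; not the library class."""

    # File name taken from the path example in the from_pretrained parameters.
    CONFIG_NAME = "preprocessor_config.json"

    def __init__(self, **kwargs):
        # Store every keyword argument as an attribute, as a config-style object would.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def save_pretrained(self, save_directory):
        # Create the directory if it does not exist, then write the attributes as JSON.
        os.makedirs(save_directory, exist_ok=True)
        path = os.path.join(save_directory, self.CONFIG_NAME)
        with open(path, "w", encoding="utf-8") as f:
            json.dump(vars(self), f, indent=2, sort_keys=True)
        return path

    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path):
        # Accept either a directory or a direct path to the JSON file (local only here).
        path = pretrained_model_name_or_path
        if os.path.isdir(path):
            path = os.path.join(path, cls.CONFIG_NAME)
        with open(path, "r", encoding="utf-8") as f:
            return cls(**json.load(f))


with tempfile.TemporaryDirectory() as tmp:
    original = ToyImageProcessor(do_resize=True, size={"height": 224, "width": 224})
    original.save_pretrained(tmp)
    reloaded = ToyImageProcessor.from_pretrained(tmp)

print(vars(reloaded) == vars(original))
```

The real mixin additionally resolves model ids, caching, and hub uploads, none of which this sketch attempts.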
BatchFeature
class transformers.BatchFeature
( data: typing.Union[typing.Dict[str, typing.Any], NoneType] = None, tensor_type: typing.Union[NoneType, str, transformers.utils.generic.TensorType] = None )
Parameters
data (dict): Dictionary of lists/arrays/tensors returned by the call/pad methods ('input_values', 'attention_mask', etc.).
tensor_type (Union[None, str, TensorType], optional): You can give a tensor_type here to convert the lists of integers into PyTorch/TensorFlow/NumPy tensors at initialization.
Holds the output of the pad() and feature-extractor-specific __call__ methods.
This class is derived from a python dictionary and can be used as a dictionary.
convert_to_tensors
( tensor_type: typing.Union[str, transformers.utils.generic.TensorType, NoneType] = None )
Parameters
tensor_type (str or TensorType, optional): The type of tensors to use. If str, should be one of the values of the enum TensorType. If None, no modification is done.
Convert the inner content to tensors.
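As a rough illustration of the dictionary behaviour and the tensor conversion described above, the sketch below builds a hypothetical ToyBatchFeature on collections.UserDict and handles NumPy only. It is not the library's implementation, and treating the string "np" as the NumPy tensor type is an assumption made for this example.

```python
from collections import UserDict

import numpy as np


class ToyBatchFeature(UserDict):
    """Hypothetical dict-like container; this sketch converts to NumPy only."""

    def __init__(self, data=None, tensor_type=None):
        super().__init__(data)
        self.convert_to_tensors(tensor_type)

    def convert_to_tensors(self, tensor_type=None):
        # None means the content is left untouched.
        if tensor_type is None:
            return self
        if tensor_type != "np":  # assumed type string for NumPy in this sketch
            raise ValueError("this sketch only supports tensor_type='np'")
        # Iterate over a copy of the keys so values can be replaced safely.
        for key in list(self.keys()):
            self[key] = np.asarray(self[key])
        return self


batch = ToyBatchFeature(
    {"input_values": [[0.1, 0.2], [0.3, 0.4]], "attention_mask": [[1, 1], [1, 0]]},
    tensor_type="np",
)
print(batch["input_values"].shape)
print(list(batch.keys()))
```

Because the container is a mapping, it can be indexed, iterated, and unpacked with ** like an ordinary dictionary.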
to
( *args, **kwargs ) → BatchFeature
Parameters
args (Tuple): Will be passed to the to(...) function of the tensors.
kwargs (Dict, optional): Will be passed to the to(...) function of the tensors.
Returns
The same instance after modification.
Send all values to device by calling v.to(*args, **kwargs) (PyTorch only). This should support casting in different dtypes and sending the BatchFeature to a different device.
BaseImageProcessor
class transformers.image_processing_utils.BaseImageProcessor
( **kwargs )
center_crop
( image: ndarray, size: typing.Dict[str, int], data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, input_data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, **kwargs )
Parameters
image (np.ndarray): Image to center crop.
size (Dict[str, int]): Size of the output image.
data_format (str or ChannelDimension, optional): The channel dimension format for the output image. If unset, the channel dimension format of the input image is used. Can be one of:
  - "channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
  - "channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
input_data_format (ChannelDimension or str, optional): The channel dimension format for the input image. If unset, the channel dimension format is inferred from the input image. Can be one of:
  - "channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
  - "channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
Center crop an image to (size["height"], size["width"]). If the input is smaller than the requested size along any edge, the image is padded with zeros and then center cropped.
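The crop-with-padding behaviour described above can be sketched in NumPy. This is a hedged illustration and not the library's code: center_crop_sketch is a hypothetical helper, it assumes a channels-last (height, width, num_channels) array, and it splits any zero padding evenly with the remainder going to the bottom and right, which may differ from the library's exact rounding.

```python
import numpy as np


def center_crop_sketch(image, size):
    """Hypothetical helper: center crop a (height, width, channels) array."""
    crop_h, crop_w = size["height"], size["width"]
    h, w = image.shape[:2]

    # Zero-pad any edge that is shorter than the requested crop.
    pad_h = max(crop_h - h, 0)
    pad_w = max(crop_w - w, 0)
    top_pad, left_pad = pad_h // 2, pad_w // 2
    padded = np.pad(
        image,
        ((top_pad, pad_h - top_pad), (left_pad, pad_w - left_pad), (0, 0)),
        mode="constant",
        constant_values=0,
    )

    # Take the central window of the (possibly padded) image.
    padded_h, padded_w = padded.shape[:2]
    top = (padded_h - crop_h) // 2
    left = (padded_w - crop_w) // 2
    return padded[top:top + crop_h, left:left + crop_w]


image = np.arange(4 * 6 * 3).reshape(4, 6, 3)
cropped = center_crop_sketch(image, {"height": 2, "width": 2})
print(cropped.shape)
```

For a channels-first input, one option is to transpose to channels-last before calling the helper and transpose back afterwards.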
normalize
( image: ndarray, mean: typing.Union[float, typing.Iterable[float]], std: typing.Union[float, typing.Iterable[float]], data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, input_data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, **kwargs ) → np.ndarray
Parameters
image (np.ndarray): Image to normalize.
mean (float or Iterable[float]): Image mean to use for normalization.
std (float or Iterable[float]): Image standard deviation to use for normalization.
data_format (str or ChannelDimension, optional): The channel dimension format for the output image. If unset, the channel dimension format of the input image is used. Can be one of:
  - "channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
  - "channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
input_data_format (ChannelDimension or str, optional): The channel dimension format for the input image. If unset, the channel dimension format is inferred from the input image. Can be one of:
  - "channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
  - "channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
Returns
np.ndarray
The normalized image.
Normalize an image. image = (image - image_mean) / image_std.
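The formula above maps directly onto NumPy broadcasting. The following is a minimal sketch, not the library's implementation: normalize_sketch and its channels_first flag are hypothetical names, and the function assumes per-channel statistics are given as a sequence with one entry per channel.

```python
import numpy as np


def _per_channel(stat, channels_first):
    """Shape a scalar or per-channel statistic so it broadcasts over the image."""
    stat = np.asarray(stat, dtype=np.float32)
    if stat.ndim == 1 and channels_first:
        # (C,) -> (C, 1, 1) so it lines up with a (C, H, W) image.
        return stat[:, None, None]
    # Scalars and channels-last (H, W, C) images broadcast without reshaping.
    return stat


def normalize_sketch(image, mean, std, channels_first=False):
    """Hypothetical helper computing (image - mean) / std."""
    image = np.asarray(image, dtype=np.float32)
    return (image - _per_channel(mean, channels_first)) / _per_channel(std, channels_first)


image = np.full((2, 2, 3), 1.0, dtype=np.float32)  # channels-last example
normalized = normalize_sketch(image, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
print(normalized.shape)
```

Casting to float first matters for integer images, since subtracting the mean from unsigned integers would otherwise wrap around.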
rescale
( image: ndarray, scale: float, data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, input_data_format: typing.Union[transformers.image_utils.ChannelDimension, str, NoneType] = None, **kwargs ) → np.ndarray
Parameters
image (np.ndarray): Image to rescale.
scale (float): The scaling factor to rescale pixel values by.
data_format (str or ChannelDimension, optional): The channel dimension format for the output image. If unset, the channel dimension format of the input image is used. Can be one of:
  - "channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
  - "channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
input_data_format (ChannelDimension or str, optional): The channel dimension format for the input image. If unset, the channel dimension format is inferred from the input image. Can be one of:
  - "channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
  - "channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
Returns
np.ndarray
The rescaled image.
Rescale an image by a scale factor. image = image * scale.
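A common use of this operation is mapping 8-bit pixel values into the range [0, 1] with a scale of 1/255. The sketch below shows that case in NumPy; rescale_sketch and its dtype argument are hypothetical and chosen for illustration, not taken from the library.

```python
import numpy as np


def rescale_sketch(image, scale, dtype=np.float32):
    """Hypothetical helper computing image * scale in floating point."""
    # Convert to float before multiplying so integer inputs do not overflow or truncate.
    scaled = np.asarray(image).astype(np.float64) * scale
    return scaled.astype(dtype)


pixels = np.array([[0, 128, 255]], dtype=np.uint8)
rescaled = rescale_sketch(pixels, scale=1 / 255)
print(rescaled.dtype, rescaled.shape)
```

Rescaling is typically applied before normalization, so that the mean and standard deviation are expressed on the same [0, 1] scale as the pixel values.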