AutoFeatureExtractor

class transformers.AutoFeatureExtractor

( )

This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the library when created with the AutoFeatureExtractor.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (throws an error).
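The factory pattern described above can be pictured with a minimal pure-Python sketch. The names `AutoWidget` and `ConcreteWidget` are hypothetical and the error type is an assumption; this only illustrates the idea of a class that refuses direct construction and hands out concrete instances through a class method, and is not the library's code.

```python
class ConcreteWidget:
    """Hypothetical concrete class that the factory hands out."""

    def __init__(self, name: str) -> None:
        self.name = name


class AutoWidget:
    """Hypothetical factory class: usable only through from_pretrained()."""

    def __init__(self) -> None:
        # Direct construction is rejected on purpose.
        raise OSError(
            "AutoWidget is designed to be instantiated using "
            "AutoWidget.from_pretrained(name)."
        )

    @classmethod
    def from_pretrained(cls, name: str) -> ConcreteWidget:
        # A real factory would pick the concrete class from a config here.
        return ConcreteWidget(name)


widget = AutoWidget.from_pretrained("demo")
print(type(widget).__name__)

try:
    AutoWidget()
except OSError as err:
    print(f"direct instantiation failed: {err}")
```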

from_pretrained

( pretrained_model_name_or_path, **kwargs )

Parameters

pretrained_model_name_or_path (str or os.PathLike) — This can be either:
- a string, the model id of a pretrained feature_extractor hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- a path to a directory containing a feature extractor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
- a path or url to a saved feature extractor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether or not to force to (re-)download the feature extractor files and override the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to keep an incompletely received file and attempt to resume the download from it, rather than deleting it and starting over.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
return_unused_kwargs (bool, optional, defaults to False) — If False, then this function returns just the final feature extractor object. If True, then this function returns a Tuple(feature_extractor, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs which has not been used to update feature_extractor and is otherwise ignored.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (Dict[str, Any], optional) — The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.
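
The interplay between kwargs and return_unused_kwargs described above can be illustrated with a minimal pure-Python sketch. The helper `apply_overrides` and the sample attribute values are hypothetical; they only mimic the documented behavior and are not the library's implementation.

```python
from typing import Any, Dict, Tuple


def apply_overrides(
    loaded: Dict[str, Any], **kwargs: Any
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: split kwargs into overrides and unused pairs."""
    attributes = dict(loaded)  # copy so the loaded values stay untouched
    unused: Dict[str, Any] = {}
    for key, value in kwargs.items():
        if key in attributes:
            attributes[key] = value  # known attribute: override the loaded value
        else:
            unused[key] = value  # unknown key: hand it back to the caller
    return attributes, unused


# Made-up values standing in for a loaded feature extractor configuration.
loaded_config = {"do_normalize": True, "sampling_rate": 16000}

attributes, unused = apply_overrides(loaded_config, do_normalize=False, foo=False)
print(attributes)  # {'do_normalize': False, 'sampling_rate': 16000}
print(unused)      # {'foo': False}
```

In this sketch, `do_normalize` is a known attribute and is overridden, while `foo` is not and ends up in the unused dictionary, which corresponds to what a caller would receive with return_unused_kwargs=True.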

Instantiate one of the feature extractor classes of the library from a pretrained model's feature extractor configuration.

The feature extractor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

audio-spectrogram-transformer — ASTFeatureExtractor (Audio Spectrogram Transformer model)
beit — BeitFeatureExtractor (BEiT model)
chinese_clip — ChineseCLIPFeatureExtractor (Chinese-CLIP model)
clap — ClapFeatureExtractor (CLAP model)
clip — CLIPFeatureExtractor (CLIP model)
clipseg — ViTFeatureExtractor (CLIPSeg model)
conditional_detr — ConditionalDetrFeatureExtractor (Conditional DETR model)
convnext — ConvNextFeatureExtractor (ConvNeXT model)
cvt — ConvNextFeatureExtractor (CvT model)
data2vec-audio — Wav2Vec2FeatureExtractor (Data2VecAudio model)
data2vec-vision — BeitFeatureExtractor (Data2VecVision model)
deformable_detr — DeformableDetrFeatureExtractor (Deformable DETR model)
deit — DeiTFeatureExtractor (DeiT model)
detr — DetrFeatureExtractor (DETR model)
dinat — ViTFeatureExtractor (DiNAT model)
donut-swin — DonutFeatureExtractor (DonutSwin model)
dpt — DPTFeatureExtractor (DPT model)
encodec — EncodecFeatureExtractor (EnCodec model)
flava — FlavaFeatureExtractor (FLAVA model)
glpn — GLPNFeatureExtractor (GLPN model)
groupvit — CLIPFeatureExtractor (GroupViT model)
hubert — Wav2Vec2FeatureExtractor (Hubert model)
imagegpt — ImageGPTFeatureExtractor (ImageGPT model)
layoutlmv2 — LayoutLMv2FeatureExtractor (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3FeatureExtractor (LayoutLMv3 model)
levit — LevitFeatureExtractor (LeViT model)
maskformer — MaskFormerFeatureExtractor (MaskFormer model)
mctct — MCTCTFeatureExtractor (M-CTC-T model)
mobilenet_v1 — MobileNetV1FeatureExtractor (MobileNetV1 model)
mobilenet_v2 — MobileNetV2FeatureExtractor (MobileNetV2 model)
mobilevit — MobileViTFeatureExtractor (MobileViT model)
nat — ViTFeatureExtractor (NAT model)
owlvit — OwlViTFeatureExtractor (OWL-ViT model)
perceiver — PerceiverFeatureExtractor (Perceiver model)
poolformer — PoolFormerFeatureExtractor (PoolFormer model)
pop2piano — Pop2PianoFeatureExtractor (Pop2Piano model)
regnet — ConvNextFeatureExtractor (RegNet model)
resnet — ConvNextFeatureExtractor (ResNet model)
segformer — SegformerFeatureExtractor (SegFormer model)
sew — Wav2Vec2FeatureExtractor (SEW model)
sew-d — Wav2Vec2FeatureExtractor (SEW-D model)
speech_to_text — Speech2TextFeatureExtractor (Speech2Text model)
speecht5 — SpeechT5FeatureExtractor (SpeechT5 model)
swiftformer — ViTFeatureExtractor (SwiftFormer model)
swin — ViTFeatureExtractor (Swin Transformer model)
swinv2 — ViTFeatureExtractor (Swin Transformer V2 model)
table-transformer — DetrFeatureExtractor (Table Transformer model)
timesformer — VideoMAEFeatureExtractor (TimeSformer model)
tvlt — TvltFeatureExtractor (TVLT model)
unispeech — Wav2Vec2FeatureExtractor (UniSpeech model)
unispeech-sat — Wav2Vec2FeatureExtractor (UniSpeechSat model)
van — ConvNextFeatureExtractor (VAN model)
videomae — VideoMAEFeatureExtractor (VideoMAE model)
vilt — ViltFeatureExtractor (ViLT model)
vit — ViTFeatureExtractor (ViT model)
vit_mae — ViTFeatureExtractor (ViTMAE model)
vit_msn — ViTFeatureExtractor (ViTMSN model)
wav2vec2 — Wav2Vec2FeatureExtractor (Wav2Vec2 model)
wav2vec2-conformer — Wav2Vec2FeatureExtractor (Wav2Vec2-Conformer model)
wavlm — Wav2Vec2FeatureExtractor (WavLM model)
whisper — WhisperFeatureExtractor (Whisper model)
xclip — CLIPFeatureExtractor (X-CLIP model)
yolos — YolosFeatureExtractor (YOLOS model)
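
The selection rule described above (look up model_type first, then fall back to pattern matching on the name) can be sketched in plain Python. The function `resolve_class_name` is hypothetical, the mapping is only a small subset of the list above, and the longest-key-first matching order is an assumption made for this illustration rather than the library's actual logic.

```python
from typing import Optional

# A small subset of the list above (model_type -> feature extractor class name).
FEATURE_EXTRACTOR_NAMES = {
    "beit": "BeitFeatureExtractor",
    "sew": "Wav2Vec2FeatureExtractor",
    "sew-d": "Wav2Vec2FeatureExtractor",
    "vit": "ViTFeatureExtractor",
    "wav2vec2": "Wav2Vec2FeatureExtractor",
    "whisper": "WhisperFeatureExtractor",
}


def resolve_class_name(model_type: Optional[str], name_or_path: str) -> str:
    """Hypothetical resolver: exact model_type lookup, then name matching."""
    if model_type is not None and model_type in FEATURE_EXTRACTOR_NAMES:
        return FEATURE_EXTRACTOR_NAMES[model_type]
    # Fallback: search for a known key inside the name. Longer keys are tried
    # first so that a specific key such as "sew-d" wins over "sew".
    lowered = name_or_path.lower()
    for key in sorted(FEATURE_EXTRACTOR_NAMES, key=len, reverse=True):
        if key in lowered:
            return FEATURE_EXTRACTOR_NAMES[key]
    raise ValueError(f"Unrecognized feature extractor for {name_or_path!r}")


# model_type is known: the path itself does not matter.
print(resolve_class_name("whisper", "./my_model_directory/"))
# model_type is missing: fall back to matching on the name.
print(resolve_class_name(None, "facebook/wav2vec2-base-960h"))
```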

Passing token=True is required when you want to use a private model.

Examples:

>>> from transformers import AutoFeatureExtractor

>>> # Download feature extractor from huggingface.co and cache.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If the feature extractor files are in a directory (e.g. the feature extractor was saved using save_pretrained('./test/saved_model/'))
>>> # feature_extractor = AutoFeatureExtractor.from_pretrained("./test/saved_model/")

register

( config_class, feature_extractor_class, exist_ok = False )

Parameters

config_class (PretrainedConfig) — The configuration corresponding to the model to register.
feature_extractor_class (FeatureExtractionMixin) — The feature extractor to register.
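
The signature above suggests a registry keyed by configuration class. A minimal pure-Python sketch of that idea follows. All names here (`FeatureExtractorRegistry`, `CustomConfig`, `CustomFeatureExtractor`, `OtherFeatureExtractor`) are hypothetical, and the meaning given to exist_ok (re-registering an existing key raises an error unless exist_ok is True) is an assumption, since the parameter is not described in this section.

```python
from typing import Dict


class FeatureExtractorRegistry:
    """Hypothetical registry mapping config classes to feature extractor classes."""

    def __init__(self) -> None:
        self._mapping: Dict[type, type] = {}

    def register(
        self, config_class: type, feature_extractor_class: type, exist_ok: bool = False
    ) -> None:
        # Assumed semantics: refuse to overwrite an entry unless exist_ok is True.
        if config_class in self._mapping and not exist_ok:
            raise ValueError(f"{config_class.__name__} is already registered")
        self._mapping[config_class] = feature_extractor_class

    def lookup(self, config_class: type) -> type:
        return self._mapping[config_class]


# Placeholder classes standing in for a custom config and feature extractors.
class CustomConfig:
    pass


class CustomFeatureExtractor:
    pass


class OtherFeatureExtractor:
    pass


registry = FeatureExtractorRegistry()
registry.register(CustomConfig, CustomFeatureExtractor)
print(registry.lookup(CustomConfig).__name__)

# Overwriting the entry is only allowed when exist_ok=True in this sketch.
registry.register(CustomConfig, OtherFeatureExtractor, exist_ok=True)
print(registry.lookup(CustomConfig).__name__)
```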

