AutoFeatureExtractor
class transformers.AutoFeatureExtractor
( )
This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the library when created with the AutoFeatureExtractor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
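A minimal sketch of that behavior (the specific exception type, EnvironmentError, is an assumption based on recent library versions):
>>> from transformers import AutoFeatureExtractor

>>> # Direct instantiation is not supported and raises an error.
>>> try:
...     AutoFeatureExtractor()
... except EnvironmentError:
...     print("use AutoFeatureExtractor.from_pretrained(...) instead")
use AutoFeatureExtractor.from_pretrained(...) instead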
from_pretrained
( pretrained_model_name_or_path, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) – This can be either:
  - a string, the model id of a pretrained feature extractor hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - a path to a directory containing a feature extractor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - a path or url to a saved feature extractor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
- cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) – Whether or not to force a (re-)download of the feature extractor files and override the cached versions if they exist.
- resume_download (bool, optional, defaults to False) – Whether or not to delete an incompletely received file. Attempts to resume the download if such a file exists.
- proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).
- revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False) – If False, this function returns just the final feature extractor object. If True, this function returns a Tuple(feature_extractor, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs which has not been used to update feature_extractor and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) – Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (Dict[str, Any], optional) – The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.
Instantiate one of the feature extractor classes of the library from a pretrained model.
The feature extractor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- audio-spectrogram-transformer → ASTFeatureExtractor (Audio Spectrogram Transformer model)
- beit → BeitFeatureExtractor (BEiT model)
- chinese_clip → ChineseCLIPFeatureExtractor (Chinese-CLIP model)
- clap → ClapFeatureExtractor (CLAP model)
- clip → CLIPFeatureExtractor (CLIP model)
- clipseg → ViTFeatureExtractor (CLIPSeg model)
- conditional_detr → ConditionalDetrFeatureExtractor (Conditional DETR model)
- convnext → ConvNextFeatureExtractor (ConvNeXT model)
- cvt → ConvNextFeatureExtractor (CvT model)
- data2vec-audio → Wav2Vec2FeatureExtractor (Data2VecAudio model)
- data2vec-vision → BeitFeatureExtractor (Data2VecVision model)
- deformable_detr → DeformableDetrFeatureExtractor (Deformable DETR model)
- deit → DeiTFeatureExtractor (DeiT model)
- detr → DetrFeatureExtractor (DETR model)
- dinat → ViTFeatureExtractor (DiNAT model)
- donut-swin → DonutFeatureExtractor (DonutSwin model)
- dpt → DPTFeatureExtractor (DPT model)
- encodec → EncodecFeatureExtractor (EnCodec model)
- flava → FlavaFeatureExtractor (FLAVA model)
- glpn → GLPNFeatureExtractor (GLPN model)
- groupvit → CLIPFeatureExtractor (GroupViT model)
- hubert → Wav2Vec2FeatureExtractor (Hubert model)
- imagegpt → ImageGPTFeatureExtractor (ImageGPT model)
- layoutlmv2 → LayoutLMv2FeatureExtractor (LayoutLMv2 model)
- layoutlmv3 → LayoutLMv3FeatureExtractor (LayoutLMv3 model)
- levit → LevitFeatureExtractor (LeViT model)
- maskformer → MaskFormerFeatureExtractor (MaskFormer model)
- mctct → MCTCTFeatureExtractor (M-CTC-T model)
- mobilenet_v1 → MobileNetV1FeatureExtractor (MobileNetV1 model)
- mobilenet_v2 → MobileNetV2FeatureExtractor (MobileNetV2 model)
- mobilevit → MobileViTFeatureExtractor (MobileViT model)
- nat → ViTFeatureExtractor (NAT model)
- owlvit → OwlViTFeatureExtractor (OWL-ViT model)
- perceiver → PerceiverFeatureExtractor (Perceiver model)
- poolformer → PoolFormerFeatureExtractor (PoolFormer model)
- pop2piano → Pop2PianoFeatureExtractor (Pop2Piano model)
- regnet → ConvNextFeatureExtractor (RegNet model)
- resnet → ConvNextFeatureExtractor (ResNet model)
- segformer → SegformerFeatureExtractor (SegFormer model)
- sew → Wav2Vec2FeatureExtractor (SEW model)
- sew-d → Wav2Vec2FeatureExtractor (SEW-D model)
- speech_to_text → Speech2TextFeatureExtractor (Speech2Text model)
- speecht5 → SpeechT5FeatureExtractor (SpeechT5 model)
- swiftformer → ViTFeatureExtractor (SwiftFormer model)
- swin → ViTFeatureExtractor (Swin Transformer model)
- swinv2 → ViTFeatureExtractor (Swin Transformer V2 model)
- table-transformer → DetrFeatureExtractor (Table Transformer model)
- timesformer → VideoMAEFeatureExtractor (TimeSformer model)
- tvlt → TvltFeatureExtractor (TVLT model)
- unispeech → Wav2Vec2FeatureExtractor (UniSpeech model)
- unispeech-sat → Wav2Vec2FeatureExtractor (UniSpeechSat model)
- van → ConvNextFeatureExtractor (VAN model)
- videomae → VideoMAEFeatureExtractor (VideoMAE model)
- vilt → ViltFeatureExtractor (ViLT model)
- vit → ViTFeatureExtractor (ViT model)
- vit_mae → ViTFeatureExtractor (ViTMAE model)
- vit_msn → ViTFeatureExtractor (ViTMSN model)
- wav2vec2 → Wav2Vec2FeatureExtractor (Wav2Vec2 model)
- wav2vec2-conformer → Wav2Vec2FeatureExtractor (Wav2Vec2-Conformer model)
- wavlm → Wav2Vec2FeatureExtractor (WavLM model)
- whisper → WhisperFeatureExtractor (Whisper model)
- xclip → CLIPFeatureExtractor (X-CLIP model)
- yolos → YolosFeatureExtractor (YOLOS model)
Passing token=True is required when you want to use a private model.
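For example, per the mapping above, loading a Whisper checkpoint should resolve to WhisperFeatureExtractor. A minimal sketch (the openai/whisper-tiny checkpoint is assumed here as a readily available Whisper repo):
>>> from transformers import AutoFeatureExtractor, WhisperFeatureExtractor

>>> # The concrete class is chosen from the model_type recorded in the checkpoint's config.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("openai/whisper-tiny")
>>> isinstance(feature_extractor, WhisperFeatureExtractor)
True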
Examples:
>>> from transformers import AutoFeatureExtractor
>>> # Download feature extractor from huggingface.co and cache.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
>>> # If feature extractor files are in a directory (e.g. feature extractor was saved using *save_pretrained('./test/saved_model/')*)
>>> # feature_extractor = AutoFeatureExtractor.from_pretrained("./test/saved_model/")
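The kwargs override and return_unused_kwargs behavior described in the parameters above can be exercised like this (a sketch reusing the checkpoint from the example above; do_normalize is a Wav2Vec2FeatureExtractor attribute, while foo is a deliberately unknown key):
>>> # Override a feature extractor attribute and collect the kwargs that were not consumed.
>>> feature_extractor, unused_kwargs = AutoFeatureExtractor.from_pretrained(
...     "facebook/wav2vec2-base-960h", do_normalize=False, foo=False, return_unused_kwargs=True
... )
>>> feature_extractor.do_normalize
False
>>> # unused_kwargs keeps the 'foo' entry, since it is not a feature extractor attribute.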
register
( config_class, feature_extractor_class, exist_ok = False )
Parameters
- config_class (PretrainedConfig) – The configuration corresponding to the model to register.
- feature_extractor_class (FeatureExtractorMixin) – The feature extractor to register.
Register a new feature extractor for this class.
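A minimal sketch of how register() might be used together with AutoConfig.register; NewConfig and NewFeatureExtractor are hypothetical placeholder classes, not part of the library:
>>> from transformers import AutoConfig, AutoFeatureExtractor, PretrainedConfig, SequenceFeatureExtractor

>>> # Hypothetical custom config and feature extractor classes.
>>> class NewConfig(PretrainedConfig):
...     model_type = "new-model"

>>> class NewFeatureExtractor(SequenceFeatureExtractor):
...     pass

>>> # Register the custom pair so the Auto classes can resolve them.
>>> AutoConfig.register("new-model", NewConfig)
>>> AutoFeatureExtractor.register(NewConfig, NewFeatureExtractor)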