AutoModel


class transformers.AutoModel( *args, **kwargs )

This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).
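For instance, calling the constructor directly fails immediately (a minimal sketch; the exact error message may differ across library versions):

>>> from transformers import AutoModel
>>> AutoModel()  # raises EnvironmentError; use AutoModel.from_pretrained(...) or AutoModel.from_config(...) instead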

from_config( **kwargs )

Parameters

  • config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:

    • ASTConfig configuration class: ASTModel (Audio Spectrogram Transformer model)
    • AlbertConfig configuration class: AlbertModel (ALBERT model)
    • AlignConfig configuration class: AlignModel (ALIGN model)
    • AltCLIPConfig configuration class: AltCLIPModel (AltCLIP model)
    • AutoformerConfig configuration class: AutoformerModel (Autoformer model)
    • BarkConfig configuration class: BarkModel (Bark model)
    • BartConfig configuration class: BartModel (BART model)
    • BeitConfig configuration class: BeitModel (BEiT model)
    • BertConfig configuration class: BertModel (BERT model)
    • BertGenerationConfig configuration class: BertGenerationEncoder (Bert Generation model)
    • BigBirdConfig configuration class: BigBirdModel (BigBird model)
    • BigBirdPegasusConfig configuration class: BigBirdPegasusModel (BigBird-Pegasus model)
    • BioGptConfig configuration class: BioGptModel (BioGpt model)
    • BitConfig configuration class: BitModel (BiT model)
    • BlenderbotConfig configuration class: BlenderbotModel (Blenderbot model)
    • BlenderbotSmallConfig configuration class: BlenderbotSmallModel (BlenderbotSmall model)
    • Blip2Config configuration class: Blip2Model (BLIP-2 model)
    • BlipConfig configuration class: BlipModel (BLIP model)
    • BloomConfig configuration class: BloomModel (BLOOM model)
    • BridgeTowerConfig configuration class: BridgeTowerModel (BridgeTower model)
    • BrosConfig configuration class: BrosModel (BROS model)
    • CLIPConfig configuration class: CLIPModel (CLIP model)
    • CLIPSegConfig configuration class: CLIPSegModel (CLIPSeg model)
    • CTRLConfig configuration class: CTRLModel (CTRL model)
    • CamembertConfig configuration class: CamembertModel (CamemBERT model)
    • CanineConfig configuration class: CanineModel (CANINE model)
    • ChineseCLIPConfig configuration class: ChineseCLIPModel (Chinese-CLIP model)
    • ClapConfig configuration class: ClapModel (CLAP model)
    • CodeGenConfig configuration class: CodeGenModel (CodeGen model)
    • ConditionalDetrConfig configuration class: ConditionalDetrModel (Conditional DETR model)
    • ConvBertConfig configuration class: ConvBertModel (ConvBERT model)
    • ConvNextConfig configuration class: ConvNextModel (ConvNeXT model)
    • ConvNextV2Config configuration class: ConvNextV2Model (ConvNeXTV2 model)
    • CpmAntConfig configuration class: CpmAntModel (CPM-Ant model)
    • CvtConfig configuration class: CvtModel (CvT model)
    • DPRConfig configuration class: DPRQuestionEncoder (DPR model)
    • DPTConfig configuration class: DPTModel (DPT model)
    • Data2VecAudioConfig configuration class: Data2VecAudioModel (Data2VecAudio model)
    • Data2VecTextConfig configuration class: Data2VecTextModel (Data2VecText model)
    • Data2VecVisionConfig configuration class: Data2VecVisionModel (Data2VecVision model)
    • DebertaConfig configuration class: DebertaModel (DeBERTa model)
    • DebertaV2Config configuration class: DebertaV2Model (DeBERTa-v2 model)
    • DecisionTransformerConfig configuration class: DecisionTransformerModel (Decision Transformer model)
    • DeformableDetrConfig configuration class: DeformableDetrModel (Deformable DETR model)
    • DeiTConfig configuration class: DeiTModel (DeiT model)
    • DetaConfig configuration class: DetaModel (DETA model)
    • DetrConfig configuration class: DetrModel (DETR model)
    • DinatConfig configuration class: DinatModel (DiNAT model)
    • Dinov2Config configuration class: Dinov2Model (DINOv2 model)
    • DistilBertConfig configuration class: DistilBertModel (DistilBERT model)
    • DonutSwinConfig configuration class: DonutSwinModel (DonutSwin model)
    • EfficientFormerConfig configuration class: EfficientFormerModel (EfficientFormer model)
    • EfficientNetConfig configuration class: EfficientNetModel (EfficientNet model)
    • ElectraConfig configuration class: ElectraModel (ELECTRA model)
    • EncodecConfig configuration class: EncodecModel (EnCodec model)
    • ErnieConfig configuration class: ErnieModel (ERNIE model)
    • ErnieMConfig configuration class: ErnieMModel (ErnieM model)
    • EsmConfig configuration class: EsmModel (ESM model)
    • FNetConfig configuration class: FNetModel (FNet model)
    • FSMTConfig configuration class: FSMTModel (FairSeq Machine-Translation model)
    • FalconConfig configuration class: FalconModel (Falcon model)
    • FlaubertConfig configuration class: FlaubertModel (FlauBERT model)
    • FlavaConfig configuration class: FlavaModel (FLAVA model)
    • FocalNetConfig configuration class: FocalNetModel (FocalNet model)
    • FunnelConfig configuration class: FunnelModel or FunnelBaseModel (Funnel Transformer model)
    • GLPNConfig configuration class: GLPNModel (GLPN model)
    • GPT2Config configuration class: GPT2Model (OpenAI GPT-2 model)
    • GPTBigCodeConfig configuration class: GPTBigCodeModel (GPTBigCode model)
    • GPTJConfig configuration class: GPTJModel (GPT-J model)
    • GPTNeoConfig configuration class: GPTNeoModel (GPT Neo model)
    • GPTNeoXConfig configuration class: GPTNeoXModel (GPT NeoX model)
    • GPTNeoXJapaneseConfig configuration class: GPTNeoXJapaneseModel (GPT NeoX Japanese model)
    • GPTSanJapaneseConfig configuration class: GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
    • GitConfig configuration class: GitModel (GIT model)
    • GraphormerConfig configuration class: GraphormerModel (Graphormer model)
    • GroupViTConfig configuration class: GroupViTModel (GroupViT model)
    • HubertConfig configuration class: HubertModel (Hubert model)
    • IBertConfig configuration class: IBertModel (I-BERT model)
    • IdeficsConfig configuration class: IdeficsModel (IDEFICS model)
    • ImageGPTConfig configuration class: ImageGPTModel (ImageGPT model)
    • InformerConfig configuration class: InformerModel (Informer model)
    • JukeboxConfig configuration class: JukeboxModel (Jukebox model)
    • LEDConfig configuration class: LEDModel (LED model)
    • LayoutLMConfig configuration class: LayoutLMModel (LayoutLM model)
    • LayoutLMv2Config configuration class: LayoutLMv2Model (LayoutLMv2 model)
    • LayoutLMv3Config configuration class: LayoutLMv3Model (LayoutLMv3 model)
    • LevitConfig configuration class: LevitModel (LeViT model)
    • LiltConfig configuration class: LiltModel (LiLT model)
    • LlamaConfig configuration class: LlamaModel (LLaMA model)
    • LongT5Config configuration class: LongT5Model (LongT5 model)
    • LongformerConfig configuration class: LongformerModel (Longformer model)
    • LukeConfig configuration class: LukeModel (LUKE model)
    • LxmertConfig configuration class: LxmertModel (LXMERT model)
    • M2M100Config configuration class: M2M100Model (M2M100 model)
    • MBartConfig configuration class: MBartModel (mBART model)
    • MCTCTConfig configuration class: MCTCTModel (M-CTC-T model)
    • MPNetConfig configuration class: MPNetModel (MPNet model)
    • MT5Config configuration class: MT5Model (MT5 model)
    • MarianConfig configuration class: MarianModel (Marian model)
    • MarkupLMConfig configuration class: MarkupLMModel (MarkupLM model)
    • Mask2FormerConfig configuration class: Mask2FormerModel (Mask2Former model)
    • MaskFormerConfig configuration class: MaskFormerModel (MaskFormer model)
    • MaskFormerSwinConfig configuration class: MaskFormerSwinModel (MaskFormerSwin model)
    • MegaConfig configuration class: MegaModel (MEGA model)
    • MegatronBertConfig configuration class: MegatronBertModel (Megatron-BERT model)
    • MgpstrConfig configuration class: MgpstrForSceneTextRecognition (MGP-STR model)
    • MistralConfig configuration class: MistralModel (Mistral model)
    • MobileBertConfig configuration class: MobileBertModel (MobileBERT model)
    • MobileNetV1Config configuration class: MobileNetV1Model (MobileNetV1 model)
    • MobileNetV2Config configuration class: MobileNetV2Model (MobileNetV2 model)
    • MobileViTConfig configuration class: MobileViTModel (MobileViT model)
    • MobileViTV2Config configuration class: MobileViTV2Model (MobileViTV2 model)
    • MptConfig configuration class: MptModel (MPT model)
    • MraConfig configuration class: MraModel (MRA model)
    • MvpConfig configuration class: MvpModel (MVP model)
    • NatConfig configuration class: NatModel (NAT model)
    • NezhaConfig configuration class: NezhaModel (Nezha model)
    • NllbMoeConfig configuration class: NllbMoeModel (NLLB-MOE model)
    • NystromformerConfig configuration class: NystromformerModel (Nyströmformer model)
    • OPTConfig configuration class: OPTModel (OPT model)
    • OneFormerConfig configuration class: OneFormerModel (OneFormer model)
    • OpenAIGPTConfig configuration class: OpenAIGPTModel (OpenAI GPT model)
    • OpenLlamaConfig configuration class: OpenLlamaModel (OpenLlama model)
    • OwlViTConfig configuration class: OwlViTModel (OWL-ViT model)
    • PLBartConfig configuration class: PLBartModel (PLBart model)
    • PegasusConfig configuration class: PegasusModel (Pegasus model)
    • PegasusXConfig configuration class: PegasusXModel (PEGASUS-X model)
    • PerceiverConfig configuration class: PerceiverModel (Perceiver model)
    • PersimmonConfig configuration class: PersimmonModel (Persimmon model)
    • PoolFormerConfig configuration class: PoolFormerModel (PoolFormer model)
    • ProphetNetConfig configuration class: ProphetNetModel (ProphetNet model)
    • PvtConfig configuration class: PvtModel (PVT model)
    • QDQBertConfig configuration class: QDQBertModel (QDQBert model)
    • ReformerConfig configuration class: ReformerModel (Reformer model)
    • RegNetConfig configuration class: RegNetModel (RegNet model)
    • RemBertConfig configuration class: RemBertModel (RemBERT model)
    • ResNetConfig configuration class: ResNetModel (ResNet model)
    • RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
    • RoCBertConfig configuration class: RoCBertModel (RoCBert model)
    • RoFormerConfig configuration class: RoFormerModel (RoFormer model)
    • RobertaConfig configuration class: RobertaModel (RoBERTa model)
    • RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
    • RwkvConfig configuration class: RwkvModel (RWKV model)
    • SEWConfig configuration class: SEWModel (SEW model)
    • SEWDConfig configuration class: SEWDModel (SEW-D model)
    • SamConfig configuration class: SamModel (SAM model)
    • SegformerConfig configuration class: SegformerModel (SegFormer model)
    • Speech2TextConfig configuration class: Speech2TextModel (Speech2Text model)
    • SpeechT5Config configuration class: SpeechT5Model (SpeechT5 model)
    • SplinterConfig configuration class: SplinterModel (Splinter model)
    • SqueezeBertConfig configuration class: SqueezeBertModel (SqueezeBERT model)
    • SwiftFormerConfig configuration class: SwiftFormerModel (SwiftFormer model)
    • Swin2SRConfig configuration class: Swin2SRModel (Swin2SR model)
    • SwinConfig configuration class: SwinModel (Swin Transformer model)
    • Swinv2Config configuration class: Swinv2Model (Swin Transformer V2 model)
    • SwitchTransformersConfig configuration class: SwitchTransformersModel (SwitchTransformers model)
    • T5Config configuration class: T5Model (T5 model)
    • TableTransformerConfig configuration class: TableTransformerModel (Table Transformer model)
    • TapasConfig configuration class: TapasModel (TAPAS model)
    • TimeSeriesTransformerConfig configuration class: TimeSeriesTransformerModel (Time Series Transformer model)
    • TimesformerConfig configuration class: TimesformerModel (TimeSformer model)
    • TimmBackboneConfig configuration class: TimmBackbone (TimmBackbone model)
    • TrajectoryTransformerConfig configuration class: TrajectoryTransformerModel (Trajectory Transformer model)
    • TransfoXLConfig configuration class: TransfoXLModel (Transformer-XL model)
    • TvltConfig configuration class: TvltModel (TVLT model)
    • UMT5Config configuration class: UMT5Model (UMT5 model)
    • UniSpeechConfig configuration class: UniSpeechModel (UniSpeech model)
    • UniSpeechSatConfig configuration class: UniSpeechSatModel (UniSpeechSat model)
    • VanConfig configuration class: VanModel (VAN model)
    • ViTConfig configuration class: ViTModel (ViT model)
    • ViTHybridConfig configuration class: ViTHybridModel (ViT Hybrid model)
    • ViTMAEConfig configuration class: ViTMAEModel (ViTMAE model)
    • ViTMSNConfig configuration class: ViTMSNModel (ViTMSN model)
    • VideoMAEConfig configuration class: VideoMAEModel (VideoMAE model)
    • ViltConfig configuration class: ViltModel (ViLT model)
    • VisionTextDualEncoderConfig configuration class: VisionTextDualEncoderModel (VisionTextDualEncoder model)
    • VisualBertConfig configuration class: VisualBertModel (VisualBERT model)
    • VitDetConfig configuration class: VitDetModel (VitDet model)
    • VitsConfig configuration class: VitsModel (VITS model)
    • VivitConfig configuration class: VivitModel (ViViT model)
    • Wav2Vec2Config configuration class: Wav2Vec2Model (Wav2Vec2 model)
    • Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerModel (Wav2Vec2-Conformer model)
    • WavLMConfig configuration class: WavLMModel (WavLM model)
    • WhisperConfig configuration class: WhisperModel (Whisper model)
    • XCLIPConfig configuration class: XCLIPModel (X-CLIP model)
    • XGLMConfig configuration class: XGLMModel (XGLM model)
    • XLMConfig configuration class: XLMModel (XLM model)
    • XLMProphetNetConfig configuration class: XLMProphetNetModel (XLM-ProphetNet model)
    • XLMRobertaConfig configuration class: XLMRobertaModel (XLM-RoBERTa model)
    • XLMRobertaXLConfig configuration class: XLMRobertaXLModel (XLM-RoBERTa-XL model)
    • XLNetConfig configuration class: XLNetModel (XLNet model)
    • XmodConfig configuration class: XmodModel (X-MOD model)
    • YolosConfig configuration class: YolosModel (YOLOS model)
    • YosoConfig configuration class: YosoModel (YOSO model)

Instantiates one of the base model classes of the library from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples:


>>> from transformers import AutoConfig, AutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("bert-base-cased")
>>> model = AutoModel.from_config(config)
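The configuration can also be customized before the model is built from it; for example (a minimal sketch relying on standard AutoConfig attribute overrides, producing randomly initialized weights as noted above):

>>> # Override a configuration attribute at load time, then build an untrained model from it.
>>> config = AutoConfig.from_pretrained("bert-base-cased", num_hidden_layers=6)
>>> model = AutoModel.from_config(config)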

from_pretrained( *model_args, **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.

    • A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.

    • A path or url to a tensorflow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.

  • config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • The model is a model provided by the library (loaded with the model id string of a pretrained model).

    • The model was saved using save_pretrained() and is reloaded by supplying the save directory.

    • The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

  • from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).

  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.

  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

  • local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).

  • revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

  • code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

  • kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

    • If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).

    • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
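For instance, several of these arguments can be combined in one call (a minimal sketch; the exact keys in the loading report may vary across library versions):

>>> model, loading_info = AutoModel.from_pretrained(
...     "bert-base-cased",
...     revision="main",           # branch, tag, or commit id on the Hub
...     output_loading_info=True,  # also return the missing/unexpected key report
... )
>>> sorted(loading_info)
['error_msgs', 'mismatched_keys', 'missing_keys', 'unexpected_keys']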

Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

  • albert — AlbertModel (ALBERT model)
  • align — AlignModel (ALIGN model)
  • altclip — AltCLIPModel (AltCLIP model)
  • audio-spectrogram-transformer — ASTModel (Audio Spectrogram Transformer model)
  • autoformer — AutoformerModel (Autoformer model)
  • bark — BarkModel (Bark model)
  • bart — BartModel (BART model)
  • beit — BeitModel (BEiT model)
  • bert — BertModel (BERT model)
  • bert-generation — BertGenerationEncoder (Bert Generation model)
  • big_bird — BigBirdModel (BigBird model)
  • bigbird_pegasus — BigBirdPegasusModel (BigBird-Pegasus model)
  • biogpt — BioGptModel (BioGpt model)
  • bit — BitModel (BiT model)
  • blenderbot — BlenderbotModel (Blenderbot model)
  • blenderbot-small — BlenderbotSmallModel (BlenderbotSmall model)
  • blip — BlipModel (BLIP model)
  • blip-2 — Blip2Model (BLIP-2 model)
  • bloom — BloomModel (BLOOM model)
  • bridgetower — BridgeTowerModel (BridgeTower model)
  • bros — BrosModel (BROS model)
  • camembert — CamembertModel (CamemBERT model)
  • canine — CanineModel (CANINE model)
  • chinese_clip — ChineseCLIPModel (Chinese-CLIP model)
  • clap — ClapModel (CLAP model)
  • clip — CLIPModel (CLIP model)
  • clipseg — CLIPSegModel (CLIPSeg model)
  • code_llama — LlamaModel (CodeLlama model)
  • codegen — CodeGenModel (CodeGen model)
  • conditional_detr — ConditionalDetrModel (Conditional DETR model)
  • convbert — ConvBertModel (ConvBERT model)
  • convnext — ConvNextModel (ConvNeXT model)
  • convnextv2 — ConvNextV2Model (ConvNeXTV2 model)
  • cpmant — CpmAntModel (CPM-Ant model)
  • ctrl — CTRLModel (CTRL model)
  • cvt — CvtModel (CvT model)
  • data2vec-audio — Data2VecAudioModel (Data2VecAudio model)
  • data2vec-text — Data2VecTextModel (Data2VecText model)
  • data2vec-vision — Data2VecVisionModel (Data2VecVision model)
  • deberta — DebertaModel (DeBERTa model)
  • deberta-v2 — DebertaV2Model (DeBERTa-v2 model)
  • decision_transformer — DecisionTransformerModel (Decision Transformer model)
  • deformable_detr — DeformableDetrModel (Deformable DETR model)
  • deit — DeiTModel (DeiT model)
  • deta — DetaModel (DETA model)
  • detr — DetrModel (DETR model)
  • dinat — DinatModel (DiNAT model)
  • dinov2 — Dinov2Model (DINOv2 model)
  • distilbert — DistilBertModel (DistilBERT model)
  • donut-swin — DonutSwinModel (DonutSwin model)
  • dpr — DPRQuestionEncoder (DPR model)
  • dpt — DPTModel (DPT model)
  • efficientformer — EfficientFormerModel (EfficientFormer model)
  • efficientnet — EfficientNetModel (EfficientNet model)
  • electra — ElectraModel (ELECTRA model)
  • encodec — EncodecModel (EnCodec model)
  • ernie — ErnieModel (ERNIE model)
  • ernie_m — ErnieMModel (ErnieM model)
  • esm — EsmModel (ESM model)
  • falcon — FalconModel (Falcon model)
  • flaubert — FlaubertModel (FlauBERT model)
  • flava — FlavaModel (FLAVA model)
  • fnet — FNetModel (FNet model)
  • focalnet — FocalNetModel (FocalNet model)
  • fsmt — FSMTModel (FairSeq Machine-Translation model)
  • funnel — FunnelModel or FunnelBaseModel (Funnel Transformer model)
  • git — GitModel (GIT model)
  • glpn — GLPNModel (GLPN model)
  • gpt-sw3 — GPT2Model (GPT-Sw3 model)
  • gpt2 — GPT2Model (OpenAI GPT-2 model)
  • gpt_bigcode — GPTBigCodeModel (GPTBigCode model)
  • gpt_neo — GPTNeoModel (GPT Neo model)
  • gpt_neox — GPTNeoXModel (GPT NeoX model)
  • gpt_neox_japanese — GPTNeoXJapaneseModel (GPT NeoX Japanese model)
  • gptj — GPTJModel (GPT-J model)
  • gptsan-japanese — GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
  • graphormer — GraphormerModel (Graphormer model)
  • groupvit — GroupViTModel (GroupViT model)
  • hubert — HubertModel (Hubert model)
  • ibert — IBertModel (I-BERT model)
  • idefics — IdeficsModel (IDEFICS model)
  • imagegpt — ImageGPTModel (ImageGPT model)
  • informer — InformerModel (Informer model)
  • jukebox — JukeboxModel (Jukebox model)
  • layoutlm — LayoutLMModel (LayoutLM model)
  • layoutlmv2 — LayoutLMv2Model (LayoutLMv2 model)
  • layoutlmv3 — LayoutLMv3Model (LayoutLMv3 model)
  • led — LEDModel (LED model)
  • levit — LevitModel (LeViT model)
  • lilt — LiltModel (LiLT model)
  • llama — LlamaModel (LLaMA model)
  • longformer — LongformerModel (Longformer model)
  • longt5 — LongT5Model (LongT5 model)
  • luke — LukeModel (LUKE model)
  • lxmert — LxmertModel (LXMERT model)
  • m2m_100 — M2M100Model (M2M100 model)
  • marian — MarianModel (Marian model)
  • markuplm — MarkupLMModel (MarkupLM model)
  • mask2former — Mask2FormerModel (Mask2Former model)
  • maskformer — MaskFormerModel (MaskFormer model)
  • maskformer-swin — MaskFormerSwinModel (MaskFormerSwin model)
  • mbart — MBartModel (mBART model)
  • mctct — MCTCTModel (M-CTC-T model)
  • mega — MegaModel (MEGA model)
  • megatron-bert — MegatronBertModel (Megatron-BERT model)
  • mgp-str — MgpstrForSceneTextRecognition (MGP-STR model)
  • mistral — MistralModel (Mistral model)
  • mobilebert — MobileBertModel (MobileBERT model)
  • mobilenet_v1 — MobileNetV1Model (MobileNetV1 model)
  • mobilenet_v2 — MobileNetV2Model (MobileNetV2 model)
  • mobilevit — MobileViTModel (MobileViT model)
  • mobilevitv2 — MobileViTV2Model (MobileViTV2 model)
  • mpnet — MPNetModel (MPNet model)
  • mpt — MptModel (MPT model)
  • mra — MraModel (MRA model)
  • mt5 — MT5Model (MT5 model)
  • mvp — MvpModel (MVP model)
  • nat — NatModel (NAT model)
  • nezha — NezhaModel (Nezha model)
  • nllb-moe — NllbMoeModel (NLLB-MOE model)
  • nystromformer — NystromformerModel (Nyströmformer model)
  • oneformer — OneFormerModel (OneFormer model)
  • open-llama — OpenLlamaModel (OpenLlama model)
  • openai-gpt — OpenAIGPTModel (OpenAI GPT model)
  • opt — OPTModel (OPT model)
  • owlvit — OwlViTModel (OWL-ViT model)
  • pegasus — PegasusModel (Pegasus model)
  • pegasus_x — PegasusXModel (PEGASUS-X model)
  • perceiver — PerceiverModel (Perceiver model)
  • persimmon — PersimmonModel (Persimmon model)
  • plbart — PLBartModel (PLBart model)
  • poolformer — PoolFormerModel (PoolFormer model)
  • prophetnet — ProphetNetModel (ProphetNet model)
  • pvt — PvtModel (PVT model)
  • qdqbert — QDQBertModel (QDQBert model)
  • reformer — ReformerModel (Reformer model)
  • regnet — RegNetModel (RegNet model)
  • rembert — RemBertModel (RemBERT model)
  • resnet — ResNetModel (ResNet model)
  • retribert — RetriBertModel (RetriBERT model)
  • roberta — RobertaModel (RoBERTa model)
  • roberta-prelayernorm — RobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
  • roc_bert — RoCBertModel (RoCBert model)
  • roformer — RoFormerModel (RoFormer model)
  • rwkv — RwkvModel (RWKV model)
  • sam — SamModel (SAM model)
  • segformer — SegformerModel (SegFormer model)
  • sew — SEWModel (SEW model)
  • sew-d — SEWDModel (SEW-D model)
  • speech_to_text — Speech2TextModel (Speech2Text model)
  • speecht5 — SpeechT5Model (SpeechT5 model)
  • splinter — SplinterModel (Splinter model)
  • squeezebert — SqueezeBertModel (SqueezeBERT model)
  • swiftformer — SwiftFormerModel (SwiftFormer model)
  • swin — SwinModel (Swin Transformer model)
  • swin2sr — Swin2SRModel (Swin2SR model)
  • swinv2 — Swinv2Model (Swin Transformer V2 model)
  • switch_transformers — SwitchTransformersModel (SwitchTransformers model)
  • t5 — T5Model (T5 model)
  • table-transformer — TableTransformerModel (Table Transformer model)
  • tapas — TapasModel (TAPAS model)
  • time_series_transformer — TimeSeriesTransformerModel (Time Series Transformer model)
  • timesformer — TimesformerModel (TimeSformer model)
  • timm_backbone — TimmBackbone (TimmBackbone model)
  • trajectory_transformer — TrajectoryTransformerModel (Trajectory Transformer model)
  • transfo-xl — TransfoXLModel (Transformer-XL model)
  • tvlt — TvltModel (TVLT model)
  • umt5 — UMT5Model (UMT5 model)
  • unispeech — UniSpeechModel (UniSpeech model)
  • unispeech-sat — UniSpeechSatModel (UniSpeechSat model)
  • van — VanModel (VAN model)
  • videomae — VideoMAEModel (VideoMAE model)
  • vilt — ViltModel (ViLT model)
  • vision-text-dual-encoder — VisionTextDualEncoderModel (VisionTextDualEncoder model)
  • visual_bert — VisualBertModel (VisualBERT model)
  • vit — ViTModel (ViT model)
  • vit_hybrid — ViTHybridModel (ViT Hybrid model)
  • vit_mae — ViTMAEModel (ViTMAE model)
  • vit_msn — ViTMSNModel (ViTMSN model)
  • vitdet — VitDetModel (VitDet model)
  • vits — VitsModel (VITS model)
  • vivit — VivitModel (ViViT model)
  • wav2vec2 — Wav2Vec2Model (Wav2Vec2 model)
  • wav2vec2-conformer — Wav2Vec2ConformerModel (Wav2Vec2-Conformer model)
  • wavlm — WavLMModel (WavLM model)
  • whisper — WhisperModel (Whisper model)
  • xclip — XCLIPModel (X-CLIP model)
  • xglm — XGLMModel (XGLM model)
  • xlm — XLMModel (XLM model)
  • xlm-prophetnet — XLMProphetNetModel (XLM-ProphetNet model)
  • xlm-roberta — XLMRobertaModel (XLM-RoBERTa model)
  • xlm-roberta-xl — XLMRobertaXLModel (XLM-RoBERTa-XL model)
  • xlnet — XLNetModel (XLNet model)
  • xmod — XmodModel (X-MOD model)
  • yolos — YolosModel (YOLOS model)
  • yoso — YosoModel (YOSO model)
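In practice this means the same AutoModel call dispatches to a different architecture depending on the checkpoint, as in this small illustration (using two public Hub checkpoints):

>>> type(AutoModel.from_pretrained("bert-base-cased")).__name__
'BertModel'
>>> type(AutoModel.from_pretrained("gpt2")).__name__
'GPT2Model'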

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
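A quick way to check and toggle this (a minimal sketch; training is a standard torch.nn.Module attribute):

>>> model = AutoModel.from_pretrained("bert-base-cased")
>>> model.training  # evaluation mode: dropout and similar layers are disabled
False
>>> model = model.train()  # switch back to training mode before fine-tuning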

Examples:


>>> from transformers import AutoConfig, AutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained("bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModel.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
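A loaded model can then be used directly for feature extraction together with its matching AutoTokenizer (a minimal sketch; the hidden size in the comment is specific to bert-base-cased):

>>> from transformers import AutoTokenizer, AutoModel
>>> import torch

>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
>>> model = AutoModel.from_pretrained("bert-base-cased")
>>> inputs = tokenizer("Hello world!", return_tensors="pt")
>>> with torch.no_grad():
...     outputs = model(**inputs)
>>> outputs.last_hidden_state.shape  # (batch_size, sequence_length, hidden_size=768)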


