ONNX

Exporting 🌍 Transformers models to ONNX

🌍 Transformers provides a transformers.onnx package that enables you to convert model checkpoints to an ONNX graph by leveraging configuration objects.

See the guide on exporting 🌍 Transformers models for more details.

ONNX Configurations

We provide three abstract classes that you should inherit from, depending on the type of model architecture you wish to export:

  • Encoder-based models inherit from OnnxConfig

  • Decoder-based models inherit from OnnxConfigWithPast

  • Encoder-decoder models inherit from OnnxSeq2SeqConfigWithPast

OnnxConfig

class transformers.onnx.OnnxConfig

( config: PretrainedConfig, task: str = 'default', patching_specs: typing.List[transformers.onnx.config.PatchingSpec] = None )

Base class for ONNX-exportable models. It describes the metadata needed to export a model through the ONNX format.

flatten_output_collection_property

( name: str, field: typing.Iterable[typing.Any] ) → (Dict[str, Any])

Returns

(Dict[str, Any])

Outputs with a flattened structure, with keys that map into this new structure.

Flatten any potential nested structure expanding the name of the field with the index of the element within the structure.
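The flattening described above can be illustrated with a small standalone sketch. It does not call the library; the helper name `flatten_collection` and the `<name>.<index>` key format are assumptions made for illustration.

```python
from itertools import chain
from typing import Any, Dict, Iterable


def flatten_collection(name: str, field: Iterable[Iterable[Any]]) -> Dict[str, Any]:
    """Flatten one level of nesting, keying each element as '<name>.<index>'.

    Hypothetical helper: the key format is an assumption, not taken from the docs.
    """
    return {f"{name}.{idx}": item for idx, item in enumerate(chain.from_iterable(field))}


# Two layers, each holding a (key, value) pair, become four flat entries.
flat = flatten_collection("past_key_values", [("k0", "v0"), ("k1", "v1")])
print(flat)
```

Each element of the nested structure ends up under its own key, which is the shape an exporter needs when every output must have a distinct name.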

from_model_config

( config: PretrainedConfig, task: str = 'default' )

Instantiate an OnnxConfig for a specific model.

generate_dummy_inputs

( preprocessor: typing.Union[ForwardRef('PreTrainedTokenizerBase'), ForwardRef('FeatureExtractionMixin'), ForwardRef('ImageProcessingMixin')], batch_size: int = -1, seq_length: int = -1, num_choices: int = -1, is_pair: bool = False, framework: typing.Optional[transformers.utils.generic.TensorType] = None, num_channels: int = 3, image_width: int = 40, image_height: int = 40, sampling_rate: int = 22050, time_duration: float = 5.0, frequency: int = 220, tokenizer: PreTrainedTokenizerBase = None )

Parameters

  • batch_size (int, optional, defaults to -1): The batch size to export the model for (-1 means dynamic axis).

  • num_choices (int, optional, defaults to -1): The number of candidate answers provided for a multiple choice task (-1 means dynamic axis).

  • seq_length (int, optional, defaults to -1): The sequence length to export the model for (-1 means dynamic axis).

  • is_pair (bool, optional, defaults to False): Indicates whether the input is a pair (sentence 1, sentence 2).

  • framework (TensorType, optional, defaults to None): The framework (PyTorch or TensorFlow) that the tokenizer will generate tensors for.

  • num_channels (int, optional, defaults to 3): The number of channels of the generated images.

  • image_width (int, optional, defaults to 40): The width of the generated images.

  • image_height (int, optional, defaults to 40): The height of the generated images.

  • sampling_rate (int, optional, defaults to 22050): The sampling rate for audio data generation.

  • time_duration (float, optional, defaults to 5.0): Total seconds of sampling for audio data generation.

  • frequency (int, optional, defaults to 220): The desired natural frequency of generated audio.

Generate inputs to provide to the ONNX exporter for the specific framework
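Tracing needs concrete tensor shapes even when an axis is marked dynamic. The standalone sketch below shows one way a value of -1 could be resolved to a fixed size when building dummy inputs. The helper `resolve_axis` and the fallback sizes (2 and 8) are illustrative assumptions, not values taken from this page.

```python
from typing import List


def resolve_axis(requested: int, fallback: int) -> int:
    """Return `requested` when it is a fixed size, else `fallback` for a dynamic axis (-1)."""
    return fallback if requested <= 0 else requested


# batch_size=-1 is dynamic, so a fallback is used; seq_length=16 is kept as given.
batch = resolve_axis(-1, fallback=2)
seq = resolve_axis(16, fallback=8)

# Placeholder token ids with the resolved shape (batch, seq).
dummy_input_ids: List[List[int]] = [[0] * seq for _ in range(batch)]
print(len(dummy_input_ids), len(dummy_input_ids[0]))
```

The fallback only affects the shape of the dummy tensors. Whether the exported graph accepts other sizes depends on the axis being declared dynamic at export time.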

generate_dummy_inputs_onnxruntime

( reference_model_inputs: typing.Mapping[str, typing.Any] ) → Mapping[str, Tensor]

Parameters

  • reference_model_inputs (Mapping[str, Tensor]): Reference inputs for the model.

Returns

Mapping[str, Tensor]

The mapping holding the kwargs to provide to the model's forward function

Generate inputs for ONNX Runtime using the reference model inputs. Override this to run inference with seq2seq models which have the encoder and decoder exported as separate ONNX files.

use_external_data_format

( num_parameters: int )

Flag indicating if the model requires using external data format
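A rough sketch of how such a flag could be computed from a parameter count is shown below. It assumes float32 weights (4 bytes each) and a 2 GiB size threshold, which corresponds to the Protocol Buffers message size limit. The names `needs_external_data`, `FLOAT32_BYTES`, and `SIZE_LIMIT_BYTES` are hypothetical.

```python
FLOAT32_BYTES = 4                 # assumption: weights are serialized as float32
SIZE_LIMIT_BYTES = 2 * 1024**3    # assumption: 2 GiB threshold


def needs_external_data(num_parameters: int) -> bool:
    """Return True when the serialized weights would reach the assumed size limit."""
    return num_parameters * FLOAT32_BYTES >= SIZE_LIMIT_BYTES


print(needs_external_data(110_000_000))    # roughly 0.44 GB of weights
print(needs_external_data(1_500_000_000))  # roughly 6 GB of weights
```

When the flag is set, the weights are stored in files next to the graph rather than inside it.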

OnnxConfigWithPast

class transformers.onnx.OnnxConfigWithPast

( config: PretrainedConfig, task: str = 'default', patching_specs: typing.List[transformers.onnx.config.PatchingSpec] = None, use_past: bool = False )

fill_with_past_key_values_

( inputs_or_outputs: typing.Mapping[str, typing.Mapping[int, str]], direction: str, inverted_values_shape: bool = False )

Fill the inputs_or_outputs mapping with the dynamic axes of past_key_values.
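The sketch below illustrates the general idea of adding per-layer key/value entries to a dynamic-axes mapping in place. It is standalone and does not reproduce the library's code: the function `add_past_axes`, its `num_layers` argument, the entry names, and the axis labels are all assumptions for illustration.

```python
from typing import Dict


def add_past_axes(
    axes: Dict[str, Dict[int, str]],
    direction: str,
    num_layers: int,
    inverted_values_shape: bool = False,
) -> None:
    """Add one key entry and one value entry per layer to `axes`, in place."""
    if direction not in ("inputs", "outputs"):
        raise ValueError(f"direction must be 'inputs' or 'outputs', got {direction!r}")

    # Assumed naming: cached states fed in vs. updated states returned.
    name = "past_key_values" if direction == "inputs" else "present"
    # Assumed layout: values carry the sequence on axis 1 when the shape is inverted.
    value_axis = 1 if inverted_values_shape else 2

    for i in range(num_layers):
        axes[f"{name}.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
        axes[f"{name}.{i}.value"] = {0: "batch", value_axis: "past_sequence + sequence"}


inputs = {"input_ids": {0: "batch", 1: "sequence"}}
add_past_axes(inputs, "inputs", num_layers=2)
print(sorted(inputs))
```

Mutating the mapping in place matches the trailing underscore in the method name, a common convention for in-place operations.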

with_past

( config: PretrainedConfig, task: str = 'default' )

Instantiate an OnnxConfig with the use_past attribute set to True.

OnnxSeq2SeqConfigWithPast

class transformers.onnx.OnnxSeq2SeqConfigWithPast

( config: PretrainedConfig, task: str = 'default', patching_specs: typing.List[transformers.onnx.config.PatchingSpec] = None, use_past: bool = False )

ONNX Features

Each ONNX configuration is associated with a set of features that enable you to export models for different types of topologies or tasks.

FeaturesManager

class transformers.onnx.FeaturesManager

( )

check_supported_model_or_raise

( model: typing.Union[ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel')], feature: str = 'default' )

Check whether or not the model has the requested features.

determine_framework

( model: str, framework: str = None )

Parameters

  • model (str): The name of the model to export.

  • framework (str, optional, defaults to None): The framework to use for the export. See below for the priority if none is provided.

Determines the framework to use for the export.

The priority is in the following order:

  1. User input via framework.

  2. If a local checkpoint is provided, use the same framework as the checkpoint.

  3. Available framework in the environment, with priority given to PyTorch.
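The priority order above can be sketched as a small standalone function. It does not call the library; `pick_framework` and its arguments are hypothetical. The identifier "pt" follows the default shown for get_model_class_for_feature, while "tf" for TensorFlow is an assumption.

```python
from typing import Optional, Sequence


def pick_framework(
    user_choice: Optional[str],
    checkpoint_framework: Optional[str],
    installed: Sequence[str],
) -> str:
    """Resolve the export framework following the three-step priority."""
    # 1. An explicit user choice always wins.
    if user_choice is not None:
        return user_choice
    # 2. Otherwise reuse the framework of a local checkpoint, if one was detected.
    if checkpoint_framework is not None:
        return checkpoint_framework
    # 3. Otherwise use what is installed, preferring PyTorch over TensorFlow.
    for candidate in ("pt", "tf"):
        if candidate in installed:
            return candidate
    raise EnvironmentError("Neither PyTorch nor TensorFlow is available for the export.")


print(pick_framework(None, None, installed=["tf", "pt"]))
print(pick_framework(None, "tf", installed=["pt"]))
```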

get_config

( model_type: str, feature: str ) → OnnxConfig

Parameters

  • model_type (str): The model type to retrieve the config for.

  • feature (str): The feature to retrieve the config for.

Returns

OnnxConfig

config for the combination

Gets the OnnxConfig for a model_type and feature combination.

get_model_class_for_feature

( feature: str, framework: str = 'pt' )

Parameters

  • feature (str): The feature required.

  • framework (str, optional, defaults to "pt"): The framework to use for the export.

Attempts to retrieve an AutoModel class from a feature name.

get_model_from_feature

( feature: str, model: str, framework: str = None, cache_dir: str = None )

Parameters

  • feature (str): The feature required.

  • model (str): The name of the model to export.

  • framework (str, optional, defaults to None): The framework to use for the export. See FeaturesManager.determine_framework for the priority should none be provided.

Attempts to retrieve a model from a model's name and the feature to be enabled.

get_supported_features_for_model_type

( model_type: str, model_name: typing.Optional[str] = None )

Parameters

  • model_type (str): The model type to retrieve the supported features for.

  • model_name (str, optional): The name attribute of the model object, only used for the exception message.

Tries to retrieve the feature -> OnnxConfig constructor map from the model type.
