
Export a model to TFLite

Export a model to TFLite with optimum.exporters.tflite

Summary

Exporting a model to TFLite is as simple as:

optimum-cli export tflite --model bert-base-uncased --sequence_length 128 bert_tflite/

Check out the help for more options:

optimum-cli export tflite --help

Exporting a model to TFLite using the CLI

To export a 🌍 Transformers model to TFLite, you’ll first need to install some extra dependencies:

pip install optimum[exporters-tf]

The Optimum TFLite export can be used through the Optimum command-line interface. As only static input shapes are supported for now, they need to be specified during the export.

optimum-cli export tflite --help

usage: optimum-cli <command> [<args>] export tflite [-h] -m MODEL [--task TASK] [--atol ATOL] [--pad_token_id PAD_TOKEN_ID] [--cache_dir CACHE_DIR]
                                                    [--trust-remote-code] [--batch_size BATCH_SIZE] [--sequence_length SEQUENCE_LENGTH]
                                                    [--num_choices NUM_CHOICES] [--width WIDTH] [--height HEIGHT] [--num_channels NUM_CHANNELS]
                                                    [--feature_size FEATURE_SIZE] [--nb_max_frames NB_MAX_FRAMES]
                                                    [--audio_sequence_length AUDIO_SEQUENCE_LENGTH]
                                                    output

optional arguments:
  -h, --help            show this help message and exit

Required arguments:
  -m MODEL, --model MODEL
                        Model ID on boincai.com or path on disk to load model from.
  output                Path indicating the directory where to store generated TFLite model.

Optional arguments:
  --task TASK           The task to export the model for. If not specified, the task will be auto-inferred based on the model. Available tasks depend on
                        the model, but are among: ['default', 'fill-mask', 'text-generation', 'text2text-generation', 'text-classification',
                        'token-classification', 'multiple-choice', 'object-detection', 'question-answering', 'image-classification', 'image-segmentation',
                        'masked-im', 'semantic-segmentation', 'automatic-speech-recognition', 'audio-classification', 'audio-frame-classification',
                        'audio-xvector', 'vision2seq-lm', 'stable-diffusion', 'zero-shot-object-detection']. For decoder models, use `xxx-with-past` to
                        export the model using past key values in the decoder.
  --atol ATOL           If specified, the absolute difference tolerance when validating the model. Otherwise, the default atol for the model will be used.
  --pad_token_id PAD_TOKEN_ID
                        This is needed by some models, for some tasks. If not provided, will attempt to use the tokenizer to guess it.
  --cache_dir CACHE_DIR
                        Path indicating where to store cache.
  --trust-remote-code   Allow to use custom code for the modeling hosted in the model repository. This option should only be set for repositories you trust
                        and in which you have read the code, as it will execute on your local machine arbitrary code present in the model repository.

Input shapes:
  --batch_size BATCH_SIZE
                        Batch size that the TFLite exported model will be able to take as input.
  --sequence_length SEQUENCE_LENGTH
                        Sequence length that the TFLite exported model will be able to take as input.
  --num_choices NUM_CHOICES
                        Only for the multiple-choice task. Num choices that the TFLite exported model will be able to take as input.
  --width WIDTH         Vision tasks only. Image width that the TFLite exported model will be able to take as input.
  --height HEIGHT       Vision tasks only. Image height that the TFLite exported model will be able to take as input.
  --num_channels NUM_CHANNELS
                        Vision tasks only. Number of channels used to represent the image that the TFLite exported model will be able to take as input.
                        (GREY = 1, RGB = 3, ARGB = 4)
  --feature_size FEATURE_SIZE
                        Audio tasks only. Feature dimension of the extracted features by the feature extractor that the TFLite exported model will be able
                        to take as input.
  --nb_max_frames NB_MAX_FRAMES
                        Audio tasks only. Maximum number of frames that the TFLite exported model will be able to take as input.
  --audio_sequence_length AUDIO_SEQUENCE_LENGTH
                        Audio tasks only. Audio sequence length that the TFLite exported model will be able to take as input.
