Transformers

Optimizing Inference

  • Inference on CPU
  • Inference on one GPU
  • Inference on many GPUs
  • Inference on Specialized Hardware