Transformers

🌍OPTIMIZING INFERENCE

  • Inference on CPU
  • Inference on one GPU
  • Inference on many GPUs
  • Inference on Specialized Hardware
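One inference optimization the docs single out is `torch.compile()`. Below is a minimal sketch of how it is applied; the single `Linear` layer is a hypothetical stand-in for a real Transformers model so the snippet runs without downloading any weights, and `backend="eager"` is chosen only so it runs without a C++ toolchain (in practice you would usually keep the default backend).

```python
import torch

# Hypothetical stand-in module: a single Linear layer plays the role of a
# Transformers model so the sketch runs self-contained.
model = torch.nn.Linear(16, 4)
model.eval()

# torch.compile wraps the module and traces/compiles its forward pass on the
# first call; later calls reuse the compiled graph.
compiled = torch.compile(model, backend="eager")

with torch.no_grad():
    out = compiled(torch.randn(1, 16))

print(out.shape)  # torch.Size([1, 4])
```

The same pattern applies to a model loaded with `transformers` — compile once, then run inference as usual; the dedicated guide in this section covers backend and mode choices in detail.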