Gaudi Configuration


To make the most of Gaudi, it is advised to rely on advanced features such as Habana Mixed Precision or Habana's custom fused operators. You can specify which features to use in a Gaudi configuration, which takes the form of a JSON file following this template:


{
  "use_habana_mixed_precision": true/false,
  "hmp_is_verbose": true/false,
  "use_fused_adam": true/false,
  "use_fused_clip_norm": true/false,
  "hmp_bf16_ops": [
    "torch operator to compute in bf16",
    "..."
  ],
  "hmp_fp32_ops": [
    "torch operator to compute in fp32",
    "..."
  ]
}

Here is a description of each configuration parameter:

  • use_habana_mixed_precision specifies whether or not Habana Mixed Precision (HMP) should be used. HMP allows mixing fp32 and bf16 operations. You can find more information here.

  • hmp_is_verbose specifies whether to log the precision decision made for each operation, for debugging purposes. It is disabled by default. You can find an example of such a log here.

  • use_fused_adam specifies whether to use the custom fused implementation of the ADAM optimizer provided by Habana.

  • use_fused_clip_norm specifies whether to use the custom fused implementation of gradient norm clipping provided by Habana.

  • hmp_bf16_ops specifies the Torch operations that should be computed in bf16. You can find more information about casting rules here.

  • hmp_fp32_ops specifies the Torch operations that should be computed in fp32. You can find more information about casting rules here.

hmp_is_verbose, hmp_bf16_ops and hmp_fp32_ops will not be used if use_habana_mixed_precision is false.

You can find examples of Gaudi configurations in the Habana model repository on the BOINC AI Hub. For instance, for BERT Large we have:

{
  "use_habana_mixed_precision": true,
  "hmp_is_verbose": false,
  "use_fused_adam": true,
  "use_fused_clip_norm": true,
  "hmp_bf16_ops": [
    "add",
    "addmm",
    "bmm",
    "div",
    "dropout",
    "gelu",
    "iadd",
    "linear",
    "layer_norm",
    "matmul",
    "mm",
    "rsub",
    "softmax",
    "truediv"
  ],
  "hmp_fp32_ops": [
    "embedding",
    "nll_loss",
    "log_softmax"
  ]
}

To instantiate a Gaudi configuration yourself in your script, you can do the following:


from optimum.habana import GaudiConfig

# gaudi_config_name can be the name of a Gaudi configuration hosted on the
# BOINC AI Hub or the path to a local directory containing a gaudi_config.json file
gaudi_config = GaudiConfig.from_pretrained(
    gaudi_config_name,
    cache_dir=model_args.cache_dir,
    revision=model_args.model_revision,
    use_auth_token=True if model_args.use_auth_token else None,
)

and pass it to the trainer with the gaudi_config argument.
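
For instance, a minimal training setup could look like the following sketch. It assumes model and train_dataset are defined elsewhere in your script (e.g. loaded with transformers and datasets), and uses the GaudiTrainer and GaudiTrainingArguments classes provided by Optimum Habana:

from optimum.habana import GaudiTrainer, GaudiTrainingArguments

# `model` and `train_dataset` are assumed to be defined elsewhere
training_args = GaudiTrainingArguments(
    output_dir="./results",
    use_habana=True,      # run training on HPUs
    use_lazy_mode=True,   # use Habana's lazy execution mode
)

trainer = GaudiTrainer(
    model=model,
    gaudi_config=gaudi_config,  # the Gaudi configuration instantiated above
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()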

GaudiConfig

class optimum.habana.GaudiConfig

( **kwargs )
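
The keyword arguments map onto the fields of the JSON file shown above, so a configuration can also be built programmatically rather than loaded from a file. Below is a minimal sketch; it assumes GaudiConfig exposes the save_pretrained method that Optimum configuration classes generally provide:

from optimum.habana import GaudiConfig

# Sketch: build a Gaudi configuration in code instead of loading a JSON file
gaudi_config = GaudiConfig(
    use_habana_mixed_precision=True,
    use_fused_adam=True,
    use_fused_clip_norm=True,
    hmp_bf16_ops=["add", "addmm", "matmul", "mm", "softmax"],
    hmp_fp32_ops=["embedding", "nll_loss", "log_softmax"],
)

# Save it to a local directory so it can be reloaded with GaudiConfig.from_pretrained
gaudi_config.save_pretrained("./my_gaudi_config")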
