Optimization


Last updated 1 year ago


Transformation

class optimum.fx.optimization.Transformation

( )

Parameters

  • preserves_computation (bool, defaults to False) — Whether the transformation preserves the graph computation or not. If True, the original and the transformed graph should produce the same outputs.

A torch.fx graph transformation.

It must implement the transform() method, and be used as a callable.

__call__

( graph_module: GraphModule, lint_and_recompile: bool = True ) → torch.fx.GraphModule

Parameters

  • graph_module (torch.fx.GraphModule) — The module to transform.

  • lint_and_recompile (bool, defaults to True) — Whether the transformed module should be linted and recompiled. This can be set to False when chaining transformations together to perform this operation only once.

Returns

torch.fx.GraphModule

The transformed module.
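To illustrate why linting and recompiling matters, the sketch below edits a traced graph by hand with plain torch.fx, without any Optimum class. The module `AddOne` and the manual node edit are assumptions made for illustration. The point is that the generated forward code stays out of date until `recompile()` is called, which is why the step can be deferred to the end of a chain of edits.

```python
import operator

import torch
from torch import fx


class AddOne(torch.nn.Module):
    """Illustrative module, not part of Optimum."""

    def forward(self, x):
        return x + 1.0


gm = fx.symbolic_trace(AddOne())

# Edit the graph by hand: turn the addition into a multiplication.
for node in gm.graph.nodes:
    if node.op == "call_function" and node.target is operator.add:
        node.target = operator.mul

x = torch.tensor([5.0])

# The graph has changed, but the generated forward code has not been rebuilt.
before = gm(x).item()

# Check the graph is well formed, then regenerate the forward code once.
gm.graph.lint()
gm.recompile()
after = gm(x).item()
```

Here `before` still reflects the original addition, while `after` reflects the edited graph.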

get_transformed_nodes

( graph_module: GraphModule ) → List[torch.fx.Node]

Parameters

  • graph_module (torch.fx.GraphModule) — The graph_module to get the nodes from.

Returns

List[torch.fx.Node]

Gives the list of nodes that were transformed by the transformation.

mark_as_transformed

( node: Node )

Parameters

  • node (torch.fx.Node) — The node to mark as transformed.

Marks a node as transformed by this transformation.

transform

( graph_module: GraphModule ) → torch.fx.GraphModule

Parameters

  • graph_module (torch.fx.GraphModule) — The module to transform.

Returns

torch.fx.GraphModule

The transformed module.

transformed

( node: Node ) → bool

Parameters

  • node (torch.fx.Node) — The node to check.

Returns

bool

Specifies whether the node was transformed by this transformation or not.
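The interface described above can be sketched with plain torch.fx. `ToySwapAddForSub` below is a hypothetical stand-in that only mirrors the documented method names. It does not subclass Optimum's Transformation, and the bookkeeping through `node.meta` is an assumption made for this example, not Optimum's implementation.

```python
import operator

import torch
from torch import fx


class AddThenScale(torch.nn.Module):
    """Illustrative module, not part of Optimum."""

    def forward(self, x):
        return (x + 1.0) * 2.0


class ToySwapAddForSub:
    """Hypothetical stand-in mirroring the documented method names."""

    preserves_computation = False  # swapping add for sub changes the outputs

    def transform(self, graph_module):
        for node in graph_module.graph.nodes:
            if node.op == "call_function" and node.target is operator.add:
                node.target = operator.sub
                self.mark_as_transformed(node)
        return graph_module

    def mark_as_transformed(self, node):
        # Assumed bookkeeping: remember which toy instances touched the node.
        node.meta.setdefault("toy_transformed_by", set()).add(id(self))

    def transformed(self, node):
        return id(self) in node.meta.get("toy_transformed_by", set())

    def get_transformed_nodes(self, graph_module):
        return [n for n in graph_module.graph.nodes if self.transformed(n)]

    def __call__(self, graph_module, lint_and_recompile=True):
        graph_module = self.transform(graph_module)
        if lint_and_recompile:
            graph_module.graph.lint()
            graph_module.recompile()
        return graph_module


toy = ToySwapAddForSub()
transformed_module = toy(fx.symbolic_trace(AddThenScale()))
result = transformed_module(torch.tensor([3.0])).item()  # (3 - 1) * 2
touched = toy.get_transformed_nodes(transformed_module)
```

Only `transform()` carries the graph logic; the callable wrapper handles linting and recompiling, which matches the split the documentation describes.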

Reversible transformation

class optimum.fx.optimization.ReversibleTransformation

( )

Parameters

  • preserves_computation (bool, defaults to False) — Whether the transformation preserves the graph computation or not. If True, the original and the transformed graph should produce the same outputs.

A torch.fx graph transformation that is reversible.

__call__

( graph_module: GraphModule, lint_and_recompile: bool = True, reverse: bool = False ) → torch.fx.GraphModule

Parameters

  • graph_module (torch.fx.GraphModule) — The module to transform.

  • lint_and_recompile (bool, defaults to True) — Whether the transformed module should be linted and recompiled. This can be set to False when chaining transformations together to perform this operation only once.

  • reverse (bool, defaults to False) — If True, the reverse transformation is performed.

Returns

torch.fx.GraphModule

The transformed module.

mark_as_restored

( node: Node )

Parameters

  • node (torch.fx.Node) — The node to mark as restored.

Marks a node as restored back to its original state.

reverse

( graph_module: GraphModule ) → torch.fx.GraphModule

Parameters

  • graph_module (torch.fx.GraphModule) — The module to transform.

Returns

torch.fx.GraphModule

The reverse transformed module.
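A reversible transformation needs enough information to undo its edit. The sketch below shows one way to do that with plain torch.fx. `ToyDivToMul` is hypothetical: it mirrors the documented `transform`, `reverse` and `__call__(..., reverse=...)` names, and it stashes the original target and arguments in `node.meta`, which is an assumption of this example rather than Optimum's mechanism.

```python
import operator

import torch
from torch import fx


class Scale(torch.nn.Module):
    """Illustrative module, not part of Optimum."""

    def forward(self, x):
        return x / 8.0


class ToyDivToMul:
    """Hypothetical reversible edit: x / c becomes x * (1 / c), and back."""

    def transform(self, graph_module):
        for node in graph_module.graph.nodes:
            if node.op == "call_function" and node.target is operator.truediv:
                numerator, denominator = node.args
                if isinstance(denominator, (int, float)):
                    # Keep what is needed to restore the node later.
                    node.meta["toy_original"] = (node.target, node.args)
                    node.target = operator.mul
                    node.args = (numerator, 1.0 / denominator)
        return graph_module

    def reverse(self, graph_module):
        for node in graph_module.graph.nodes:
            original = node.meta.pop("toy_original", None)
            if original is not None:
                node.target, node.args = original
        return graph_module

    def __call__(self, graph_module, lint_and_recompile=True, reverse=False):
        func = self.reverse if reverse else self.transform
        graph_module = func(graph_module)
        if lint_and_recompile:
            graph_module.graph.lint()
            graph_module.recompile()
        return graph_module


def call_targets(graph_module):
    return [n.target for n in graph_module.graph.nodes if n.op == "call_function"]


toy = ToyDivToMul()
gm = toy(fx.symbolic_trace(Scale()))
targets_after = call_targets(gm)

gm = toy(gm, reverse=True)
targets_restored = call_targets(gm)
output = gm(torch.tensor([16.0])).item()
```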

optimum.fx.optimization.compose

( *args: Transformation, inplace: bool = True )

Parameters

  • inplace (bool, defaults to True) — Whether the resulting transformation should be inplace, or create a new graph module.

Composes a list of transformations together.

Example:


>>> from transformers import BertModel
>>> from transformers.utils.fx import symbolic_trace
>>> from optimum.fx.optimization import ChangeTrueDivToMulByInverse, MergeLinears, compose

>>> model = BertModel.from_pretrained("bert-base-uncased")
>>> traced = symbolic_trace(
...     model,
...     input_names=["input_ids", "attention_mask", "token_type_ids"],
... )
>>> composition = compose(ChangeTrueDivToMulByInverse(), MergeLinears())
>>> transformed_model = composition(traced)
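The idea of composition can be sketched without torch at all. `toy_compose` below is a hypothetical analogue that works on plain callables and a list. It assumes the natural convention that steps run in order and that undoing a chain runs the steps in reverse order, and it treats `inplace=False` as "work on a deep copy". None of this is taken from Optimum's source.

```python
import copy


def toy_compose(*steps, inplace=True):
    """Hypothetical analogue of compose for plain callables."""

    def composed(data, reverse=False):
        if not inplace:
            data = copy.deepcopy(data)  # leave the caller's object untouched
        ordered = reversed(steps) if reverse else steps
        for step in ordered:
            data = step(data, reverse=reverse)
        return data

    return composed


def add_one(data, reverse=False):
    data[0] = data[0] - 1 if reverse else data[0] + 1
    return data


def times_two(data, reverse=False):
    data[0] = data[0] / 2 if reverse else data[0] * 2
    return data


pipeline = toy_compose(add_one, times_two, inplace=False)

original = [20]
forward = pipeline(original)                # (20 + 1) * 2
restored = pipeline(forward, reverse=True)  # 42 / 2, then minus 1
```

The two steps do not commute, so restoring the input only works because the reverse pass undoes `times_two` before `add_one`.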

Transformations

class optimum.fx.optimization.MergeLinears

( )

Parameters

  • preserves_computation (bool, defaults to False) — Whether the transformation preserves the graph computation or not. If True, the original and the transformed graph should produce the same outputs.

Transformation that merges linear layers that take the same input into one big linear layer.

Example:


>>> from transformers import BertModel
>>> from transformers.utils.fx import symbolic_trace
>>> from optimum.fx.optimization import MergeLinears

>>> model = BertModel.from_pretrained("bert-base-uncased")
>>> traced = symbolic_trace(
...     model,
...     input_names=["input_ids", "attention_mask", "token_type_ids"],
... )
>>> transformation = MergeLinears()
>>> transformed_model = transformation(traced)
>>> restored_model = transformation(transformed_model, reverse=True)
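The arithmetic behind merging can be checked on small numbers. The sketch below uses plain Python lists in place of tensors, and the names `w_q`, `w_k` and `linear` are illustrative assumptions. Stacking the weight rows and concatenating the biases of two linear layers that share an input gives one layer whose output is the concatenation of the two original outputs.

```python
def linear(weight, bias, x):
    """y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


# Two layers that read the same input (for instance two attention projections).
w_q, b_q = [[1.0, 2.0]], [0.5]
w_k, b_k = [[3.0, -1.0], [0.0, 4.0]], [0.0, -2.0]
x = [2.0, 3.0]

# Separate layers, outputs concatenated afterwards.
separate = linear(w_q, b_q, x) + linear(w_k, b_k, x)

# One merged layer: weight rows stacked, biases concatenated.
merged = linear(w_q + w_k, b_q + b_k, x)
```

A merged layer therefore needs a split of its output to recover the individual results.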

class optimum.fx.optimization.FuseBiasInLinear

( )

Parameters

  • preserves_computation (bool, defaults to False) — Whether the transformation preserves the graph computation or not. If True, the original and the transformed graph should produce the same outputs.

Transformation that fuses the bias to the weight in torch.nn.Linear.

Example:


>>> from transformers import BertModel
>>> from transformers.utils.fx import symbolic_trace
>>> from optimum.fx.optimization import FuseBiasInLinear

>>> model = BertModel.from_pretrained("bert-base-uncased")
>>> traced = symbolic_trace(
...     model,
...     input_names=["input_ids", "attention_mask", "token_type_ids"],
... )
>>> transformation = FuseBiasInLinear()
>>> transformed_model = transformation(traced)
>>> restored_model = transformation(transformed_model, reverse=True)
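One common way to fold a bias into a weight matrix is to append the bias as an extra weight column and append a constant 1 to the input. The sketch below checks that identity on plain Python lists. It is an assumption that this matches what FuseBiasInLinear does internally, and the helper `matvec` is illustrative.

```python
def matvec(weight, x):
    """Plain matrix-vector product, with the matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


weight = [[1.0, 2.0], [3.0, 4.0]]
bias = [0.5, -1.0]
x = [2.0, 1.0]

# Usual linear layer: W x + b.
with_bias = [y + b for y, b in zip(matvec(weight, x), bias)]

# Fused form: [W | b] applied to [x; 1], with no separate bias term.
fused_weight = [row + [b] for row, b in zip(weight, bias)]
fused = matvec(fused_weight, x + [1.0])
```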

class optimum.fx.optimization.ChangeTrueDivToMulByInverse

( )

Parameters

  • preserves_computation (bool, defaults to False) — Whether the transformation preserves the graph computation or not. If True, the original and the transformed graph should produce the same outputs.

Transformation that replaces truediv nodes with multiplication-by-the-inverse nodes when the denominator is static. This is sometimes the case, for example, for the scaling factor in attention layers.

Example:


>>> from transformers import BertModel
>>> from transformers.utils.fx import symbolic_trace
>>> from optimum.fx.optimization import ChangeTrueDivToMulByInverse

>>> model = BertModel.from_pretrained("bert-base-uncased")
>>> traced = symbolic_trace(
...     model,
...     input_names=["input_ids", "attention_mask", "token_type_ids"],
... )
>>> transformation = ChangeTrueDivToMulByInverse()
>>> transformed_model = transformation(traced)
>>> restored_model = transformation(transformed_model, reverse=True)
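The rewrite relies on the identity x / c = x * (1 / c), where the inverse can be computed once when c is static. The numeric sketch below uses plain floats, and `head_dim` and the sample values are illustrative assumptions. The two forms agree exactly for a power-of-two denominator and are only compared up to rounding for another denominator.

```python
import math


def divide(values, denominator):
    return [v / denominator for v in values]


def multiply_by_inverse(values, denominator):
    inverse = 1.0 / denominator  # computed once, which a static denominator allows
    return [v * inverse for v in values]


head_dim = 64
scale = math.sqrt(head_dim)  # attention-style scaling factor, here 8.0
scores = [0.1, 1.5, -7.25]

# Dividing by a power of two only shifts the exponent, so both forms match exactly.
exact_match = divide(scores, scale) == multiply_by_inverse(scores, scale)

# For other denominators, compare with a tolerance instead of exact equality.
close_match = all(
    math.isclose(a, b, rel_tol=1e-12)
    for a, b in zip(divide(scores, 3.0), multiply_by_inverse(scores, 3.0))
)
```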

class optimum.fx.optimization.FuseBatchNorm2dInConv2d

( )

Parameters

  • preserves_computation (bool, defaults to False) — Whether the transformation preserves the graph computation or not. If True, the original and the transformed graph should produce the same outputs.

Transformation that fuses nn.BatchNorm2d following nn.Conv2d into a single nn.Conv2d. The fusion will be done only if the convolution has the batch normalization as sole following node.

For example, the fusion will not be done in the following case:

     Conv2d
     /   \
    /     \
ReLU   BatchNorm2d

Example:


>>> from transformers.utils.fx import symbolic_trace
>>> from transformers import AutoModelForImageClassification

>>> from optimum.fx.optimization import FuseBatchNorm2dInConv2d

>>> model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-50")
>>> model.eval()
>>> traced_model = symbolic_trace(
...     model,
...     input_names=["pixel_values"],
...     disable_check=True
... )

>>> transformation = FuseBatchNorm2dInConv2d()
>>> transformed_model = transformation(traced_model)
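The standard algebra for folding an inference-mode batch normalization into the preceding convolution can be checked on one pixel of a 1x1 convolution, which reduces to a matrix-vector product. The sketch below uses plain Python lists. All names and values are illustrative assumptions, and it shows the textbook folding formula rather than Optimum's code.

```python
import math


def conv1x1(weight, bias, x):
    """A 1x1 convolution at a single pixel: one dot product per output channel."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def batch_norm_eval(y, mean, var, gamma, beta, eps=1e-5):
    """Inference-mode batch norm using the stored running statistics."""
    return [
        g * (v - m) / math.sqrt(s + eps) + b
        for v, m, s, g, b in zip(y, mean, var, gamma, beta)
    ]


def fold_bn_into_conv(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Scale each output channel's weights and adjust its bias."""
    scales = [g / math.sqrt(s + eps) for g, s in zip(gamma, var)]
    new_weight = [[w * k for w in row] for row, k in zip(weight, scales)]
    new_bias = [(b - m) * k + sh for b, m, k, sh in zip(bias, mean, scales, beta)]
    return new_weight, new_bias


weight, bias = [[0.5, -1.0], [2.0, 0.25]], [0.1, -0.3]
mean, var = [0.2, -0.4], [1.5, 0.7]
gamma, beta = [1.2, 0.8], [0.05, -0.1]
x = [1.0, 2.0]

reference = batch_norm_eval(conv1x1(weight, bias, x), mean, var, gamma, beta)
fused_weight, fused_bias = fold_bn_into_conv(weight, bias, mean, var, gamma, beta)
fused = conv1x1(fused_weight, fused_bias, x)
```

The identity only holds with fixed statistics, which is consistent with the example above calling `model.eval()` before tracing.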

class optimum.fx.optimization.FuseBatchNorm1dInLinear

( )

Parameters

  • preserves_computation (bool, defaults to False) — Whether the transformation preserves the graph computation or not. If True, the original and the transformed graph should produce the same outputs.

Transformation that fuses nn.BatchNorm1d following or preceding nn.Linear into a single nn.Linear. The fusion will be done only if the linear layer has the batch normalization as sole following node, or the batch normalization has the linear layer as sole following node.

For example, the fusion will not be done in the following case:

     Linear
     /   \
    /     \
ReLU   BatchNorm1d

Example:


>>> from transformers.utils.fx import symbolic_trace
>>> from transformers import AutoModel

>>> from optimum.fx.optimization import FuseBatchNorm1dInLinear

>>> model = AutoModel.from_pretrained("nvidia/groupvit-gcc-yfcc")
>>> model.eval()
>>> traced_model = symbolic_trace(
...     model,
...     input_names=["input_ids", "attention_mask", "pixel_values"],
...     disable_check=True
... )

>>> transformation = FuseBatchNorm1dInLinear()
>>> transformed_model = transformation(traced_model)
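The case where the batch normalization precedes the linear layer uses slightly different algebra, because the normalization acts on the input features rather than on the outputs. The sketch below checks that case on plain Python lists. Names and values are illustrative assumptions, and this is the textbook derivation rather than Optimum's code.

```python
import math


def linear(weight, bias, x):
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def batch_norm_eval(x, mean, var, gamma, beta, eps=1e-5):
    """Inference-mode batch norm using the stored running statistics."""
    return [
        g * (v - m) / math.sqrt(s + eps) + b
        for v, m, s, g, b in zip(x, mean, var, gamma, beta)
    ]


def fold_preceding_bn_into_linear(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Linear(BN(x)) rewritten as a single linear layer."""
    # BN(x)_i = k_i * x_i + t_i for each input feature i.
    scales = [g / math.sqrt(s + eps) for g, s in zip(gamma, var)]
    shifts = [b - m * k for b, m, k in zip(beta, mean, scales)]
    # Scale weight columns by k, and push W t into the bias.
    new_weight = [[w * k for w, k in zip(row, scales)] for row in weight]
    new_bias = [
        b + sum(w * t for w, t in zip(row, shifts)) for row, b in zip(weight, bias)
    ]
    return new_weight, new_bias


weight, bias = [[0.5, -1.0], [2.0, 0.25]], [0.1, -0.3]
mean, var = [0.2, -0.4], [1.5, 0.7]
gamma, beta = [1.2, 0.8], [0.05, -0.1]
x = [1.0, 2.0]

reference = linear(weight, bias, batch_norm_eval(x, mean, var, gamma, beta))
fused_weight, fused_bias = fold_preceding_bn_into_linear(
    weight, bias, mean, var, gamma, beta
)
fused = linear(fused_weight, fused_bias, x)
```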

A ReversibleTransformation must implement the transform() and reverse() methods, and be used as a callable.

For optimum.fx.optimization.compose: args (Transformation) — The transformations to compose together.
