
Quickstart

🌍 Optimum Neuron was designed with one goal in mind: to make training and inference straightforward for any 🌍 Transformers user while leveraging the complete power of AWS Accelerators.

Training

There are two main classes one needs to know:

  • NeuronArgumentParser: inherits the original HfArgumentParser in Transformers with additional checks on the argument values to make sure that they will work well with AWS Trainium instances (a short usage sketch follows this list).

  • NeuronTrainer: the trainer class that takes care of compiling and distributing the model to run on Trainium chips, and performing training and evaluation.
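
As a quick illustration of the first class, the snippet below parses command-line flags into a TrainingArguments instance the same way the Transformers HfArgumentParser does. This is a minimal sketch; the exact import path is an assumption based on the class name above, so check it against the Optimum Neuron API reference.

from optimum.neuron import NeuronArgumentParser  # import path assumed from the class name above
from transformers import TrainingArguments

# Parse CLI flags into TrainingArguments, with the extra Trainium-specific
# validation described above applied to the parsed values.
parser = NeuronArgumentParser(TrainingArguments)
training_args, = parser.parse_args_into_dataclasses()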

The NeuronTrainer is very similar to the 🌍 Transformers Trainer, and adapting a script that uses the Trainer to make it work with Trainium mostly consists of simply swapping the Trainer class for the NeuronTrainer one. That’s how most of the example scripts were adapted from their original counterparts.

Example of the modifications:


from transformers import TrainingArguments
-from transformers import Trainer
+from optimum.neuron import NeuronTrainer as Trainer

training_args = TrainingArguments(
  # training arguments...
)

# A lot of code here

# Initialize our Trainer
trainer = Trainer(
    model=model,
    args=training_args,  # Original training arguments.
    train_dataset=train_dataset if training_args.do_train else None,
    eval_dataset=eval_dataset if training_args.do_eval else None,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
    data_collator=data_collator,
)

All Trainium instances come with at least 2 Neuron Cores. To leverage those, we need to launch the training with torchrun. Below is an example of how to launch a training script on a trn1.2xlarge instance using a bert-base-uncased model.

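The sketch below assumes a text-classification example script laid out like the Optimum Neuron examples; the script path, task, and hyperparameters are illustrative assumptions, not prescriptions:

torchrun --nproc_per_node=2 examples/text-classification/run_glue.py \
  --model_name_or_path bert-base-uncased \
  --task_name sst2 \
  --do_train \
  --do_eval \
  --per_device_train_batch_size 8 \
  --output_dir ./bert-sst2-output

Here --nproc_per_node=2 matches the two Neuron Cores of a trn1.2xlarge instance; on larger instances, raise this value accordingly.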

Inference

You can compile and export your 🌍 Transformers models to a serialized format before inference on Neuron devices:

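The sketch below shows such an export with the optimum-cli exporter; the output directory name is an arbitrary choice, and the exact flag names should be verified against the exporter guide:

optimum-cli export neuron \
  --model distilbert-base-uncased-finetuned-sst-2-english \
  --batch_size 1 \
  --sequence_length 32 \
  --auto_cast matmul \
  --auto_cast_type bf16 \
  distilbert_base_uncased_finetuned_sst2_english_neuron/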

The command above will export distilbert-base-uncased-finetuned-sst-2-english with static shapes: batch_size=1 and sequence_length=32, and cast all matmul operations from FP32 to BF16. Check out the exporter guide for more compilation options.

Then you can run the exported Neuron model on Neuron devices with the NeuronModelForXXX classes, which are similar to the AutoModelForXXX classes in 🌍 Transformers:

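A minimal sketch, assuming the export directory from the step above and a sequence-classification model; the sample sentence is purely illustrative:

from transformers import AutoTokenizer
from optimum.neuron import NeuronModelForSequenceClassification

# Load the compiled model and its tokenizer from the exported directory.
model = NeuronModelForSequenceClassification.from_pretrained(
    "distilbert_base_uncased_finetuned_sst2_english_neuron/"
)
tokenizer = AutoTokenizer.from_pretrained(
    "distilbert_base_uncased_finetuned_sst2_english_neuron/"
)

# Tokenize a sample input; the exported model runs with the static shapes
# (batch_size=1, sequence_length=32) chosen at export time.
inputs = tokenizer("Hamilton is considered to be the best musical of all time.", return_tensors="pt")
logits = model(**inputs).logits

# Map the highest-scoring logit back to its label, e.g. POSITIVE or NEGATIVE.
print(model.config.id2label[logits.argmax().item()])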
