Callbacks

Callbacks are objects that can customize the behavior of the training loop in the PyTorch Trainer (this feature is not yet implemented in TensorFlow). They can inspect the training loop state (for progress reporting, logging to TensorBoard or other ML platforms, and so on) and make decisions (like early stopping).

Callbacks are “read only” pieces of code: apart from the TrainerControl object they return, they cannot change anything in the training loop. For customizations that require changes in the training loop, you should subclass Trainer and override the methods you need (see the Trainer documentation for examples).

By default a Trainer will use the following callbacks:

  • DefaultFlowCallback, which handles the default behavior for logging, saving and evaluation.

  • PrinterCallback or ProgressCallback to display progress and print the logs (the first one is used if you deactivate tqdm through TrainingArguments, otherwise it’s the second one).

  • TensorBoardCallback if tensorboard is accessible (either through PyTorch >= 1.4 or tensorboardX).

  • WandbCallback if wandb is installed.

  • CometCallback if comet_ml is installed.

  • MLflowCallback if mlflow is installed.

  • NeptuneCallback if neptune is installed.

  • AzureMLCallback if azureml-sdk is installed.

  • CodeCarbonCallback if codecarbon is installed.

  • ClearMLCallback if clearml is installed.

  • DagsHubCallback if dagshub is installed.

  • FlyteCallback if flyte is installed.
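If one of these integration packages is installed but you don’t want the corresponding callback for a particular run, you can detach it after building the Trainer. A minimal sketch using Trainer.remove_callback() (model and training_args are placeholders defined elsewhere):

from transformers import Trainer
from transformers.integrations import WandbCallback

trainer = Trainer(model=model, args=training_args)

# Detach the auto-registered W&B callback for this run only.
# trainer.pop_callback(WandbCallback) would remove and return it instead.
trainer.remove_callback(WandbCallback)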

The main class that implements callbacks is TrainerCallback. It gets the TrainingArguments used to instantiate the Trainer, can access that Trainer’s internal state via TrainerState, and can take some actions on the training loop via TrainerControl.

Available Callbacks

Here is the list of the available TrainerCallbacks in the library:

class transformers.integrations.CometCallback

( )

A TrainerCallback that sends the logs to Comet ML.

setup

( args, state, model )

Set up the optional Comet.ml integration. For a number of configurable items in the environment, see the Comet documentation.

Environment:

  • COMET_MODE (str, optional, defaults to ONLINE): Whether to create an online or offline experiment, or disable Comet logging. Can be OFFLINE, ONLINE, or DISABLED.

  • COMET_PROJECT_NAME (str, optional): Comet project name for experiments.

  • COMET_OFFLINE_DIRECTORY (str, optional): Folder to use for saving offline experiments when COMET_MODE is OFFLINE.

  • COMET_LOG_ASSETS (str, optional, defaults to TRUE): Whether or not to log training assets (tf event logs, checkpoints, etc.) to Comet. Can be TRUE or FALSE.
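These are plain environment variables, so they can be exported in the shell or set in Python before the Trainer is created. A minimal sketch with illustrative values:

import os

# Run Comet in offline mode and store experiment files locally
# (the directory name is illustrative).
os.environ["COMET_MODE"] = "OFFLINE"
os.environ["COMET_OFFLINE_DIRECTORY"] = "./comet_offline"
os.environ["COMET_LOG_ASSETS"] = "TRUE"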

class transformers.DefaultFlowCallback

( )

A TrainerCallback that handles the default flow of the training loop for logs, evaluation and checkpoints.

class transformers.PrinterCallback

( )

A bare TrainerCallback that just prints the logs.

class transformers.ProgressCallback

( )

A TrainerCallback that displays the progress of training or evaluation.

class transformers.EarlyStoppingCallback

( early_stopping_patience: int = 1, early_stopping_threshold: typing.Optional[float] = 0.0 )

Parameters

  • early_stopping_patience (int) — Use with metric_for_best_model to stop training when the specified metric worsens for early_stopping_patience evaluation calls.

  • early_stopping_threshold (float, optional) — Use with TrainingArguments metric_for_best_model and early_stopping_patience to denote how much the specified metric must improve to satisfy early stopping conditions.

A TrainerCallback that handles early stopping.

This callback depends on the TrainingArguments argument load_best_model_at_end functionality to set best_metric in TrainerState. Note that if the TrainingArguments argument save_steps differs from eval_steps, the early stopping will not occur until the next save step.
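For example, a minimal training setup with early stopping might look like the following (the model and dataset variables are placeholders, and the metric and step values are illustrative):

from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

# Early stopping needs periodic evaluation plus best-model tracking.
args = TrainingArguments(
    output_dir="out",
    evaluation_strategy="steps",
    eval_steps=500,
    save_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    # Stop if eval_loss fails to improve for 3 consecutive evaluations.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)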

class transformers.integrations.TensorBoardCallback

( tb_writer = None )

Parameters

  • tb_writer (SummaryWriter, optional) — The writer to use. Will instantiate one if not set.

A TrainerCallback that sends the logs to TensorBoard.

class transformers.integrations.WandbCallback

( )

A TrainerCallback that logs metrics, media and model checkpoints to Weights & Biases.

setup

( args, state, model, **kwargs )

Set up the optional Weights & Biases (wandb) integration. One can subclass and override this method to customize the setup if needed. Find more information in the Weights & Biases documentation. You can also override the following environment variables:

Environment:

  • WANDB_LOG_MODEL (str, optional, defaults to "false"): Whether to log model and checkpoints during training. Can be "end", "checkpoint" or "false". If set to "end", the model will be uploaded at the end of training. If set to "checkpoint", the checkpoint will be uploaded every args.save_steps. If set to "false", the model will not be uploaded. Use along with load_best_model_at_end in TrainingArguments to upload the best model.

    Deprecated in 5.0

    Setting WANDB_LOG_MODEL as bool will be deprecated in version 5 of 🌍Transformers.

  • WANDB_WATCH (str, optional, defaults to "false"): Can be "gradients", "all", "parameters", or "false". Set to "all" to log gradients and parameters.

  • WANDB_PROJECT (str, optional, defaults to "boincai"): Set this to a custom string to store results in a different project.

  • WANDB_DISABLED (bool, optional, defaults to False): Whether to disable wandb entirely. Set WANDB_DISABLED=true to disable.
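As with the other integrations, these can be set before training starts. A short sketch with illustrative values:

import os

# Group runs under a custom W&B project and upload the final model.
os.environ["WANDB_PROJECT"] = "my-finetuning-project"
os.environ["WANDB_LOG_MODEL"] = "end"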

class transformers.integrations.MLflowCallback

( )

A TrainerCallback that sends the logs to MLflow. Can be disabled by setting the environment variable DISABLE_MLFLOW_INTEGRATION = TRUE.

setup

( args, state, model )

Set up the optional MLflow integration.

Environment:

  • HF_MLFLOW_LOG_ARTIFACTS (str, optional): Whether to use the MLflow .log_artifact() facility to log artifacts. This only makes sense if logging to a remote server, e.g. s3 or GCS. If set to True or 1, will copy each saved checkpoint in TrainingArguments’s output_dir to the local or remote artifact storage. Using it without a remote storage will just copy the files to your artifact location.

  • MLFLOW_EXPERIMENT_NAME (str, optional, defaults to None): The MLflow experiment_name under which to launch the run. Defaults to None, which points to the Default experiment in MLflow. Otherwise, it is a case-sensitive name of the experiment to be activated. If an experiment with this name does not exist, a new experiment with this name is created.

  • MLFLOW_TAGS (str, optional): A string dump of a dictionary of key/value pair to be added to the MLflow run as tags. Example: os.environ['MLFLOW_TAGS']='{"release.candidate": "RC1", "release.version": "2.2.0"}'.

  • MLFLOW_NESTED_RUN (str, optional): Whether to use MLflow nested runs. If set to True or 1, will create a nested run inside the current run.

  • MLFLOW_RUN_ID (str, optional): Allows reattaching to an existing run, which can be useful when resuming training from a checkpoint. When the MLFLOW_RUN_ID environment variable is set, start_run attempts to resume a run with the specified run ID, and other parameters are ignored.

  • MLFLOW_FLATTEN_PARAMS (str, optional, defaults to False): Whether to flatten the parameters dictionary before logging.
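A short sketch of configuring the integration through these variables (values are illustrative):

import os

# Name the MLflow experiment and flatten nested parameter dictionaries.
os.environ["MLFLOW_EXPERIMENT_NAME"] = "transformers-finetuning"
os.environ["MLFLOW_FLATTEN_PARAMS"] = "1"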

class transformers.integrations.AzureMLCallback

( azureml_run = None )

A TrainerCallback that sends the logs to AzureML.

class transformers.integrations.CodeCarbonCallback

( )

A TrainerCallback that tracks the CO2 emission of training.

class transformers.integrations.NeptuneCallback

( api_token: typing.Optional[str] = None, project: typing.Optional[str] = None, name: typing.Optional[str] = None, base_namespace: str = 'finetuning', run = None, log_parameters: bool = True, log_checkpoints: typing.Optional[str] = None, **neptune_run_kwargs )

Parameters

  • api_token (str, optional) — Neptune API token obtained upon registration. You can leave this argument out if you have saved your token to the NEPTUNE_API_TOKEN environment variable (strongly recommended). See full setup instructions in the Neptune docs.

  • project (str, optional) — Name of an existing Neptune project, in the form “workspace-name/project-name”. You can find and copy the name in Neptune from the project settings -> Properties. If None (default), the value of the NEPTUNE_PROJECT environment variable is used.

  • name (str, optional) — Custom name for the run.

  • base_namespace (str, optional, defaults to “finetuning”) — In the Neptune run, the root namespace that will contain all of the metadata logged by the callback.

  • run (Run, optional) — Pass a Neptune run object if you want to continue logging to an existing run. Read more about resuming runs in the Neptune docs.

  • log_parameters (bool, optional, defaults to True) — If True, logs all Trainer arguments and model parameters provided by the Trainer.

  • log_checkpoints (str, optional) — If “same”, uploads checkpoints whenever they are saved by the Trainer. If “last”, uploads only the most recently saved checkpoint. If “best”, uploads the best checkpoint (among the ones saved by the Trainer). If None, does not upload checkpoints.

  • **neptune_run_kwargs (optional) — Additional keyword arguments to be passed directly to the neptune.init_run() function when a new run is created.

A TrainerCallback that sends the logs to Neptune.

For instructions and examples, see the Transformers integration guide in the Neptune documentation.
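A minimal usage sketch, constructing the callback explicitly rather than relying on auto-detection (the project and run names are placeholders, and model and training_args are defined elsewhere):

from transformers import Trainer
from transformers.integrations import NeptuneCallback

neptune_callback = NeptuneCallback(
    project="my-workspace/my-project",  # placeholder "workspace-name/project-name"
    name="bert-finetuning",
)

trainer = Trainer(model=model, args=training_args, callbacks=[neptune_callback])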

class transformers.integrations.ClearMLCallback

( )

A TrainerCallback that sends the logs to ClearML.

Environment:

  • CLEARML_PROJECT (str, optional, defaults to BOINCAI Transformers): ClearML project name.

  • CLEARML_TASK (str, optional, defaults to Trainer): ClearML task name.

  • CLEARML_LOG_MODEL (bool, optional, defaults to False): Whether to log models as artifacts during training.

class transformers.integrations.DagsHubCallback

( )

A TrainerCallback that logs to DagsHub. Extends MLflowCallback.

setup

( *args, **kwargs )

Set up the DagsHub logging integration.

Environment:

  • HF_DAGSHUB_LOG_ARTIFACTS (str, optional): Whether to save the data and model artifacts for the experiment. Defaults to False.

class transformers.integrations.FlyteCallback

( save_log_history: bool = True, sync_checkpoints: bool = True )

A TrainerCallback that sends the logs to Flyte. NOTE: This callback only works within a Flyte task.

Parameters

  • save_log_history (bool, optional, defaults to True) — When set to True, the training logs are saved as a Flyte Deck.

  • sync_checkpoints (bool, optional, defaults to True) — When set to True, checkpoints are synced with Flyte and can be used to resume training in the case of an interruption.

Example:


# Note: This example skips over some setup steps for brevity.
from flytekit import current_context, task


@task
def train_hf_transformer():
    cp = current_context().checkpoint
    trainer = Trainer(..., callbacks=[FlyteCallback()])
    output = trainer.train(resume_from_checkpoint=cp.restore())

TrainerCallback

class transformers.TrainerCallback

( )

A class for objects that will inspect the state of the training loop at some events and take some decisions. At each of those events the following arguments are available:

Parameters

  • args (TrainingArguments) — The training arguments used to instantiate the Trainer.

  • state (TrainerState) — The current state of the Trainer.

  • control (TrainerControl) — The object that is returned to the Trainer and can be used to make some decisions.

  • model (PreTrainedModel or torch.nn.Module) — The model being trained.

  • tokenizer (PreTrainedTokenizer) — The tokenizer used for encoding the data.

  • optimizer (torch.optim.Optimizer) — The optimizer used for the training steps.

  • lr_scheduler (torch.optim.lr_scheduler.LambdaLR) — The scheduler used for setting the learning rate.

  • train_dataloader (torch.utils.data.DataLoader, optional) — The current dataloader used for training.

  • eval_dataloader (torch.utils.data.DataLoader, optional) — The current dataloader used for evaluation.

  • metrics (Dict[str, float]) — The metrics computed by the last evaluation phase.

    Those are only accessible in the event on_evaluate.

  • logs (Dict[str, float]) — The values to log.

    Those are only accessible in the event on_log.

The control object is the only one that can be changed by the callback, in which case the event that changes it should return the modified version.

The arguments args, state and control are positional for all events; all the others are grouped in kwargs. You can unpack the ones you need in the signature of the event. As an example, see the code of the simple PrinterCallback.

Example:


class PrinterCallback(TrainerCallback):
    def on_log(self, args, state, control, logs=None, **kwargs):
        _ = logs.pop("total_flos", None)
        if state.is_local_process_zero:
            print(logs)
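Because control is the only object a callback is allowed to modify, an event that changes it should return it. A sketch of a hypothetical callback that stops training after a fixed step budget:

from transformers import TrainerCallback


class StopAfterNStepsCallback(TrainerCallback):
    """Illustrative callback: stop training once a step budget is spent."""

    def __init__(self, step_budget=1000):
        self.step_budget = step_budget

    def on_step_end(self, args, state, control, **kwargs):
        if state.global_step >= self.step_budget:
            # Flip the switch on the control object and return it.
            control.should_training_stop = True
        return control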

on_epoch_begin

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called at the beginning of an epoch.

on_epoch_end

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called at the end of an epoch.

on_evaluate

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called after an evaluation phase.

on_init_end

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called at the end of the initialization of the Trainer.

on_log

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called after logging the last logs.

on_predict

( args: TrainingArguments, state: TrainerState, control: TrainerControl, metrics, **kwargs )

Event called after a successful prediction.

on_prediction_step

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called after a prediction step.

on_save

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called after a checkpoint save.

on_step_begin

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called at the beginning of a training step. If using gradient accumulation, one training step might take several inputs.

on_step_end

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called at the end of a training step. If using gradient accumulation, one training step might take several inputs.

on_substep_end

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called at the end of a substep during gradient accumulation.

on_train_begin

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called at the beginning of training.

on_train_end

( args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs )

Event called at the end of training.

Here is an example of how to register a custom callback with the PyTorch Trainer:

class MyCallback(TrainerCallback):
    "A callback that prints a message at the beginning of training"

    def on_train_begin(self, args, state, control, **kwargs):
        print("Starting training")


trainer = Trainer(
    model,
    args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    callbacks=[MyCallback],  # We can either pass the callback class this way or an instance of it (MyCallback())
)

Another way to register a callback is to call trainer.add_callback() as follows:


trainer = Trainer(...)
trainer.add_callback(MyCallback)
# Alternatively, we can pass an instance of the callback class
trainer.add_callback(MyCallback())

TrainerState

class transformers.TrainerState

( epoch: typing.Optional[float] = None, global_step: int = 0, max_steps: int = 0, logging_steps: int = 500, eval_steps: int = 500, save_steps: int = 500, num_train_epochs: int = 0, total_flos: float = 0, log_history: typing.List[typing.Dict[str, float]] = None, best_metric: typing.Optional[float] = None, best_model_checkpoint: typing.Optional[str] = None, is_local_process_zero: bool = True, is_world_process_zero: bool = True, is_hyper_param_search: bool = False, trial_name: str = None, trial_params: typing.Dict[str, typing.Union[str, float, int, bool]] = None )

Parameters

  • epoch (float, optional) — Only set during training, will represent the epoch the training is at (the decimal part being the percentage of the current epoch completed).

  • global_step (int, optional, defaults to 0) — During training, represents the number of update steps completed.

  • max_steps (int, optional, defaults to 0) — The number of update steps to do during the current training.

  • logging_steps (int, optional, defaults to 500) — Log every X update steps.

  • eval_steps (int, optional) — Run an evaluation every X steps.

  • save_steps (int, optional, defaults to 500) — Save a checkpoint every X update steps.

  • total_flos (float, optional, defaults to 0) — The total number of floating-point operations done by the model since the beginning of training (stored as a float to avoid overflow).

  • log_history (List[Dict[str, float]], optional) — The list of logs done since the beginning of training.

  • best_metric (float, optional) — When tracking the best model, the value of the best metric encountered so far.

  • best_model_checkpoint (str, optional) — When tracking the best model, the value of the name of the checkpoint for the best model encountered so far.

  • is_local_process_zero (bool, optional, defaults to True) — Whether or not this process is the local (e.g., on one machine if training in a distributed fashion on several machines) main process.

  • is_world_process_zero (bool, optional, defaults to True) — Whether or not this process is the global main process (when training in a distributed fashion on several machines, this is only going to be True for one process).

  • is_hyper_param_search (bool, optional, defaults to False) — Whether we are in the process of a hyperparameter search using Trainer.hyperparameter_search. This will impact the way data is logged in TensorBoard.

A class containing the Trainer inner state that will be saved along the model and optimizer when checkpointing and passed to the TrainerCallback.

In this class, one step is to be understood as one update step. When using gradient accumulation, one update step may require several forward and backward passes: if you use gradient_accumulation_steps=n, then one update step requires going through n batches.

load_from_json

( json_path: str )

Create an instance from the content of json_path.

save_to_json

( json_path: str )

Save the content of this instance in JSON format inside json_path.
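The Trainer itself writes this state to trainer_state.json inside each checkpoint folder; the two methods above give you the same round trip directly. A minimal sketch (the file path is illustrative):

from transformers import TrainerState

state = TrainerState(global_step=100, max_steps=1000)
state.save_to_json("trainer_state.json")   # serialize to disk

restored = TrainerState.load_from_json("trainer_state.json")
assert restored.global_step == 100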

TrainerControl

class transformers.TrainerControl

( should_training_stop: bool = False, should_epoch_stop: bool = False, should_save: bool = False, should_evaluate: bool = False, should_log: bool = False )

Parameters

  • should_training_stop (bool, optional, defaults to False) — Whether or not the training should be interrupted.

    If True, this variable will not be set back to False. The training will just stop.

  • should_epoch_stop (bool, optional, defaults to False) — Whether or not the current epoch should be interrupted.

    If True, this variable will be set back to False at the beginning of the next epoch.

  • should_save (bool, optional, defaults to False) — Whether or not the model should be saved at this step.

    If True, this variable will be set back to False at the beginning of the next step.

  • should_evaluate (bool, optional, defaults to False) — Whether or not the model should be evaluated at this step.

    If True, this variable will be set back to False at the beginning of the next step.

  • should_log (bool, optional, defaults to False) — Whether or not the logs should be reported at this step.

    If True, this variable will be set back to False at the beginning of the next step.

A class that handles the control flow. This class is used by the TrainerCallback to activate some switches in the training loop.
