Experiment trackers


Last updated 1 year ago

Experiment Tracking

The Base Tracker Class

class accelerate.tracking.GeneralTracker

( _blank = False )

A base Tracker class to be used for all logging integration implementations.

Each function should take in **kwargs that will automatically be passed in from a base dictionary provided to Accelerator.

Should implement name, requires_logging_directory, and tracker properties such that:

  • name (str): String representation of the tracker class name, such as "TensorBoard".

  • requires_logging_directory (bool): Whether the logger requires a directory to store its logs.

  • tracker (object): Should return the internal tracking mechanism used by a tracker class (such as the run for wandb).

Implementations can also include a main_process_only (bool) attribute to toggle whether relevant logging, init, and other functions should occur on the main process only or across all processes (defaults to True).

finish

( )

Should run any finalizing functions within the tracking API. If the API does not have one, just don't override this method.

log

( values: dict, step: typing.Optional[int], **kwargs )

Parameters

  • values (Dictionary str to str, float, or int): Values to be logged as key-value pairs. The values need to have type str, float, or int.

  • step (int, optional): The run step. If included, the log will be affiliated with this step.

Logs values to the current run. Base log implementations of a tracking API should go in here, along with special behavior for the step parameter.
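The type constraint on values can be checked before calling log. The following is a minimal sketch, not part of the library: split_loggable and LOGGABLE_TYPES are hypothetical names introduced here for illustration, and the treatment of bool values is an assumption noted in the comments.

```python
from typing import Any, Dict, Tuple

# Types that the `values` parameter of log is documented to accept.
LOGGABLE_TYPES = (str, float, int)


def split_loggable(values: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate entries a tracker can log from entries it cannot.

    Hypothetical helper: it only mirrors the documented type constraint.
    Note that bool is a subclass of int in Python, so booleans pass the check.
    """
    loggable: Dict[str, Any] = {}
    rejected: Dict[str, Any] = {}
    for key, value in values.items():
        target = loggable if isinstance(value, LOGGABLE_TYPES) else rejected
        target[key] = value
    return loggable, rejected


metrics = {"loss": 0.25, "epoch": 3, "phase": "train", "grads": [0.1, 0.2]}
ok, bad = split_loggable(metrics)
# `ok` holds the str, float, and int entries; `bad` holds the list under "grads".
```

Entries that end up in the rejected dictionary would need to be converted (for example, reduced to a scalar) before being logged.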

store_init_configuration

( values: dict )

Parameters

  • values (Dictionary str to bool, str, float or int): Values to be stored as initial hyperparameters as key-value pairs. The values need to have type bool, str, float, int, or None.

Logs values as hyperparameters for the run. Implementations should use the experiment configuration functionality of a tracking API.
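The pieces of GeneralTracker described above (the name, requires_logging_directory, and tracker properties plus the store_init_configuration, log, and finish methods) can be combined in a small subclass. This is a sketch under stated assumptions rather than a reference implementation: InMemoryTracker is a hypothetical class that stores everything in Python containers, and the import fallback exists only so the example also runs where accelerate is not installed.

```python
from typing import Any, Dict, List, Optional, Tuple

try:
    from accelerate.tracking import GeneralTracker
except ImportError:
    # Assumption for illustration: fall back to a plain base class so the
    # sketch still runs without accelerate installed.
    GeneralTracker = object


class InMemoryTracker(GeneralTracker):
    """Hypothetical tracker that keeps its data in memory."""

    name = "in_memory"                  # string representation of the tracker
    requires_logging_directory = False  # nothing is written to disk
    main_process_only = True            # log on the main process only

    def __init__(self, run_name: str):
        # Set internal state first so the `tracker` property is usable
        # by the time the base class initializer runs.
        self.run_name = run_name
        self._config: Dict[str, Any] = {}
        self._records: List[Tuple[Optional[int], Dict[str, Any]]] = []
        super().__init__()

    @property
    def tracker(self) -> Dict[str, Any]:
        # The "internal tracking mechanism": here, just the stored data.
        return {"config": self._config, "records": self._records}

    def store_init_configuration(self, values: dict) -> None:
        # Record hyperparameters for the run.
        self._config.update(values)

    def log(self, values: dict, step: Optional[int] = None, **kwargs) -> None:
        # Keep each set of values together with its (optional) step.
        self._records.append((step, dict(values)))

    def finish(self) -> None:
        # Nothing to finalize for an in-memory store.
        pass


demo = InMemoryTracker("demo-run")
demo.store_init_configuration({"learning_rate": 0.001, "batch_size": 32})
demo.log({"loss": 0.5}, step=1)
demo.log({"loss": 0.4})
demo.finish()
```

A real integration would replace the in-memory containers with calls into the tracking API it wraps.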

Integrated Trackers

class accelerate.tracking.TensorBoardTracker

( run_name: str, logging_dir: typing.Union[str, os.PathLike], **kwargs )

Parameters

  • run_name (str): The name of the experiment run.

  • logging_dir (str, os.PathLike): Location for TensorBoard logs to be stored.

  • kwargs: Additional keyword arguments passed along to the tensorboard.SummaryWriter.__init__ method.

A Tracker class that supports tensorboard. Should be initialized at the start of your script.

__init__

( run_name: str, logging_dir: typing.Union[str, os.PathLike], **kwargs )

class accelerate.tracking.WandBTracker

( run_name: str, **kwargs )

Parameters

  • run_name (str): The name of the experiment run.

  • kwargs: Additional keyword arguments passed along to the wandb.init method.

A Tracker class that supports wandb. Should be initialized at the start of your script.

__init__

( run_name: str, **kwargs )

class accelerate.tracking.CometMLTracker

( run_name: str, **kwargs )

Parameters

  • run_name (str): The name of the experiment run.

  • kwargs: Additional keyword arguments passed along to the Experiment.__init__ method.

A Tracker class that supports comet_ml. Should be initialized at the start of your script.

API keys must be stored in a Comet config file.

__init__

( run_name: str, **kwargs )

class accelerate.tracking.AimTracker

( run_name: str, logging_dir: typing.Union[str, os.PathLike, NoneType] = '.', **kwargs )

Parameters

  • run_name (str): The name of the experiment run.

  • kwargs: Additional keyword arguments passed along to the Run.__init__ method.

A Tracker class that supports aim. Should be initialized at the start of your script.

__init__

( run_name: str, logging_dir: typing.Union[str, os.PathLike, NoneType] = '.', **kwargs )

class accelerate.tracking.MLflowTracker

( experiment_name: str = None, logging_dir: typing.Union[str, os.PathLike, NoneType] = None, run_id: typing.Optional[str] = None, tags: typing.Union[typing.Dict[str, typing.Any], str, NoneType] = None, nested_run: typing.Optional[bool] = False, run_name: typing.Optional[str] = None, description: typing.Optional[str] = None )

Parameters

  • experiment_name (str, optional): Name of the experiment. Environment variable MLFLOW_EXPERIMENT_NAME has priority over this argument.

  • logging_dir (str or os.PathLike, defaults to "."): Location for mlflow logs to be stored.

  • run_id (str, optional): If specified, get the run with the specified UUID and log parameters and metrics under that run. The run's end time is unset and its status is set to running, but the run's other attributes (source_version, source_type, etc.) are not changed. Environment variable MLFLOW_RUN_ID has priority over this argument.

  • tags (Dict[str, str], optional): An optional dict of str keys and values, or a str dump from a dict, to set as tags on the run. If a run is being resumed, these tags are set on the resumed run. If a new run is being created, these tags are set on the new run. Environment variable MLFLOW_TAGS has priority over this argument.

  • nested_run (bool, optional, defaults to False): Controls whether run is nested in parent run. True creates a nested run. Environment variable MLFLOW_NESTED_RUN has priority over this argument.

  • run_name (str, optional): Name of new run (stored as a mlflow.runName tag). Used only when run_id is unspecified.

  • description (str, optional): An optional string that populates the description box of the run. If a run is being resumed, the description is set on the resumed run. If a new run is being created, the description is set on the new run.

A Tracker class that supports mlflow. Should be initialized at the start of your script.
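Several of the parameters above state that an environment variable has priority over the argument, and tags may be given as a str dump from a dict. The sketch below illustrates that precedence rule only; it is not the library's code. resolve_mlflow_settings and its environ parameter are hypothetical, and JSON is assumed as the format of the string dump.

```python
import json
import os
from typing import Any, Dict, Mapping, Optional, Union


def resolve_mlflow_settings(
    experiment_name: Optional[str] = None,
    tags: Optional[Union[Dict[str, Any], str]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: apply 'environment variable wins' to two settings."""
    env = os.environ if environ is None else environ

    # When the variable is set, its value replaces the argument;
    # otherwise the argument is kept unchanged.
    experiment_name = env.get("MLFLOW_EXPERIMENT_NAME", experiment_name)
    tags = env.get("MLFLOW_TAGS", tags)

    # Tags may arrive as a string dump of a dict (assumed to be JSON here).
    if isinstance(tags, str):
        tags = json.loads(tags)

    return {"experiment_name": experiment_name, "tags": tags}


# The environment variable overrides the argument.
from_env = resolve_mlflow_settings(
    experiment_name="from_argument",
    tags={"team": "nlp"},
    environ={"MLFLOW_EXPERIMENT_NAME": "from_environment"},
)

# Without the variable, the argument is used and string tags are parsed.
from_args = resolve_mlflow_settings(
    experiment_name="from_argument",
    tags='{"team": "nlp"}',
    environ={},
)
```

Passing environ explicitly keeps the example independent of the real process environment; omitting it makes the helper read os.environ.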

__init__

( experiment_name: str = None, logging_dir: typing.Union[str, os.PathLike, NoneType] = None, run_id: typing.Optional[str] = None, tags: typing.Union[typing.Dict[str, typing.Any], str, NoneType] = None, nested_run: typing.Optional[bool] = False, run_name: typing.Optional[str] = None, description: typing.Optional[str] = None )
