DeepSpeed utilities
class accelerate.DeepSpeedPlugin
( hf_ds_config: typing.Any = None, gradient_accumulation_steps: int = None, gradient_clipping: float = None, zero_stage: int = None, is_train_batch_min: str = True, offload_optimizer_device: bool = None, offload_param_device: bool = None, offload_optimizer_nvme_path: str = None, offload_param_nvme_path: str = None, zero3_init_flag: bool = None, zero3_save_16bit_model: bool = None )
This plugin is used to integrate DeepSpeed.
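A minimal sketch of constructing the plugin and handing it to Accelerator; the field values below are illustrative, and anything left unset falls back to the DeepSpeed config file or the launch configuration:

    from accelerate import Accelerator, DeepSpeedPlugin

    # Illustrative values; fields left as None are taken from the DeepSpeed
    # config file or the `accelerate launch` configuration.
    deepspeed_plugin = DeepSpeedPlugin(zero_stage=2, gradient_accumulation_steps=2)
    accelerator = Accelerator(deepspeed_plugin=deepspeed_plugin)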
deepspeed_config_process
( prefix = '', mismatches = None, config = None, must_match = True, **kwargs )
Process the DeepSpeed config with the values from the kwargs.
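A hedged sketch of calling it, assuming a config file at the placeholder path ds_config.json; the kwargs carry the values used to fill "auto" entries, nested config fields are addressed with dotted key names, and must_match=False suppresses mismatch errors:

    from accelerate import DeepSpeedPlugin

    plugin = DeepSpeedPlugin(hf_ds_config="ds_config.json")  # placeholder path
    # Fill "auto" entries and check existing ones against the passed values;
    # nested config fields are addressed with dotted key names.
    plugin.deepspeed_config_process(
        must_match=False,
        **{
            "train_micro_batch_size_per_gpu": 16,
            "gradient_accumulation_steps": 1,
            "zero_optimization.stage": 2,
        },
    )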
class accelerate.utils.DummyOptim

( params, lr = 0.001, weight_decay = 0, **kwargs )
Parameters
lr (float) — Learning rate.
params (iterable) — Iterable of parameters to optimize or dicts defining parameter groups.
weight_decay (float) — Weight decay.
**kwargs — Other arguments.
Dummy optimizer that presents model parameters or param groups; it is primarily used to follow the conventional training loop when the optimizer config is specified in the DeepSpeed config file.
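A minimal sketch, assuming the optimizer block lives in the DeepSpeed config file (the toy model is a placeholder):

    import torch
    from accelerate.utils import DummyOptim

    model = torch.nn.Linear(8, 2)  # placeholder model
    # Stands in for a real optimizer so the training script keeps its usual
    # shape; the actual optimizer comes from the DeepSpeed config file.
    optimizer = DummyOptim(params=model.parameters(), lr=1e-3)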
class accelerate.utils.DummyScheduler

( optimizer, total_num_steps = None, warmup_num_steps = 0, lr_scheduler_callable = None, **kwargs )
Parameters
optimizer (torch.optim.optimizer.Optimizer) — The optimizer to wrap.
total_num_steps (int, optional) — Total number of steps.
warmup_num_steps (int, optional) — Number of steps for warmup.
lr_scheduler_callable (callable, optional) — A callable function that creates an LR Scheduler. It accepts only one argument, optimizer.
**kwargs — Other arguments.
Dummy scheduler that presents model parameters or param groups; it is primarily used to follow the conventional training loop when the scheduler config is specified in the DeepSpeed config file.
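A minimal sketch along the same lines, assuming the scheduler block lives in the DeepSpeed config file and that optimizer is the DummyOptim from the previous example; the step counts are illustrative:

    from accelerate.utils import DummyScheduler

    # Stands in for a real scheduler; the actual scheduler comes from the
    # DeepSpeed config file.
    scheduler = DummyScheduler(optimizer, total_num_steps=1000, warmup_num_steps=100)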
class accelerate.utils.DeepSpeedEngineWrapper

( engine )
Parameters
engine (deepspeed.runtime.engine.DeepSpeedEngine) — The DeepSpeed engine to wrap.
Internal wrapper for deepspeed.runtime.engine.DeepSpeedEngine. This is used to follow a conventional training loop.
class accelerate.utils.DeepSpeedOptimizerWrapper

( optimizer )
Parameters
optimizer (torch.optim.optimizer.Optimizer) — The optimizer to wrap.
Internal wrapper around a deepspeed optimizer.
class accelerate.utils.DeepSpeedSchedulerWrapper

( scheduler, optimizers )
Parameters
scheduler (torch.optim.lr_scheduler.LambdaLR) — The scheduler to wrap.
optimizers (one or a list of torch.optim.Optimizer) — The optimizer(s) associated with the scheduler.
Internal wrapper around a deepspeed scheduler.
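These wrappers are not constructed by hand; with a DeepSpeed plugin active, Accelerator.prepare and accelerator.backward route through them. A hedged sketch of the resulting loop, assuming model, optimizer, scheduler, and dataloader are defined elsewhere and the model returns a transformers-style output with a .loss attribute:

    model, optimizer, scheduler, dataloader = accelerator.prepare(
        model, optimizer, scheduler, dataloader
    )
    for batch in dataloader:
        loss = model(**batch).loss     # assumes a transformers-style model output
        accelerator.backward(loss)     # engine.backward() + engine.step() via the engine wrapper
        optimizer.step()               # no-op in the optimizer wrapper; the engine already stepped
        optimizer.zero_grad()          # likewise handled by the engine
        scheduler.step()               # likewise driven by the engine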