Stateful Classes
Below are variations of a singleton class in the sense that all instances share the same state, which is initialized on the first instantiation.
These classes are immutable and store information about certain configurations or states.
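One common way to obtain this shared-state behaviour in Python is to have every instance alias a single class-level dictionary. The sketch below illustrates the idea only: `SharedState` and its `device` argument are hypothetical names, and it is not presented as the library's actual implementation.

```python
class SharedState:
    """Every instance aliases one class-level dict, so all instances share state."""

    _shared_state = {}

    def __init__(self, device="cpu"):
        # Point this instance's attribute dict at the shared dict.
        self.__dict__ = self._shared_state
        if not self.initialized:
            # Only the first instantiation populates the shared state.
            self.device = device

    @property
    def initialized(self):
        # The state counts as initialized once anything has been stored in it.
        return self._shared_state != {}


first = SharedState(device="cuda")
second = SharedState(device="cpu")  # argument ignored: the state already exists
```

Here `first` and `second` are distinct objects, but `second.device` reports the value set by the first instantiation.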
class accelerate.PartialState
( cpu: bool = False, **kwargs )
Singleton class that has information about the current training environment and functions to help with process control. Designed to be used when only process control and device execution states are needed. Does not need to be initialized from Accelerator.
Available attributes:
device (torch.device): The device to use.
distributed_type (DistributedType): The type of distributed environment currently in use.
local_process_index (int): The index of the current process on the current server.
mixed_precision (str): Whether or not the current script will use mixed precision, and if so, the type of mixed precision being performed. (Choose from 'no', 'fp16', 'bf16', or 'fp8'.)
num_processes (int): The number of processes currently launched in parallel.
process_index (int): The index of the current process.
is_last_process (bool): Whether or not the current process is the last one.
is_main_process (bool): Whether or not the current process is the main one.
is_local_main_process (bool): Whether or not the current process is the main one on the local node.
debug (bool): Whether or not the current script is being run in debug mode.
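To make the relationship between the index attributes and the boolean flags concrete, here is a minimal sketch. It assumes the common convention that the main process is the one with index 0 and the last process is the one with index `num_processes - 1`; some backends may differ. `ProcessInfo` is a hypothetical stand-in, not a class from the library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessInfo:
    process_index: int        # rank across all nodes
    local_process_index: int  # rank within the current node
    num_processes: int        # total number of processes

    @property
    def is_main_process(self):
        return self.process_index == 0  # assumed convention

    @property
    def is_local_main_process(self):
        return self.local_process_index == 0  # assumed convention

    @property
    def is_last_process(self):
        return self.process_index == self.num_processes - 1


# Two nodes with two processes each: global ranks 0..3, local ranks 0..1.
ranks = [ProcessInfo(g, g % 2, 4) for g in range(4)]
main_flags = [r.is_main_process for r in ranks]
local_main_flags = [r.is_local_main_process for r in ranks]
last_flags = [r.is_last_process for r in ranks]
```

In this layout exactly one process is the main process, while each node has its own local main process.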
local_main_process_first
( )
Lets the local main process go first inside a with block.
The other processes on the node will enter the with block after the local main process exits.
main_process_first
( )
Lets the main process go first inside a with block.
The other processes will enter the with block after the main process exits.
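The ordering described above can be modelled without any distributed setup. The following is a minimal sketch, not the library's implementation: threads stand in for processes, the main process is assumed to be rank 0, and `main_first`, `worker`, and `NUM_WORKERS` are hypothetical names introduced for illustration.

```python
import threading
from contextlib import contextmanager

NUM_WORKERS = 3
barrier = threading.Barrier(NUM_WORKERS)
order = []  # records the order in which workers ran the guarded block
lock = threading.Lock()


@contextmanager
def main_first(rank):
    if rank != 0:
        # Non-main workers wait here until the main worker reaches the barrier.
        barrier.wait()
    yield
    if rank == 0:
        # The main worker reaches the barrier only after finishing its block.
        barrier.wait()


def worker(rank):
    with main_first(rank):
        with lock:
            order.append(rank)


threads = [threading.Thread(target=worker, args=(r,)) for r in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Rank 0 always appears first in `order`; the remaining ranks follow in no particular order. With the real class the call would be `with PartialState().main_process_first():`, which is typically useful when one process should prepare or cache data before the others read it. The local variant applies the same idea per node.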
on_last_process
( function: Callable[..., Any] )
Parameters
function (Callable): The function to decorate.
Decorator that only runs the decorated function on the last process.
on_local_main_process
( function: Callable[..., Any] = None )
Parameters
function (Callable): The function to decorate.
Decorator that only runs the decorated function on the local main process.
on_local_process
( function: Callable[..., Any] = None, local_process_index: int = None )
Parameters
function (Callable, optional): The function to decorate.
local_process_index (int, optional): The index of the local process on which to run the function.
Decorator that only runs the decorated function on the process with the given index on the current node.
on_main_process
( function: Callable[..., Any] = None )
Parameters
function (Callable): The function to decorate.
Decorator that only runs the decorated function on the main process.
on_process
( function: Callable[..., Any] = None, process_index: int = None )
Parameters
function (Callable, optional): The function to decorate.
process_index (int, optional): The index of the process on which to run the function.
Decorator that only runs the decorated function on the process with the given index.
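The decorator behaviour can be illustrated with a small stand-alone model. This is a sketch under stated assumptions rather than the library's code: `CURRENT_PROCESS_INDEX` is a hypothetical constant that pretends the script is running as rank 1, and the choice to return `None` on non-matching ranks is an assumption of this model.

```python
from functools import partial

CURRENT_PROCESS_INDEX = 1  # hypothetical: pretend this script is rank 1


def _do_nothing(*args, **kwargs):
    # Stand-in used on every rank that should skip the function.
    return None


def on_process(function=None, process_index=None):
    # Used as @on_process(process_index=N): return a decorator awaiting the function.
    if function is None:
        return partial(on_process, process_index=process_index)
    # Keep the function on the matching rank, swap in a no-op everywhere else.
    if CURRENT_PROCESS_INDEX == process_index:
        return function
    return _do_nothing


@on_process(process_index=1)
def report(message):
    return f"rank 1 says: {message}"


@on_process(process_index=0)
def report_from_zero(message):
    return f"rank 0 says: {message}"


on_matching_rank = report("hello")
on_other_rank = report_from_zero("hello")
```

The other decorators in this class can be read as the same mechanism with the target fixed in advance: the main process, the local main process, the last process, or a given index on the current node.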
split_between_processes
( inputs: list | tuple | dict | torch.Tensor, apply_padding: bool = False )
Parameters
inputs (list, tuple, torch.Tensor, or dict of list/tuple/torch.Tensor): The input to split between processes.
apply_padding (bool, optional, defaults to False): Whether to apply padding by repeating the last element of the input so that all processes have the same number of elements. Useful when performing actions such as gather() on the outputs, or when passing in fewer inputs than there are processes. If padding is applied, remember to drop the padded elements afterwards.
Quickly splits the input between self.num_processes processes so that each process can then work on its own portion. Useful when doing distributed inference, such as with different prompts.
Note that when using a dict, every key must map to the same number of elements.
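The splitting and padding rules can be sketched in plain Python. This is a simplified model for lists only, not the library's implementation: `split_for_process` is a hypothetical helper, and the assumption that earlier ranks receive any leftover elements is made for illustration.

```python
def split_for_process(inputs, num_processes, process_index, apply_padding=False):
    """Return the portion of `inputs` assigned to `process_index` (lists only)."""
    per_process, extras = divmod(len(inputs), num_processes)
    # Assumed rule: the first `extras` ranks each take one additional element.
    start = process_index * per_process + min(process_index, extras)
    end = start + per_process + (1 if process_index < extras else 0)
    chunk = list(inputs[start:end])
    if apply_padding and inputs:
        # Pad by repeating the last element until every rank has the same count.
        target = per_process + (1 if extras > 0 else 0)
        last = chunk[-1] if chunk else inputs[-1]
        chunk += [last] * (target - len(chunk))
    return chunk


prompts = ["A", "B", "C"]
plain = [split_for_process(prompts, 2, rank) for rank in range(2)]
padded = [split_for_process(prompts, 2, rank, apply_padding=True) for rank in range(2)]
fewer_inputs = split_for_process(["A"], 2, 1, apply_padding=True)
```

With two processes, `plain` is `[["A", "B"], ["C"]]` and `padded` is `[["A", "B"], ["C", "C"]]`; the repeated `"C"` is the element to drop after gathering. With the real class the method is used as a context manager, for example `with state.split_between_processes(prompts) as subset:`.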
wait_for_everyone
( )
Will stop the execution of the current process until every other process has reached that point (so this does nothing when the script is only run in one process). Useful to do before saving a model.
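The synchronization point described above behaves like a barrier. The sketch below models it with threads standing in for processes; `worker`, `results`, and `snapshot` are hypothetical names, and nothing here calls the library itself.

```python
import threading
import time

NUM_WORKERS = 4
barrier = threading.Barrier(NUM_WORKERS)
results = [None] * NUM_WORKERS
snapshot = []  # what rank 0 sees after the barrier


def worker(rank):
    time.sleep(0.01 * rank)  # workers finish their work at different times
    results[rank] = rank * rank
    barrier.wait()  # nobody continues until every worker has arrived
    if rank == 0:
        # Safe to read: every slot was filled before its owner hit the barrier.
        snapshot.extend(results)


threads = [threading.Thread(target=worker, args=(r,)) for r in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This is why waiting before saving a model is useful: the process that saves can rely on all other processes having finished their part.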
class accelerate.state.AcceleratorState
( mixed_precision: str = None, cpu: bool = False, dynamo_plugin = None, deepspeed_plugin = None, fsdp_plugin = None, megatron_lm_plugin = None, _from_accelerator: bool = False, **kwargs )
Singleton class that has information about the current training environment.
Available attributes:
device (torch.device): The device to use.
distributed_type (DistributedType): The type of distributed environment currently in use.
initialized (bool): Whether or not the AcceleratorState has been initialized from Accelerator.
local_process_index (int): The index of the current process on the current server.
mixed_precision (str): Whether or not the current script will use mixed precision, and if so, the type of mixed precision being performed. (Choose from 'no', 'fp16', 'bf16', or 'fp8'.)
num_processes (int): The number of processes currently launched in parallel.
process_index (int): The index of the current process.
is_last_process (bool): Whether or not the current process is the last one.
is_main_process (bool): Whether or not the current process is the main one.
is_local_main_process (bool): Whether or not the current process is the main one on the local node.
debug (bool): Whether or not the current script is being run in debug mode.
local_main_process_first
( )
Lets the local main process go first inside a with block.
The other processes on the node will enter the with block after the local main process exits.
main_process_first
( )
Lets the main process go first inside a with block.
The other processes will enter the with block after the main process exits.
split_between_processes
( inputs: list | tuple | dict | torch.Tensor, apply_padding: bool = False )
Parameters
inputs (list, tuple, torch.Tensor, or dict of list/tuple/torch.Tensor): The input to split between processes.
apply_padding (bool, optional, defaults to False): Whether to apply padding by repeating the last element of the input so that all processes have the same number of elements. Useful when performing actions such as gather() on the outputs, or when passing in fewer inputs than there are processes. If padding is applied, remember to drop the padded elements afterwards.
Quickly splits the input between self.num_processes processes so that each process can then work on its own portion. Useful when doing distributed inference, such as with different prompts.
Note that when using a dict, every key must map to the same number of elements.
class accelerate.state.GradientState
( gradient_accumulation_plugin: Optional[GradientAccumulationPlugin] = None )
Singleton class that has information related to gradient synchronization for gradient accumulation.
Available attributes:
end_of_dataloader (bool): Whether we have reached the end of the current dataloader.
remainder (int): The number of extra samples that were added from padding the dataloader.
sync_gradients (bool): Whether the gradients should be synced across all devices.
active_dataloader (Optional[DataLoader]): The dataloader that is currently being iterated over.
dataloader_references (List[Optional[DataLoader]]): A list of references to the dataloaders that are being iterated over.
num_steps (int): The number of steps to accumulate over.
adjust_scheduler (bool): Whether the scheduler should be adjusted to account for the gradient accumulation.
sync_with_dataloader (bool): Whether the gradients should be synced at the end of the dataloader iteration and the number of total steps reset.
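How `num_steps`, `end_of_dataloader`, and `sync_with_dataloader` might combine to decide `sync_gradients` can be sketched as follows. This is an illustrative model under assumptions, not the library's code: `sync_schedule` is a hypothetical helper, and the exact rule (sync every `num_steps` batches, plus a forced sync and step reset on the final batch) is assumed from the attribute descriptions above.

```python
def sync_schedule(num_batches, num_steps, sync_with_dataloader=True):
    """Return one boolean per batch: True where gradients would be synced."""
    flags = []
    step = 0
    for batch_index in range(num_batches):
        end_of_dataloader = batch_index == num_batches - 1
        if sync_with_dataloader and end_of_dataloader:
            # Force a sync on the final batch and reset the step count.
            step = 0
            flags.append(True)
        else:
            step += 1
            flags.append(step % num_steps == 0)
    return flags


with_reset = sync_schedule(num_batches=5, num_steps=2)
without_reset = sync_schedule(num_batches=5, num_steps=2, sync_with_dataloader=False)
```

With five batches and `num_steps=2`, the final odd batch triggers a sync only when `sync_with_dataloader` is enabled; otherwise its gradients would remain accumulated and unsynced.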