# Utility functions and classes

## Helpful Utilities

Below are a variety of utility functions that 🌍 Accelerate provides, broken down by use-case.

### Constants

Constants used throughout 🌍 Accelerate, for reference.

The following are constants used when utilizing [Accelerator.save\_state()](https://huggingface.co/docs/accelerate/v0.24.0/en/package_reference/accelerator#accelerate.Accelerator.save_state):

* `utils.MODEL_NAME`: `"pytorch_model"`
* `utils.OPTIMIZER_NAME`: `"optimizer"`
* `utils.RNG_STATE_NAME`: `"random_states"`
* `utils.SCALER_NAME`: `"scaler.pt"`
* `utils.SCHEDULER_NAME`: `"scheduler"`

The following are constants used when utilizing [Accelerator.save\_model()](https://huggingface.co/docs/accelerate/v0.24.0/en/package_reference/accelerator#accelerate.Accelerator.save_model):

* `utils.WEIGHTS_NAME`: `"pytorch_model.bin"`
* `utils.SAFE_WEIGHTS_NAME`: `"model.safetensors"`
* `utils.WEIGHTS_INDEX_NAME`: `"pytorch_model.bin.index.json"`
* `utils.SAFE_WEIGHTS_INDEX_NAME`: `"model.safetensors.index.json"`

### Data Classes

These are basic dataclasses used throughout 🌍 Accelerate and they can be passed in as parameters.

#### class accelerate.DistributedType

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L227)

( value, names = None, module = None, qualname = None, type = None, start = 1 )

Represents a type of distributed environment.

Values:

* **NO** — Not a distributed environment, just a single process.
* **MULTI\_CPU** — Distributed on multiple CPU nodes.
* **MULTI\_GPU** — Distributed on multiple GPUs.
* **MULTI\_NPU** — Distributed on multiple NPUs.
* **MULTI\_XPU** — Distributed on multiple XPUs.
* **DEEPSPEED** — Using DeepSpeed.
* **TPU** — Distributed on TPUs.

#### class accelerate.utils.DynamoBackend

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L286)

( value, names = None, module = None, qualname = None, type = None, start = 1 )

Represents a dynamo backend (see <https://github.com/pytorch/torchdynamo>).

Values:

* **NO** — Do not use torch dynamo.
* **EAGER** — Uses PyTorch to run the extracted GraphModule. This is quite useful in debugging TorchDynamo issues.
* **AOT\_EAGER** — Uses AotAutograd with no compiler, i.e, just using PyTorch eager for the AotAutograd’s extracted forward and backward graphs. This is useful for debugging, and unlikely to give speedups.
* **INDUCTOR** — Uses TorchInductor backend with AotAutograd and cudagraphs by leveraging codegened Triton kernels. [Read more](https://dev-discuss.pytorch.org/t/torchinductor-a-pytorch-native-compiler-with-define-by-run-ir-and-symbolic-shapes/747)
* **AOT\_TS\_NVFUSER** — nvFuser with AotAutograd/TorchScript. [Read more](https://dev-discuss.pytorch.org/t/tracing-with-primitives-update-1-nvfuser-and-its-primitives/593)
* **NVPRIMS\_NVFUSER** — nvFuser with PrimTorch. [Read more](https://dev-discuss.pytorch.org/t/tracing-with-primitives-update-1-nvfuser-and-its-primitives/593)
* **CUDAGRAPHS** — cudagraphs with AotAutograd. [Read more](https://github.com/pytorch/torchdynamo/pull/757)
* **OFI** — Uses Torchscript optimize\_for\_inference. Inference only. [Read more](https://pytorch.org/docs/stable/generated/torch.jit.optimize_for_inference.html)
* **FX2TRT** — Uses Nvidia TensorRT for inference optimizations. Inference only. [Read more](https://github.com/pytorch/TensorRT/blob/master/docsrc/tutorials/getting_started_with_fx_path.rst)
* **ONNXRT** — Uses ONNXRT for inference on CPU/GPU. Inference only. [Read more](https://onnxruntime.ai/)
* **TENSORRT** — Uses ONNXRT to run TensorRT for inference optimizations. [Read more](https://github.com/onnx/onnx-tensorrt)
* **IPEX** — Uses IPEX for inference on CPU. Inference only. [Read more](https://github.com/intel/intel-extension-for-pytorch).
* **TVM** — Uses Apache TVM for inference optimizations. [Read more](https://tvm.apache.org/)

#### class accelerate.utils.LoggerType

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L334)

( value, names = None, module = None, qualname = None, type = None, start = 1 )

Represents a type of supported experiment tracker.

Values:

* **ALL** — all available trackers in the environment that are supported
* **TENSORBOARD** — TensorBoard as an experiment tracker
* **WANDB** — wandb as an experiment tracker
* **COMETML** — comet\_ml as an experiment tracker

#### class accelerate.utils.PrecisionType

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L353)

( value, names = None, module = None, qualname = None, type = None, start = 1 )

Represents a type of precision used on floating point values.

Values:

* **NO** — using full precision (FP32)
* **FP16** — using half precision
* **BF16** — using brain floating point precision

#### class accelerate.utils.ProjectConfiguration

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L396)

( project\_dir: str = None, logging\_dir: str = None, automatic\_checkpoint\_naming: bool = False, total\_limit: int = None, iteration: int = 0, save\_on\_each\_node: bool = False )

Configuration for the Accelerator object based on inner-project needs.

**set\_directories**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L433)

( project\_dir: str = None )

Sets `self.project_dir` and `self.logging_dir` to the appropriate values.

### Plugins

These are plugins that can be passed to the [Accelerator](https://huggingface.co/docs/accelerate/v0.24.0/en/package_reference/accelerator#accelerate.Accelerator) object. While they are defined elsewhere in the documentation, for convenience all of them are listed here:

#### class accelerate.DeepSpeedPlugin

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L501)

( ba\_ds\_config: typing.Any = None, gradient\_accumulation\_steps: int = None, gradient\_clipping: float = None, zero\_stage: int = None, is\_train\_batch\_min: str = True, offload\_optimizer\_device: bool = None, offload\_param\_device: bool = None, offload\_optimizer\_nvme\_path: str = None, offload\_param\_nvme\_path: str = None, zero3\_init\_flag: bool = None, zero3\_save\_16bit\_model: bool = None )

This plugin is used to integrate DeepSpeed.

**deepspeed\_config\_process**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L684)

( prefix = '', mismatches = None, config = None, must\_match = True, \*\*kwargs )

Process the DeepSpeed config with the values from the kwargs.

#### class accelerate.FullyShardedDataParallelPlugin

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L801)

( sharding\_strategy: typing.Any = None, backward\_prefetch: typing.Any = None, mixed\_precision\_policy: typing.Any = None, auto\_wrap\_policy: typing.Optional\[typing.Callable] = None, cpu\_offload: typing.Any = None, ignored\_modules: typing.Optional\[typing.Iterable\[torch.nn.modules.module.Module]] = None, state\_dict\_type: typing.Any = None, state\_dict\_config: typing.Any = None, optim\_state\_dict\_config: typing.Any = None, limit\_all\_gathers: bool = False, use\_orig\_params: bool = False, param\_init\_fn: typing.Optional\[typing.Callable\[\[torch.nn.modules.module.Module], NoneType]] = None, sync\_module\_states: bool = True, forward\_prefetch: bool = False, activation\_checkpointing: bool = False )

This plugin is used to enable fully sharded data parallelism.

**get\_module\_class\_from\_name**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L937)

( module, name )

Parameters

* **module** (`torch.nn.Module`) — The module to get the class from.
* **name** (`str`) — The name of the class.

Gets a class from a module by its name.

#### class accelerate.utils.GradientAccumulationPlugin

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L444)

( num\_steps: int = None, adjust\_scheduler: bool = True, sync\_with\_dataloader: bool = True )

A plugin to configure gradient accumulation behavior.

#### class accelerate.utils.MegatronLMPlugin

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L1018)

( tp\_degree: int = None, pp\_degree: int = None, num\_micro\_batches: int = None, gradient\_clipping: float = None, sequence\_parallelism: bool = None, recompute\_activation: bool = None, use\_distributed\_optimizer: bool = None, pipeline\_model\_parallel\_split\_rank: int = None, num\_layers\_per\_virtual\_pipeline\_stage: int = None, is\_train\_batch\_min: str = True, train\_iters: int = None, train\_samples: int = None, weight\_decay\_incr\_style: str = 'constant', start\_weight\_decay: float = None, end\_weight\_decay: float = None, lr\_decay\_style: str = 'linear', lr\_decay\_iters: int = None, lr\_decay\_samples: int = None, lr\_warmup\_iters: int = None, lr\_warmup\_samples: int = None, lr\_warmup\_fraction: float = None, min\_lr: float = 0, consumed\_samples: typing.List\[int] = None, no\_wd\_decay\_cond: typing.Optional\[typing.Callable] = None, scale\_lr\_cond: typing.Optional\[typing.Callable] = None, lr\_mult: float = 1.0, megatron\_dataset\_flag: bool = False, seq\_length: int = None, encoder\_seq\_length: int = None, decoder\_seq\_length: int = None, tensorboard\_dir: str = None, set\_all\_logging\_options: bool = False, eval\_iters: int = 100, eval\_interval: int = 1000, return\_logits: bool = False, custom\_train\_step\_class: typing.Optional\[typing.Any] = None, custom\_train\_step\_kwargs: typing.Union\[typing.Dict\[str, typing.Any], NoneType] = None, custom\_model\_provider\_function: typing.Optional\[typing.Callable] = None, custom\_prepare\_model\_function: typing.Optional\[typing.Callable] = None, other\_megatron\_args: typing.Union\[typing.Dict\[str, typing.Any], NoneType] = None )

Plugin for Megatron-LM to enable tensor, pipeline, sequence and data parallelism. It also enables selective activation recomputation and optimized fused kernels.

#### class accelerate.utils.TorchDynamoPlugin

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L465)

( backend: DynamoBackend = None, mode: str = None, fullgraph: bool = None, dynamic: bool = None, options: typing.Any = None, disable: bool = False )

This plugin is used to compile a model with PyTorch 2.0.

### Data Manipulation and Operations

These include data operations that mimic the same `torch` ops but can be used on distributed processes.

**accelerate.utils.broadcast**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/operations.py#L440)

( tensor, from\_process: int = 0 )

Parameters

* **tensor** (nested list/tuple/dictionary of `torch.Tensor`) — The data to broadcast.
* **from\_process** (`int`, *optional*, defaults to 0) — The process from which to send the data.

Recursively broadcast tensor in a nested list/tuple/dictionary of tensors to all devices.

**accelerate.utils.concatenate**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/operations.py#L503)

( data, dim = 0 )

Parameters

* **data** (nested list/tuple/dictionary of lists of tensors `torch.Tensor`) — The data to concatenate.
* **dim** (`int`, *optional*, defaults to 0) — The dimension on which to concatenate.

Recursively concatenate the tensors in a nested list/tuple/dictionary of lists of tensors with the same shape.
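To make the recursion concrete, here is a minimal pure-Python sketch of the idea. It is not the library's implementation: `concatenate_nested` is a hypothetical name, plain lists of numbers stand in for tensors, and only dictionaries and tuples are treated as containers.

```python
from typing import Any, List


def concatenate_nested(data: List[Any]) -> Any:
    """Concatenate a list of identically structured items.

    Plain Python lists stand in for tensors, so "concatenation along
    dim 0" is simply list concatenation in this sketch.
    """
    first = data[0]
    if isinstance(first, dict):
        # Recurse key by key, collecting the matching entry from every item.
        return {key: concatenate_nested([item[key] for item in data]) for key in first}
    if isinstance(first, tuple):
        # Recurse position by position and rebuild the tuple.
        return tuple(concatenate_nested([item[i] for item in data]) for i in range(len(first)))
    # Leaf case: each item is a flat list standing in for a tensor.
    merged = []
    for item in data:
        merged.extend(item)
    return merged


batches = [
    {"logits": [1, 2], "labels": ([0], [1])},
    {"logits": [3, 4], "labels": ([1], [0])},
]
result = concatenate_nested(batches)
print(result)  # {'logits': [1, 2, 3, 4], 'labels': ([0, 1], [1, 0])}
```

The output keeps the structure of the inputs; only the leaves are joined.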

**accelerate.utils.gather**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/operations.py#L378)

( tensor )

Parameters

* **tensor** (nested list/tuple/dictionary of `torch.Tensor`) — The data to gather.

Recursively gather tensor in a nested list/tuple/dictionary of tensors from all devices.

**accelerate.utils.pad\_across\_processes**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/operations.py#L525)

( tensor, dim = 0, pad\_index = 0, pad\_first = False )

Parameters

* **tensor** (nested list/tuple/dictionary of `torch.Tensor`) — The data to pad.
* **dim** (`int`, *optional*, defaults to 0) — The dimension on which to pad.
* **pad\_index** (`int`, *optional*, defaults to 0) — The value with which to pad.
* **pad\_first** (`bool`, *optional*, defaults to `False`) — Whether to pad at the beginning or the end.

Recursively pad the tensors in a nested list/tuple/dictionary of tensors from all devices to the same size so they can safely be gathered.
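The effect of `pad_index` and `pad_first` can be illustrated without any distributed setup. The sketch below simulates three processes with plain lists of different lengths; `pad_to_common_length` is a hypothetical helper written for this illustration, not part of Accelerate.

```python
from typing import List


def pad_to_common_length(
    per_process: List[List[int]], pad_index: int = 0, pad_first: bool = False
) -> List[List[int]]:
    """Pad each simulated process's data to the longest length found."""
    max_len = max(len(values) for values in per_process)
    padded = []
    for values in per_process:
        padding = [pad_index] * (max_len - len(values))
        # pad_first=True puts the padding before the data, otherwise after it.
        padded.append(padding + values if pad_first else values + padding)
    return padded


per_process = [[5, 6, 7], [8], [9, 10]]
padded_end = pad_to_common_length(per_process, pad_index=-1)
padded_start = pad_to_common_length(per_process, pad_index=-1, pad_first=True)
print(padded_end)    # [[5, 6, 7], [8, -1, -1], [9, 10, -1]]
print(padded_start)  # [[5, 6, 7], [-1, -1, 8], [-1, 9, 10]]
```

Once every process holds data of the same size, a gather across processes no longer fails on mismatched shapes.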

**accelerate.utils.reduce**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/operations.py#L572)

( tensor, reduction = 'mean', scale = 1.0 )

Parameters

* **tensor** (nested list/tuple/dictionary of `torch.Tensor`) — The data to reduce.
* **reduction** (`str`, *optional*, defaults to `"mean"`) — The reduction method. One of “mean”, “sum”, or “none”.
* **scale** (`float`, *optional*) — A default scaling value to be applied after the reduce, only valid on XLA.

Recursively reduce the tensors in a nested list/tuple/dictionary of lists of tensors across all processes using the given reduction.
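As a rough illustration of what "mean" and "sum" do across processes, the following sketch reduces element-wise over lists that stand in for the tensor held by each process. `reduce_across` is a hypothetical name, and the "none" option is deliberately left out of this sketch.

```python
from typing import List


def reduce_across(per_process: List[List[float]], reduction: str = "mean") -> List[float]:
    """Element-wise reduction over equally sized lists, one per simulated process."""
    if reduction not in ("mean", "sum"):
        raise ValueError(f"Unsupported reduction in this sketch: {reduction!r}")
    # zip(*...) groups the values that sit at the same position on every process.
    totals = [sum(column) for column in zip(*per_process)]
    if reduction == "mean":
        return [total / len(per_process) for total in totals]
    return totals


per_process_losses = [[1.0, 4.0], [3.0, 8.0]]
summed = reduce_across(per_process_losses, reduction="sum")
averaged = reduce_across(per_process_losses, reduction="mean")
print(summed)    # [4.0, 12.0]
print(averaged)  # [2.0, 6.0]
```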

**accelerate.utils.send\_to\_device**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/operations.py#L137)

( tensor, device, non\_blocking = False, skip\_keys = None )

Parameters

* **tensor** (nested list/tuple/dictionary of `torch.Tensor`) — The data to send to a given device.
* **device** (`torch.device`) — The device to send the data to.

Recursively sends the elements in a nested list/tuple/dictionary of tensors to a given device.

### Environment Checks

These functionalities check the state of the current working environment including information about the operating system itself, what it can support, and if particular dependencies are installed.

**accelerate.utils.is\_bf16\_available**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/imports.py#L113)

( ignore\_tpu = False )

Checks if bf16 is supported, optionally ignoring the TPU.

**accelerate.utils.is\_ipex\_available**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/imports.py#L230)

( )

**accelerate.utils.is\_mps\_available**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/imports.py#L226)

( )

**accelerate.utils.is\_npu\_available**

( check\_device = False )

Checks if `torch_npu` is installed and, optionally, if an NPU is in the environment.

**accelerate.utils.is\_torch\_version**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/versions.py#L46)

( operation: str, version: str )

Parameters

* **operation** (`str`) — A string representation of an operator, such as `">"` or `"<="`
* **version** (`str`) — A string version of PyTorch

Compares the current PyTorch version to a given reference with an operation.
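The pattern of passing a comparison as a string can be sketched with the standard `operator` module. The helper names below (`STR_TO_OPERATOR`, `parse_version`, `compare_version_strings`) are assumptions made for this illustration, and the naive parser only handles plain numeric versions such as `"2.1.0"`.

```python
import operator
from typing import Tuple

# Map the string form of each comparison to the matching function.
STR_TO_OPERATOR = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    ">": operator.gt,
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "2.1.0" into (2, 1, 0); pre-release or local suffixes are not handled."""
    return tuple(int(part) for part in version.split("."))


def compare_version_strings(current: str, operation: str, reference: str) -> bool:
    """Compare two version strings numerically using an operator given as a string."""
    if operation not in STR_TO_OPERATOR:
        raise ValueError(f"Unsupported operation: {operation!r}")
    return STR_TO_OPERATOR[operation](parse_version(current), parse_version(reference))


print(compare_version_strings("2.1.0", ">=", "2.0.0"))  # True
print(compare_version_strings("1.13.1", ">", "2.0.0"))  # False
print(compare_version_strings("1.13.1", ">", "1.9.0"))  # True
```

The last call shows why versions are compared as numbers rather than as raw strings: `"1.13.1" > "1.9.0"` is `False` under plain string comparison.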

**accelerate.utils.is\_tpu\_available**

( check\_device = True )

Checks if `torch_xla` is installed and, optionally, if a TPU is in the environment.

**accelerate.utils.is\_xpu\_available**

( check\_device = False )

Checks whether XPU acceleration is available, optionally checking that an XPU device is present in the environment.

### Environment Manipulation

**accelerate.utils.patch\_environment**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/other.py#L176)

( \*\*kwargs )

A context manager that will add each keyword argument passed to `os.environ` and remove them when exiting.

Will convert the values in `kwargs` to strings and upper-case all the keys.

Example:

```
>>> import os
>>> from accelerate.utils import patch_environment

>>> with patch_environment(FOO="bar"):
...     print(os.environ["FOO"])  # prints "bar"
>>> print(os.environ["FOO"])  # raises KeyError
```
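For readers curious how such a context manager can be built, here is a minimal sketch using `contextlib`. It follows the behavior described above (upper-case the keys, stringify the values, remove the variables on exit), but `patch_environment_sketch` is a hypothetical stand-in, and the `try`/`finally` cleanup is a choice made for this sketch rather than a statement about the library's code.

```python
import os
from contextlib import contextmanager


@contextmanager
def patch_environment_sketch(**kwargs):
    """Temporarily set upper-cased keys in os.environ, then remove them."""
    for key, value in kwargs.items():
        os.environ[key.upper()] = str(value)
    try:
        yield
    finally:
        # Remove the variables even if the body raised an exception.
        for key in kwargs:
            os.environ.pop(key.upper(), None)


with patch_environment_sketch(sketch_demo_flag=1):
    inside = os.environ["SKETCH_DEMO_FLAG"]

still_set = "SKETCH_DEMO_FLAG" in os.environ
print(inside, still_set)  # 1 False
```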

**accelerate.utils.clear\_environment**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/other.py#L143)

( )

A context manager that caches the original `os.environ` and replaces it with an empty dictionary for the duration of the context.

When the context exits, the cached `os.environ` is restored.

Example:

```
>>> import os
>>> from accelerate.utils import clear_environment

>>> os.environ["FOO"] = "bar"
>>> with clear_environment():
...     print(os.environ)
...     os.environ["FOO"] = "new_bar"
...     print(os.environ["FOO"])
{}
new_bar

>>> print(os.environ["FOO"])
bar
```

**accelerate.commands.config.default.write\_basic\_config**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/commands/config/default.py#L29)

( mixed\_precision = 'no', save\_location: str = '/github/home/.cache/boincai/accelerate/default\_config.yaml', use\_xpu: bool = False )

Parameters

* **mixed\_precision** (`str`, *optional*, defaults to “no”) — Mixed precision to use. Should be one of “no”, “fp16”, or “bf16”.
* **save\_location** (`str`, *optional*, defaults to `default_json_config_file`) — Optional custom save location. Should be passed to `--config_file` when using `accelerate launch`. The default location is inside the boincai cache folder (`~/.cache/boincai`), which can be overridden by setting the `BA_HOME` environment variable, followed by `accelerate/default_config.yaml`.
* **use\_xpu** (`bool`, *optional*, defaults to `False`) — Whether to use XPU if available.

Creates and saves a basic cluster config to be used on a local machine with potentially multiple GPUs. Will also set CPU if it is a CPU-only machine.

When setting up 🌍 Accelerate for the first time, `write_basic_config()` can be used as a quick alternative to running `accelerate config`.

### Memory

**accelerate.utils.get\_max\_memory**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/modeling.py#L629)

( max\_memory: typing.Union\[typing.Dict\[typing.Union\[int, str], typing.Union\[int, str]], NoneType] = None )

Get the maximum memory available if nothing is passed, converts string to int otherwise.

**accelerate.find\_executable\_batch\_size**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/memory.py#L83)

( function: callable = None, starting\_batch\_size: int = 128 )

Parameters

* **function** (`callable`, *optional*) — A function to wrap.
* **starting\_batch\_size** (`int`, *optional*) — The batch size to try and fit into memory.

A basic decorator that will try to execute `function`. If it fails from exceptions related to out-of-memory or cuDNN, the batch size is cut in half and passed to `function` again.

`function` must take in a `batch_size` parameter as its first argument.

Example:

```
>>> from accelerate.utils import find_executable_batch_size


>>> @find_executable_batch_size(starting_batch_size=128)
... def train(batch_size, model, optimizer):
...     ...


>>> train(model, optimizer)
```
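The retry-and-halve behavior can be sketched in plain Python. In the sketch below, `halve_on_oom` is a hypothetical decorator and `MemoryError` is a stand-in for the out-of-memory errors the real decorator reacts to; the point is only to show how `batch_size` gets injected as the first argument and halved after each failure.

```python
import functools


def halve_on_oom(function=None, starting_batch_size=128):
    """Retry `function` with a halved batch size whenever it raises MemoryError."""
    if function is None:
        # Support use as @halve_on_oom(starting_batch_size=...).
        return functools.partial(halve_on_oom, starting_batch_size=starting_batch_size)

    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        batch_size = starting_batch_size
        while batch_size > 0:
            try:
                # batch_size is injected as the first positional argument.
                return function(batch_size, *args, **kwargs)
            except MemoryError:
                batch_size //= 2
        raise RuntimeError("No executable batch size found, reached zero.")

    return wrapper


attempts = []


@halve_on_oom(starting_batch_size=128)
def train(batch_size, limit):
    attempts.append(batch_size)
    if batch_size > limit:
        # Simulate running out of memory for batch sizes above `limit`.
        raise MemoryError("simulated out-of-memory")
    return batch_size


chosen = train(40)  # the caller does not pass batch_size
print(chosen, attempts)  # 32 [128, 64, 32]
```

As in the example above, the decorated function is called without `batch_size`; the decorator supplies it.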

### Modeling

These utilities relate to interacting with PyTorch models.

**accelerate.utils.extract\_model\_from\_parallel**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/other.py#L55)

( model, keep\_fp32\_wrapper: bool = True ) → `torch.nn.Module`

Parameters

* **model** (`torch.nn.Module`) — The model to extract.
* **keep\_fp32\_wrapper** (`bool`, *optional*, defaults to `True`) — Whether to keep the mixed precision hooks on the model. If `False`, they are removed.

Returns

`torch.nn.Module`

The extracted model.

Extract a model from its distributed containers.

**accelerate.utils.get\_max\_layer\_size**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/modeling.py#L590)

( modules: typing.List\[typing.Tuple\[str, torch.nn.modules.module.Module]], module\_sizes: typing.Dict\[str, int], no\_split\_module\_classes: typing.List\[str] ) → `Tuple[int, List[str]]`

Parameters

* **modules** (`List[Tuple[str, torch.nn.Module]]`) — The list of named modules where we want to determine the maximum layer size.
* **module\_sizes** (`Dict[str, int]`) — A dictionary mapping each layer name to its size (as generated by `compute_module_sizes`).
* **no\_split\_module\_classes** (`List[str]`) — A list of class names for layers we don’t want to be split.

Returns

`Tuple[int, List[str]]`

The maximum size of a layer with the list of layer names realizing that maximum size.

Utility function that will scan a list of named modules and return the maximum size used by one full layer. A layer is defined as either:

* a module with no direct children (just parameters and buffers)
* a module whose class name is in the list `no_split_module_classes`

**accelerate.utils.offload\_state\_dict**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/offload.py#L86)

( save\_dir: typing.Union\[str, os.PathLike], state\_dict: typing.Dict\[str, torch.Tensor] )

Parameters

* **save\_dir** (`str` or `os.PathLike`) — The directory in which to offload the state dict.
* **state\_dict** (`Dict[str, torch.Tensor]`) — The dictionary of tensors to offload.

Offload a state dict in a given folder.

### Parallel

These include general utilities that should be used when working in parallel.

**accelerate.utils.extract\_model\_from\_parallel**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/other.py#L55)

( model, keep\_fp32\_wrapper: bool = True ) → `torch.nn.Module`

Parameters

* **model** (`torch.nn.Module`) — The model to extract.
* **keep\_fp32\_wrapper** (`bool`, *optional*, defaults to `True`) — Whether to keep the mixed precision hooks on the model. If `False`, they are removed.

Returns

`torch.nn.Module`

The extracted model.

Extract a model from its distributed containers.

**accelerate.utils.save**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/other.py#L120)

( obj, f, save\_on\_each\_node: bool = False, safe\_serialization: bool = False )

Parameters

* **save\_on\_each\_node** (`bool`, *optional*, defaults to `False`) — Whether to save on each node rather than only on the global main process.
* **safe\_serialization** (`bool`, *optional*, defaults to `False`) — Whether to save `obj` using `safetensors`.

Save the data to disk. Use in place of `torch.save()`.

**accelerate.utils.wait\_for\_everyone**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/other.py#L107)

( )

Introduces a blocking point in the script, making sure all processes have reached this point before continuing.

Make sure all processes will reach this instruction; otherwise, one of your processes will hang forever.
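The blocking behavior is that of a barrier. As a small analogy that runs on a single machine, the sketch below uses threads and `threading.Barrier` from the standard library in place of distributed processes; it is not how Accelerate synchronizes processes, only an illustration of the guarantee.

```python
import threading

NUM_WORKERS = 3
barrier = threading.Barrier(NUM_WORKERS)
events = []
lock = threading.Lock()


def worker(rank: int) -> None:
    with lock:
        events.append(("before", rank))
    # Blocks until all NUM_WORKERS threads have called wait().
    barrier.wait()
    with lock:
        events.append(("after", rank))


threads = [threading.Thread(target=worker, args=(rank,)) for rank in range(NUM_WORKERS)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Every "before" entry is recorded ahead of any "after" entry.
last_before = max(i for i, (phase, _) in enumerate(events) if phase == "before")
first_after = min(i for i, (phase, _) in enumerate(events) if phase == "after")
print(last_before < first_after)  # True
```

If one worker never reached `barrier.wait()`, the others would block indefinitely, which mirrors the hang described above.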

### Random

These utilities relate to setting and synchronizing of all the random states.

**accelerate.utils.set\_seed**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/random.py#L31)

( seed: int, device\_specific: bool = False )

Parameters

* **seed** (`int`) — The seed to set.
* **device\_specific** (`bool`, *optional*, defaults to `False`) — Whether to vary the seed slightly on each device by adding the process index.

Helper function for reproducible behavior to set the seed in `random`, `numpy`, `torch`.
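The two ideas here, reseeding for reproducibility and offsetting the seed per device, can be sketched with Python's `random` module alone. `set_seed_sketch` and its explicit `process_index` parameter are assumptions made for this illustration; the sketch does not touch `numpy` or `torch`.

```python
import random


def set_seed_sketch(seed: int, device_specific: bool = False, process_index: int = 0) -> int:
    """Seed Python's `random` module, optionally offsetting by the process index."""
    if device_specific:
        # Each process ends up with a slightly different seed.
        seed += process_index
    random.seed(seed)
    return seed


# Reseeding with the same value reproduces the same sequence.
set_seed_sketch(42)
first_run = [random.random() for _ in range(3)]
set_seed_sketch(42)
second_run = [random.random() for _ in range(3)]
print(first_run == second_run)  # True

# With device_specific=True, three simulated processes get seeds 42, 43 and 44.
effective_seeds = [set_seed_sketch(42, device_specific=True, process_index=i) for i in range(3)]
print(effective_seeds)  # [42, 43, 44]
```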

**accelerate.utils.synchronize\_rng\_state**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/random.py#L57)

( rng\_type: typing.Optional\[accelerate.utils.dataclasses.RNGType] = None, generator: typing.Optional\[torch.\_C.Generator] = None )

**accelerate.synchronize\_rng\_states**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/random.py#L109)

( rng\_types: typing.List\[typing.Union\[str, accelerate.utils.dataclasses.RNGType]], generator: typing.Optional\[torch.\_C.Generator] = None )

### PyTorch XLA

These include utilities that are useful while using PyTorch with XLA.

**accelerate.utils.install\_xla**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/torch_xla.py#L20)

( upgrade: bool = False )

Parameters

* **upgrade** (`bool`, *optional*, defaults to `False`) — Whether to upgrade `torch` and install the latest `torch_xla` wheels.

Helper function to install appropriate xla wheels based on the `torch` version in Google Colaboratory.

Example:

```
>>> from accelerate.utils import install_xla

>>> install_xla(upgrade=True)
```

### Loading model weights

These include utilities that are useful to load checkpoints.

**accelerate.load\_checkpoint\_in\_model**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/modeling.py#L1231)

( model: Module, checkpoint: typing.Union\[str, os.PathLike], device\_map: typing.Union\[typing.Dict\[str, typing.Union\[int, str, torch.device]], NoneType] = None, offload\_folder: typing.Union\[str, os.PathLike, NoneType] = None, dtype: typing.Union\[str, torch.dtype, NoneType] = None, offload\_state\_dict: bool = False, offload\_buffers: bool = False, keep\_in\_fp32\_modules: typing.List\[str] = None, offload\_8bit\_bnb: bool = False )

Parameters

* **model** (`torch.nn.Module`) — The model in which we want to load a checkpoint.
* **checkpoint** (`str` or `os.PathLike`) — The checkpoint to load. It can be:
  * a path to a file containing a whole model state dict
  * a path to a `.json` file containing the index to a sharded checkpoint
  * a path to a folder containing a unique `.index.json` file and the shards of a checkpoint.
  * a path to a folder containing a unique pytorch\_model.bin or a model.safetensors file.
* **device\_map** (`Dict[str, Union[int, str, torch.device]]`, *optional*) — A map that specifies where each submodule should go. It doesn’t need to be refined to each parameter/buffer name: once a given module name is in the map, every submodule of it will be sent to the same device.
* **offload\_folder** (`str` or `os.PathLike`, *optional*) — If the `device_map` contains any value `"disk"`, the folder where we will offload weights.
* **dtype** (`str` or `torch.dtype`, *optional*) — If provided, the weights will be converted to that type when loaded.
* **offload\_state\_dict** (`bool`, *optional*, defaults to `False`) — If `True`, will temporarily offload the CPU state dict on the hard drive to avoid getting out of CPU RAM if the weight of the CPU state dict + the biggest shard does not fit.
* **offload\_buffers** (`bool`, *optional*, defaults to `False`) — Whether or not to include the buffers in the weights offloaded to disk.
* **keep\_in\_fp32\_modules** (`List[str]`, *optional*) — A list of the modules that we keep in `torch.float32` dtype.
* **offload\_8bit\_bnb** (`bool`, *optional*) — Whether or not to enable offload of 8-bit modules on cpu/disk.

Loads a (potentially sharded) checkpoint inside a model, potentially sending weights to a given device as they are loaded.

Once loaded across devices, you still need to call [dispatch\_model()](https://huggingface.co/docs/accelerate/v0.24.0/en/package_reference/big_modeling#accelerate.dispatch_model) on your model to make it able to run. To group the checkpoint loading and dispatch in one single call, use [load\_checkpoint\_and\_dispatch()](https://huggingface.co/docs/accelerate/v0.24.0/en/package_reference/big_modeling#accelerate.load_checkpoint_and_dispatch).
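The `device_map` rule described in the parameter list, where an entry for a module also covers all of its submodules, can be illustrated with a small prefix lookup. This is a sketch under stated assumptions: `resolve_device` is a hypothetical helper, the module names are invented, and the longest-matching-prefix rule is how this sketch interprets the description above.

```python
from typing import Dict, Union

Device = Union[int, str]


def resolve_device(param_name: str, device_map: Dict[str, Device]) -> Device:
    """Return the device of the longest device_map key that is a module prefix of param_name."""
    best_key = None
    for key in device_map:
        # The empty key "" covers everything; otherwise match whole dotted components.
        matches = key == "" or param_name == key or param_name.startswith(key + ".")
        if matches and (best_key is None or len(key) > len(best_key)):
            best_key = key
    if best_key is None:
        raise ValueError(f"{param_name} is not covered by the device map")
    return device_map[best_key]


device_map = {"encoder": 0, "decoder": "cpu", "decoder.layers.3": "disk"}
print(resolve_device("encoder.layers.0.weight", device_map))  # 0
print(resolve_device("decoder.layers.3.bias", device_map))    # disk
print(resolve_device("decoder.layers.30.bias", device_map))   # cpu
```

Matching on whole dotted components keeps `decoder.layers.30` from being captured by the `decoder.layers.3` entry.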

### Quantization

These include utilities that are useful to quantize a model.

**accelerate.utils.load\_and\_quantize\_model**

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/bnb.py#L44)

( model: Module, bnb\_quantization\_config: BnbQuantizationConfig, weights\_location: typing.Union\[str, os.PathLike] = None, device\_map: typing.Union\[typing.Dict\[str, typing.Union\[int, str, torch.device]], NoneType] = None, no\_split\_module\_classes: typing.Optional\[typing.List\[str]] = None, max\_memory: typing.Union\[typing.Dict\[typing.Union\[int, str], typing.Union\[int, str]], NoneType] = None, offload\_folder: typing.Union\[str, os.PathLike, NoneType] = None, offload\_state\_dict: bool = False ) → `torch.nn.Module`

Parameters

* **model** (`torch.nn.Module`) — Input model. The model can be already loaded or on the meta device.
* **bnb\_quantization\_config** (`BnbQuantizationConfig`) — The bitsandbytes quantization parameters.
* **weights\_location** (`str` or `os.PathLike`) — The location of the weights to load. It can be:
  * a path to a file containing a whole model state dict
  * a path to a `.json` file containing the index to a sharded checkpoint
  * a path to a folder containing a unique `.index.json` file and the shards of a checkpoint.
  * a path to a folder containing a unique pytorch\_model.bin file.
* **device\_map** (`Dict[str, Union[int, str, torch.device]]`, *optional*) — A map that specifies where each submodule should go. It doesn’t need to be refined to each parameter/buffer name: once a given module name is in the map, every submodule of it will be sent to the same device.
* **no\_split\_module\_classes** (`List[str]`, *optional*) — A list of layer class names that should never be split across devices (for instance any layer that has a residual connection).
* **max\_memory** (`Dict`, *optional*) — A dictionary mapping device identifiers to maximum memory. Will default to the maximum memory available if unset.
* **offload\_folder** (`str` or `os.PathLike`, *optional*) — If the `device_map` contains any value `"disk"`, the folder where we will offload weights.
* **offload\_state\_dict** (`bool`, *optional*, defaults to `False`) — If `True`, will temporarily offload the CPU state dict on the hard drive to avoid getting out of CPU RAM if the weight of the CPU state dict + the biggest shard does not fit.

Returns

`torch.nn.Module`

The quantized model.

This function will quantize the input model with the associated config passed in `bnb_quantization_config`. If the model is on the meta device, we will load and dispatch the weights according to the `device_map` passed. If the model is already loaded, we will quantize the model and put the model on the GPU.

#### class accelerate.utils.BnbQuantizationConfig

[\<source>](https://github.com/huggingface/accelerate/blob/v0.24.0/src/accelerate/utils/dataclasses.py#L1393)

( load\_in\_8bit: bool = False, llm\_int8\_threshold: float = 6.0, load\_in\_4bit: bool = False, bnb\_4bit\_quant\_type: str = 'fp4', bnb\_4bit\_use\_double\_quant: bool = False, bnb\_4bit\_compute\_dtype: bool = 'fp16', torch\_dtype: dtype = None, skip\_modules: typing.List\[str] = None, keep\_in\_fp32\_modules: typing.List\[str] = None )

A plugin to enable BitsAndBytes 4bit and 8bit quantization.

