# Models

The base classes [PreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel), [TFPreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.TFPreTrainedModel), and [FlaxPreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.FlaxPreTrainedModel) implement the common methods for loading/saving a model either from a local file or directory, or from a pretrained model configuration provided by the library (downloaded from BOINC AI’s AWS S3 repository).

[PreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel) and [TFPreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.TFPreTrainedModel) also implement a few methods which are common among all the models to:

* resize the input token embeddings when new tokens are added to the vocabulary
* prune the attention heads of the model.

The other methods that are common to each model are defined in [ModuleUtilsMixin](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.modeling_utils.ModuleUtilsMixin) (for the PyTorch models) and `~modeling_tf_utils.TFModuleUtilsMixin` (for the TensorFlow models). For text generation, they are defined in [GenerationMixin](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/text_generation#transformers.GenerationMixin) (for the PyTorch models), [TFGenerationMixin](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/text_generation#transformers.TFGenerationMixin) (for the TensorFlow models) and [FlaxGenerationMixin](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/text_generation#transformers.FlaxGenerationMixin) (for the Flax/JAX models).

### PreTrainedModel

#### class transformers.PreTrainedModel

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1069)

( config: PretrainedConfig, \*inputs, \*\*kwargs )

Base class for all models.

[PreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel) takes care of storing the configuration of the models and handles methods for loading, downloading and saving models as well as a few methods common to all models to:

* resize the input embeddings,
* prune heads in the self-attention layers.

Class attributes (overridden by derived classes; see the sketch after this list):

* **config\_class** ([PretrainedConfig](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig)) — A subclass of [PretrainedConfig](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig) to use as configuration class for this model architecture.
* **load\_tf\_weights** (`Callable`) — A python *method* for loading a TensorFlow checkpoint in a PyTorch model, taking as arguments:
  * **model** ([PreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel)) — An instance of the model on which to load the TensorFlow checkpoint.
  * **config** (`PreTrainedConfig`) — An instance of the configuration associated to the model.
  * **path** (`str`) — A path to the TensorFlow checkpoint.
* **base\_model\_prefix** (`str`) — A string indicating the attribute associated to the base model in derived classes of the same architecture adding modules on top of the base model.
* **is\_parallelizable** (`bool`) — A flag indicating whether this model supports model parallelization.
* **main\_input\_name** (`str`) — The name of the principal input to the model (often `input_ids` for NLP models, `pixel_values` for vision models and `input_values` for speech models).
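
For illustration, here is a minimal sketch of how a derived class might override these attributes. The class and configuration names are hypothetical, and a real model would also implement `_init_weights` and a `forward` method:

```
import torch.nn as nn

from transformers import PretrainedConfig, PreTrainedModel


class MyConfig(PretrainedConfig):
    model_type = "my-model"

    def __init__(self, vocab_size=30522, hidden_size=768, **kwargs):
        super().__init__(**kwargs)
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size


class MyModel(PreTrainedModel):
    # Class attributes overridden by the derived class
    config_class = MyConfig
    base_model_prefix = "my_model"
    main_input_name = "input_ids"

    def __init__(self, config):
        super().__init__(config)
        self.embeddings = nn.Embedding(config.vocab_size, config.hidden_size)
        # Runs weight initialization and other final setup
        self.post_init()
```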

**push\_to\_hub**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/utils/hub.py#L786)

( repo\_id: str, use\_temp\_dir: typing.Optional\[bool] = None, commit\_message: typing.Optional\[str] = None, private: typing.Optional\[bool] = None, token: typing.Union\[bool, str, NoneType] = None, max\_shard\_size: typing.Union\[int, str, NoneType] = '10GB', create\_pr: bool = False, safe\_serialization: bool = False, revision: str = None, \*\*deprecated\_kwargs )

Parameters

* **repo\_id** (`str`) — The name of the repository you want to push your model to. It should contain your organization name when pushing to a given organization.
* **use\_temp\_dir** (`bool`, *optional*) — Whether or not to use a temporary directory to store the files saved before they are pushed to the Hub. Will default to `True` if there is no directory named like `repo_id`, `False` otherwise.
* **commit\_message** (`str`, *optional*) — Message to commit while pushing. Will default to `"Upload model"`.
* **private** (`bool`, *optional*) — Whether or not the repository created should be private.
* **token** (`bool` or `str`, *optional*) — The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `boincai-cli login` (stored in `~/.boincai`). Will default to `True` if `repo_url` is not specified.
* **max\_shard\_size** (`int` or `str`, *optional*, defaults to `"10GB"`) — Only applicable for models. The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like `"5MB"`).
* **create\_pr** (`bool`, *optional*, defaults to `False`) — Whether or not to create a PR with the uploaded files or directly commit.
* **safe\_serialization** (`bool`, *optional*, defaults to `False`) — Whether or not to convert the model weights to the safetensors format for safer serialization.
* **revision** (`str`, *optional*) — Branch to push the uploaded files to.

Upload the model file to the 🌍 Model Hub.

Examples:

```
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-cased")

# Push the model to your namespace with the name "my-finetuned-bert".
model.push_to_hub("my-finetuned-bert")

# Push the model to an organization with the name "my-finetuned-bert".
model.push_to_hub("boincai/my-finetuned-bert")
```

**can\_generate**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1232)

( ) → `bool`

Returns

`bool`

Whether this model can generate sequences with `.generate()`.

Returns whether this model can generate sequences with `.generate()`.

**disable\_input\_require\_grads**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1335)

( )

Removes the `_require_grads_hook`.

**enable\_input\_require\_grads**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1324)

( )

Enables the gradients for the input embeddings. This is useful for fine-tuning adapter weights while keeping the model weights fixed.
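
A minimal sketch of a typical use, assuming you freeze the pretrained weights and train only separately added adapter modules (not shown):

```
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Freeze the pretrained weights; only separately added adapter parameters would train.
for param in model.parameters():
    param.requires_grad = False

# Make the input embeddings produce gradients so they can flow back to trainable
# modules inserted after them (also useful when combined with gradient checkpointing).
model.enable_input_require_grads()

# ... attach adapter modules and train ...

# Remove the hook once it is no longer needed.
model.disable_input_require_grads()
```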

**from\_pretrained**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L2201)

( pretrained\_model\_name\_or\_path: typing.Union\[str, os.PathLike, NoneType], \*model\_args, config: typing.Union\[transformers.configuration\_utils.PretrainedConfig, str, os.PathLike, NoneType] = None, cache\_dir: typing.Union\[str, os.PathLike, NoneType] = None, ignore\_mismatched\_sizes: bool = False, force\_download: bool = False, local\_files\_only: bool = False, token: typing.Union\[bool, str, NoneType] = None, revision: str = 'main', use\_safetensors: bool = None, \*\*kwargs )

Parameters

* **pretrained\_model\_name\_or\_path** (`str` or `os.PathLike`, *optional*) — Can be either:
  * A string, the *model id* of a pretrained model hosted inside a model repo on boincai.com. Valid model ids can be located at the root-level, like `bert-base-uncased`, or namespaced under a user or organization name, like `dbmdz/bert-base-german-cased`.
  * A path to a *directory* containing model weights saved using [save\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.
  * A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
  * A path or url to a model folder containing a *flax checkpoint file* in *.msgpack* format (e.g., `./flax_model/` containing `flax_model.msgpack`). In this case, `from_flax` should be set to `True`.
  * `None` if you are both providing the configuration and state dictionary (resp. with keyword arguments `config` and `state_dict`).
* **model\_args** (sequence of positional arguments, *optional*) — All remaining positional arguments will be passed to the underlying model’s `__init__` method.
* **config** (`Union[PretrainedConfig, str, os.PathLike]`, *optional*) — Can be either:

  * an instance of a class derived from [PretrainedConfig](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig),
  * a string or path valid as input to [from\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig.from_pretrained).

  Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  * The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
  * The model was saved using [save\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory.
  * The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.
* **state\_dict** (`Dict[str, torch.Tensor]`, *optional*) — A state dictionary to use instead of a state dictionary loaded from saved weights file.

  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.
* **cache\_dir** (`Union[str, os.PathLike]`, *optional*) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
* **from\_tf** (`bool`, *optional*, defaults to `False`) — Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).
* **from\_flax** (`bool`, *optional*, defaults to `False`) — Load the model weights from a Flax checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).
* **ignore\_mismatched\_sizes** (`bool`, *optional*, defaults to `False`) — Whether or not to ignore mismatches between the sizes of some checkpoint weights and the corresponding model weights (for instance, when instantiating a model with 10 labels from a checkpoint with 3 labels). When `True`, the mismatched weights are left newly initialized instead of an error being raised.
* **force\_download** (`bool`, *optional*, defaults to `False`) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
* **resume\_download** (`bool`, *optional*, defaults to `False`) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
* **proxies** (`Dict[str, str]`, *optional*) — A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
* **output\_loading\_info** (`bool`, *optional*, defaults to `False`) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
* **local\_files\_only** (`bool`, *optional*, defaults to `False`) — Whether or not to only look at local files (i.e., do not try to download the model).
* **token** (`str` or `bool`, *optional*) — The token to use as HTTP bearer authorization for remote files. If `True`, or not specified, will use the token generated when running `boincai-cli login` (stored in `~/.boincai`).
* **revision** (`str`, *optional*, defaults to `"main"`) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on boincai.com, so `revision` can be any identifier allowed by git.

  To test a pull request you made on the Hub, you can pass `revision="refs/pr/<pr_number>"`.
* **mirror** (`str`, *optional*) — Mirror source to accelerate downloads in China. If you are from China and have an accessibility problem, you can set this option to resolve it. Note that we do not guarantee the timeliness or safety. Please refer to the mirror site for more information.
* **\_fast\_init** (`bool`, *optional*, defaults to `True`) — Whether or not to disable fast initialization.

  One should only disable *\_fast\_init* to ensure backwards compatibility with `transformers.__version__ < 4.6.0` for seeded model initialization. This argument will be removed at the next major version. See [pull request 11471](https://github.com/huggingface/transformers/pull/11471) for more information.

Parameters for big model inference

* **low\_cpu\_mem\_usage** (`bool`, *optional*) — Tries to not use more than 1x the model size in CPU memory (including peak memory) while loading the model. This is an experimental feature and may change at any moment.
* **torch\_dtype** (`str` or `torch.dtype`, *optional*) — Override the default `torch.dtype` and load the model under a specific `dtype`. The different options are:

  1. `torch.float16`, `torch.bfloat16` or `torch.float`: load in the specified `dtype`, ignoring the model’s `config.torch_dtype` if one exists. If not specified, the model is loaded in `torch.float` (fp32).
  2. `"auto"` - the `torch_dtype` entry in the model’s `config.json` file is used if present. If that entry isn’t found, the `dtype` of the first floating-point weight in the checkpoint is used instead. This loads the model in the `dtype` it was saved in at the end of training; it cannot be used as an indicator of how the model was trained, since a model may be trained in a half-precision dtype but saved in fp32.

  For some models the `dtype` they were trained in is unknown - you may try to check the model’s paper or reach out to the authors and ask them to add this information to the model’s card and to insert the `torch_dtype` entry in `config.json` on the hub.
* **device\_map** (`str` or `Dict[str, Union[int, str, torch.device]]` or `int` or `torch.device`, *optional*) — A map that specifies where each submodule should go. It doesn’t need to be refined down to each parameter/buffer name; once a given module name is in the map, every submodule of it will be sent to the same device. If you only pass the device (*e.g.*, `"cpu"`, `"cuda:1"`, `"mps"`, or a GPU ordinal rank like `1`) on which the model will be allocated, the device map will map the entire model to this device. Passing `device_map = 0` means put the whole model on GPU 0.

  To have Accelerate compute the most optimized `device_map` automatically, set `device_map="auto"`. For more information about each option see [designing a device map](https://hf.co/docs/accelerate/main/en/usage_guides/big_modeling#designing-a-device-map).
* **max\_memory** (`Dict`, *optional*) — A dictionary device identifier to maximum memory. Will default to the maximum memory available for each GPU and the available CPU RAM if unset.
* **offload\_folder** (`str` or `os.PathLike`, *optional*) — If the `device_map` contains any value `"disk"`, the folder where we will offload weights.
* **offload\_state\_dict** (`bool`, *optional*) — If `True`, will temporarily offload the CPU state dict to the hard drive to avoid running out of CPU RAM if the weight of the CPU state dict + the biggest shard of the checkpoint does not fit. Defaults to `True` when there is some disk offload.
* **load\_in\_8bit** (`bool`, *optional*, defaults to `False`) — If `True`, will convert the loaded model into mixed-8bit quantized model. To use this feature please install `bitsandbytes` (`pip install -U bitsandbytes`).
* **load\_in\_4bit** (`bool`, *optional*, defaults to `False`) — If `True`, will convert the loaded model into 4bit precision quantized model. To use this feature install the latest version of `bitsandbytes` (`pip install -U bitsandbytes`).
* **quantization\_config** (`Union[QuantizationConfigMixin,Dict]`, *optional*) — A dictionary of configuration parameters or a QuantizationConfigMixin object for quantization (e.g., bitsandbytes, gptq).
* **subfolder** (`str`, *optional*, defaults to `""`) — In case the relevant files are located inside a subfolder of the model repo on boincai.com, you can specify the folder name here.
* **variant** (`str`, *optional*) — If specified, load weights from the `variant` filename, *e.g.* pytorch\_model.\<variant>.bin. `variant` is ignored when using `from_tf` or `from_flax`.
* **use\_safetensors** (`bool`, *optional*, defaults to `None`) — Whether or not to use `safetensors` checkpoints. Defaults to `None`. If not specified and `safetensors` is not installed, it will be set to `False`.
* **kwargs** (remaining dictionary of keyword arguments, *optional*) — Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:
  * If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model’s `__init__` method (we assume all relevant updates to the configuration have already been done)
  * If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s `__init__` function.

Instantiate a pretrained PyTorch model from a pretrained model configuration.

The model is set in evaluation mode by default using `model.eval()` (Dropout modules are deactivated). To train the model, you should first set it back in training mode with `model.train()`.

The warning *Weights from XXX not initialized from pretrained model* means that the weights of XXX do not come pretrained with the rest of the model. It is up to you to train those weights with a downstream fine-tuning task.

The warning *Weights from XXX not used in YYY* means that the layer XXX is not used by YYY, therefore those weights are discarded.

Activate the special [“offline-mode”](https://huggingface.co/transformers/installation.html#offline-mode) to use this method in a firewalled environment.

Examples:

```
>>> from transformers import BertConfig, BertModel

>>> # Download model and configuration from boincai.com and cache.
>>> model = BertModel.from_pretrained("bert-base-uncased")
>>> # Model was saved using *save_pretrained('./test/saved_model/')* (for example purposes, not runnable).
>>> model = BertModel.from_pretrained("./test/saved_model/")
>>> # Update configuration during loading.
>>> model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
>>> assert model.config.output_attentions == True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower, for example purposes, not runnable).
>>> config = BertConfig.from_json_file("./tf_model/my_tf_model_config.json")
>>> model = BertModel.from_pretrained("./tf_model/my_tf_checkpoint.ckpt.index", from_tf=True, config=config)
>>> # Loading from a Flax checkpoint file instead of a PyTorch model (slower)
>>> model = BertModel.from_pretrained("bert-base-uncased", from_flax=True)
```

* `low_cpu_mem_usage` algorithm:

This is an experimental function that loads the model using ~1x the model size in CPU memory.

Here is how it works:

1. save which state\_dict keys are available
2. drop the state\_dict before the model is created, since the latter takes 1x the model size in CPU memory
3. after the model has been instantiated, switch to the meta device all params/buffers that are going to be replaced from the loaded state\_dict
4. load the state\_dict a second time
5. replace the params/buffers from the state\_dict

Currently, it can’t handle DeepSpeed ZeRO stage 3 and it ignores loading errors.
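
As a rough illustration of the meta-device idea (not the exact internal code path), Accelerate's `init_empty_weights` context manager creates the model as an empty shell, which is what `low_cpu_mem_usage=True` relies on before materializing the real weights:

```
from accelerate import init_empty_weights
from transformers import AutoConfig, AutoModel

config = AutoConfig.from_pretrained("bert-base-uncased")

# The model is created on the meta device: no memory is allocated for the weights yet.
with init_empty_weights():
    empty_model = AutoModel.from_config(config)

# from_pretrained(..., low_cpu_mem_usage=True) then loads the checkpoint and
# materializes the parameters in place, shard by shard.
model = AutoModel.from_pretrained("bert-base-uncased", low_cpu_mem_usage=True)
```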

**get\_input\_embeddings**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1341)

( ) → `nn.Module`

Returns

`nn.Module`

A torch module mapping vocabulary to hidden states.

Returns the model’s input embeddings.

**get\_memory\_footprint**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L2141)

( return\_buffers = True )

Parameters

* **return\_buffers** (`bool`, *optional*, defaults to `True`) — Whether to return the size of the buffer tensors in the computation of the memory footprint. Buffers are tensors that do not require gradients and are not registered as parameters, e.g. the mean and std in batch norm layers. Please see: <https://discuss.pytorch.org/t/what-pytorch-means-by-buffers/120266/2>

Get the memory footprint of a model. This will return the memory footprint of the current model in bytes. Useful to benchmark the memory footprint of the current model and design some tests. Solution inspired from the PyTorch discussions: <https://discuss.pytorch.org/t/gpu-memory-that-model-uses/56822/2>
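
For example, a quick way to print the footprint in megabytes:

```
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")

total = model.get_memory_footprint()                           # parameters + buffers, in bytes
params_only = model.get_memory_footprint(return_buffers=False) # parameters only
print(f"{total / 1024**2:.1f} MB (of which {params_only / 1024**2:.1f} MB are parameters)")
```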

**get\_output\_embeddings**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1367)

( ) → `nn.Module`

Returns

`nn.Module`

A torch module mapping hidden states to vocabulary.

Returns the model’s output embeddings.

**gradient\_checkpointing\_disable**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1837)

( )

Deactivates gradient checkpointing for the current model.

Note that in other frameworks this feature can be referred to as “activation checkpointing” or “checkpoint activations”.

**gradient\_checkpointing\_enable**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1819)

( )

Activates gradient checkpointing for the current model.

Note that in other frameworks this feature can be referred to as “activation checkpointing” or “checkpoint activations”.

**init\_weights**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1785)

( )

If needed prunes and maybe initializes weights. If using a custom `PreTrainedModel`, you need to implement any initialization logic in `_init_weights`.

**post\_init**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1151)

( )

A method executed at the end of each Transformer model initialization, to execute code that needs the model’s modules properly initialized (such as weight initialization).

**prune\_heads**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1802)

( heads\_to\_prune: typing.Dict\[int, typing.List\[int]] )

Parameters

* **heads\_to\_prune** (`Dict[int, List[int]]`) — Dictionary with keys being selected layer indices (`int`) and associated values being the list of heads to prune in said layer (list of `int`). For instance `{1: [0, 2], 2: [2, 3]}` will prune heads 0 and 2 on layer 1 and heads 2 and 3 on layer 2.

Prunes heads of the base model.
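
For example, to prune the heads from the dictionary described above:

```
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")

# Prune heads 0 and 2 on layer 1, and heads 2 and 3 on layer 2.
model.prune_heads({1: [0, 2], 2: [2, 3]})
```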

**register\_for\_auto\_class**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L3852)

( auto\_class = 'AutoModel' )

Parameters

* **auto\_class** (`str` or `type`, *optional*, defaults to `"AutoModel"`) — The auto class to register this new model with.

Register this class with a given auto class. This should only be used for custom models as the ones in the library are already mapped with an auto class.

This API is experimental and may have some slight breaking changes in the next releases.

**resize\_token\_embeddings**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1507)

( new\_num\_tokens: typing.Optional\[int] = None, pad\_to\_multiple\_of: typing.Optional\[int] = None ) → `torch.nn.Embedding`

Parameters

* **new\_num\_tokens** (`int`, *optional*) — The number of new tokens in the embedding matrix. Increasing the size will add newly initialized vectors at the end. Reducing the size will remove vectors from the end. If not provided or `None`, just returns a pointer to the input tokens `torch.nn.Embedding` module of the model without doing anything.
* **pad\_to\_multiple\_of** (`int`, *optional*) — If set, will pad the embedding matrix to a multiple of the provided value. If `new_num_tokens` is set to `None`, will just pad the embedding to a multiple of `pad_to_multiple_of`.

  This is especially useful to enable the use of Tensor Cores on NVIDIA hardware with compute capability `>= 7.5` (Volta), or on TPUs which benefit from having sequence lengths be a multiple of 128. For more details about this, or help on choosing the correct value for resizing, refer to this guide: <https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc>

Returns

`torch.nn.Embedding`

Pointer to the input tokens Embeddings Module of the model.

Resizes the input token embeddings matrix of the model if `new_num_tokens != config.vocab_size`.

Takes care of tying the input and output embeddings afterwards if the model class has a `tie_weights()` method.
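
A typical pattern is to grow the embedding matrix after adding tokens to the tokenizer:

```
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Add new tokens to the tokenizer, then resize the embedding matrix to match.
tokenizer.add_tokens(["new_token_1", "new_token_2"])
embeddings = model.resize_token_embeddings(len(tokenizer))
print(embeddings.num_embeddings)  # new vocabulary size
```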

**reverse\_bettertransformer**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L3906)

( ) → [PreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel)

Returns

[PreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel)

The model converted back to the original modeling.

Reverts the transformation from [to\_bettertransformer()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel.to_bettertransformer) so that the original modeling is used, for example in order to save the model.

**save\_pretrained**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1860)

( save\_directory: typing.Union\[str, os.PathLike], is\_main\_process: bool = True, state\_dict: typing.Optional\[dict] = None, save\_function: typing.Callable = torch.save, push\_to\_hub: bool = False, max\_shard\_size: typing.Union\[int, str] = '10GB', safe\_serialization: bool = False, variant: typing.Optional\[str] = None, token: typing.Union\[bool, str, NoneType] = None, save\_peft\_format: bool = True, \*\*kwargs )

Parameters

* **save\_directory** (`str` or `os.PathLike`) — Directory to which to save. Will be created if it doesn’t exist.
* **is\_main\_process** (`bool`, *optional*, defaults to `True`) — Whether the process calling this is the main process or not. Useful when in distributed training like TPUs and need to call this function on all processes. In this case, set `is_main_process=True` only on the main process to avoid race conditions.
* **state\_dict** (nested dictionary of `torch.Tensor`) — The state dictionary of the model to save. Will default to `self.state_dict()`, but can be used to only save parts of the model or if special precautions need to be taken when recovering the state dictionary of a model (like when using model parallelism).
* **save\_function** (`Callable`) — The function to use to save the state dictionary. Useful on distributed training like TPUs when one needs to replace `torch.save` by another method.
* **push\_to\_hub** (`bool`, *optional*, defaults to `False`) — Whether or not to push your model to the BOINC AI model hub after saving it. You can specify the repository you want to push to with `repo_id` (will default to the name of `save_directory` in your namespace).
* **max\_shard\_size** (`int` or `str`, *optional*, defaults to `"10GB"`) — The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like `"5MB"`).

  If a single weight of the model is bigger than `max_shard_size`, it will be in its own checkpoint shard which will be bigger than `max_shard_size`.
* **safe\_serialization** (`bool`, *optional*, defaults to `False`) — Whether to save the model using `safetensors` or the traditional PyTorch way (that uses `pickle`).
* **variant** (`str`, *optional*) — If specified, weights are saved in the format pytorch\_model.\<variant>.bin.
* **token** (`str` or `bool`, *optional*) — The token to use as HTTP bearer authorization for remote files. If `True`, or not specified, will use the token generated when running `boincai-cli login` (stored in `~/.boincai`).
* **save\_peft\_format** (`bool`, *optional*, defaults to `True`) — For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapter state dict need to be prepended with `base_model.model`. Advanced users can disable this behaviour by setting `save_peft_format` to `False`.
* **kwargs** (`Dict[str, Any]`, *optional*) — Additional keyword arguments passed along to the [push\_to\_hub()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/processors#transformers.ProcessorMixin.push_to_hub) method.

Save a model and its configuration file to a directory, so that it can be re-loaded using the [from\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) class method.
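
For example, saving to a local directory and reloading from it:

```
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")

# Save the weights and configuration to a local directory ...
model.save_pretrained("./my_model_directory/", safe_serialization=True)

# ... and reload them later with from_pretrained.
reloaded = AutoModel.from_pretrained("./my_model_directory/")
```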

**set\_input\_embeddings**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1354)

( value: Module )

Parameters

* **value** (`nn.Module`) — A module mapping vocabulary to hidden states.

Set model’s input embeddings.

**tie\_weights**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1391)

( )

Tie the weights between the input embeddings and the output embeddings.

If the `torchscript` flag is set in the configuration, TorchScript can’t handle parameter sharing, so the weights are cloned instead.

**to\_bettertransformer**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L3878)

( ) → [PreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel)

Returns

[PreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel)

The model converted to BetterTransformer.

Converts the model to use [PyTorch’s native attention implementation](https://pytorch.org/docs/stable/generated/torch.nn.MultiheadAttention.html), integrated into Transformers through the [Optimum library](https://huggingface.co/docs/optimum/bettertransformer/overview). Only a subset of all Transformers models are supported.

PyTorch’s attention fastpath speeds up inference through kernel fusions and the use of [nested tensors](https://pytorch.org/docs/stable/nested.html). Detailed benchmarks can be found in [this blog post](https://medium.com/pytorch/bettertransformer-out-of-the-box-performance-for-huggingface-transformers-3fbe27d50ab2).
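
A minimal usage sketch (this requires the `optimum` package to be installed and only works for supported architectures):

```
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Run inference with the PyTorch attention fastpath.
model = model.to_bettertransformer()

# Convert back to the original modeling code before saving.
model = model.reverse_bettertransformer()
model.save_pretrained("./saved_model/")
```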

**warn\_if\_padding\_and\_no\_attention\_mask**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L3928)

( input\_idsattention\_mask )

Shows a one-time warning if the input\_ids appear to contain padding and no attention mask was given.

#### Large model loading

In Transformers 4.20.0, the [from\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) method has been reworked to accommodate large models using [Accelerate](https://huggingface.co/docs/accelerate/big_modeling). This requires Accelerate >= 0.9.0 and PyTorch >= 1.9.0. Instead of creating the full model, then loading the pretrained weights inside it (which takes twice the size of the model in RAM, one for the randomly initialized model, one for the weights), there is an option to create the model as an empty shell, then only materialize its parameters when the pretrained weights are loaded.

This option can be activated with `low_cpu_mem_usage=True`. The model is first created on the Meta device (with empty weights) and the state dict is then loaded inside it (shard by shard in the case of a sharded checkpoint). This way the maximum RAM used is the full size of the model only.

```
from transformers import AutoModelForSeq2SeqLM

t0pp = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0pp", low_cpu_mem_usage=True)
```

Moreover, you can directly place the model on different devices if it doesn’t fully fit in RAM (only works for inference for now). With `device_map="auto"`, Accelerate will determine where to put each layer to maximize the use of your fastest devices (GPUs) and offload the rest to the CPU, or even to the hard drive if you don’t have enough GPU RAM (or CPU RAM). Even if the model is split across several devices, it will run as you would normally expect.

When passing a `device_map`, `low_cpu_mem_usage` is automatically set to `True`, so you don’t need to specify it:

```
from transformers import AutoModelForSeq2SeqLM

t0pp = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0pp", device_map="auto")
```

You can inspect how the model was split across devices by looking at its `hf_device_map` attribute:

```
t0pp.hf_device_map
```

```
{'shared': 0,
 'decoder.embed_tokens': 0,
 'encoder': 0,
 'decoder.block.0': 0,
 'decoder.block.1': 1,
 'decoder.block.2': 1,
 'decoder.block.3': 1,
 'decoder.block.4': 1,
 'decoder.block.5': 1,
 'decoder.block.6': 1,
 'decoder.block.7': 1,
 'decoder.block.8': 1,
 'decoder.block.9': 1,
 'decoder.block.10': 1,
 'decoder.block.11': 1,
 'decoder.block.12': 1,
 'decoder.block.13': 1,
 'decoder.block.14': 1,
 'decoder.block.15': 1,
 'decoder.block.16': 1,
 'decoder.block.17': 1,
 'decoder.block.18': 1,
 'decoder.block.19': 1,
 'decoder.block.20': 1,
 'decoder.block.21': 1,
 'decoder.block.22': 'cpu',
 'decoder.block.23': 'cpu',
 'decoder.final_layer_norm': 'cpu',
 'decoder.dropout': 'cpu',
 'lm_head': 'cpu'}
```

You can also write your own device map following the same format (a dictionary mapping layer names to devices). It should map all parameters of the model to a given device, but you don’t have to detail where all the submodules of one layer go if that layer is entirely on the same device. For instance, the following device map would work properly for T0pp (as long as you have the GPU memory):

```
device_map = {"shared": 0, "encoder": 0, "decoder": 1, "lm_head": 1}
```
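
The custom map is then passed to `from_pretrained()` exactly like the `"auto"` value above:

```
from transformers import AutoModelForSeq2SeqLM

device_map = {"shared": 0, "encoder": 0, "decoder": 1, "lm_head": 1}
t0pp = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0pp", device_map=device_map)
```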

Another way to minimize the memory impact of your model is to instantiate it at a lower precision dtype (like `torch.float16`) or use direct quantization techniques as described below.

#### Model Instantiation dtype

Under PyTorch, a model is normally instantiated in `torch.float32`. This can be an issue if one tries to load a model whose weights are in fp16, since it’d require twice as much memory. To overcome this limitation, you can either explicitly pass the desired `dtype` using the `torch_dtype` argument:

```
import torch

from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5", torch_dtype=torch.float16)
```

or, if you want the model to always load in the most optimal memory pattern, you can use the special value `"auto"`, and then `dtype` will be automatically derived from the model’s weights:

```
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5", torch_dtype="auto")
```

Models instantiated from scratch can also be told which `dtype` to use with:

```
import torch

from transformers import AutoModel, T5Config

config = T5Config.from_pretrained("t5")
# Pass torch_dtype explicitly so the freshly initialized weights use the requested dtype.
model = AutoModel.from_config(config, torch_dtype=torch.float16)
```

Due to PyTorch’s design, this functionality is only available for floating dtypes.

### ModuleUtilsMixin

#### class transformers.modeling\_utils.ModuleUtilsMixin

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L765)

( )

A few utilities for `torch.nn.Modules`, to be used as a mixin.

**add\_memory\_hooks**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L796)

( )

Add a memory hook before and after each sub-module forward pass to record increase in memory consumption.

Increase in memory consumption is stored in a `mem_rss_diff` attribute for each module and can be reset to zero with `model.reset_memory_hooks_state()`.
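
A short sketch of how the hooks can be used (this relies on the `psutil` package being installed; the exact submodule path depends on the model architecture):

```
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

model.add_memory_hooks()
inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    model(**inputs)

# Per-module increase in resident memory (in bytes) recorded during the forward pass.
print(model.encoder.layer[0].mem_rss_diff)

model.reset_memory_hooks_state()
```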

**estimate\_tokens**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1021)

( input\_dict: typing.Dict\[str, typing.Union\[torch.Tensor, typing.Any]] ) → `int`

Parameters

* **input\_dict** (`dict`) — The model inputs.

Returns

`int`

The total number of tokens.

Helper function to estimate the total number of tokens from the model inputs.

**floating\_point\_ops**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L1042)

( input\_dict: typing.Dict\[str, typing.Union\[torch.Tensor, typing.Any]]exclude\_embeddings: bool = True ) → `int`

Parameters

* **input\_dict** (`Dict[str, Union[torch.Tensor, Any]]`) — The model inputs, used to infer the batch size and sequence length.
* **exclude\_embeddings** (`bool`, *optional*, defaults to `True`) — Whether or not to exclude embedding and softmax operations from the count.

Returns

`int`

The number of floating-point operations.

Get number of (optionally, non-embeddings) floating-point operations for the forward and backward passes of a batch with this transformer model. Default approximation neglects the quadratic dependency on the number of tokens (valid if `12 * d_model << sequence_length`) as laid out in [this paper](https://arxiv.org/pdf/2001.08361.pdf) section 2.1. Should be overridden for transformers with parameter re-use e.g. Albert or Universal Transformers, or if doing long-range modeling with very high sequence lengths.

**get\_extended\_attention\_mask**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L884)

( attention\_mask: Tensor, input\_shape: typing.Tuple\[int], device: device = None, dtype: torch.float32 = None )

Parameters

* **attention\_mask** (`torch.Tensor`) — Mask with ones indicating tokens to attend to, zeros for tokens to ignore.
* **input\_shape** (`Tuple[int]`) — The shape of the input to the model.

Makes broadcastable attention and causal masks so that future and masked tokens are ignored.

**get\_head\_mask**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L936)

( head\_mask: typing.Optional\[torch.Tensor], num\_hidden\_layers: int, is\_attention\_chunked: bool = False )

Parameters

* **head\_mask** (`torch.Tensor` with shape `[num_heads]` or `[num_hidden_layers x num_heads]`, *optional*) — The mask indicating if we should keep the heads or not (1.0 for keep, 0.0 for discard).
* **num\_hidden\_layers** (`int`) — The number of hidden layers in the model.
* **is\_attention\_chunked** (`bool`, *optional*, defaults to `False`) — Whether or not the attentions scores are computed by chunks or not.

Prepare the head mask if needed.

**invert\_attention\_mask**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L832)

( encoder\_attention\_mask: Tensor ) → `torch.Tensor`

Parameters

* **encoder\_attention\_mask** (`torch.Tensor`) — An attention mask.

Returns

`torch.Tensor`

The inverted attention mask.

Invert an attention mask (e.g., switches 0. and 1.).

**num\_parameters**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L974)

( only\_trainable: bool = False, exclude\_embeddings: bool = False ) → `int`

Parameters

* **only\_trainable** (`bool`, *optional*, defaults to `False`) — Whether or not to return only the number of trainable parameters.
* **exclude\_embeddings** (`bool`, *optional*, defaults to `False`) — Whether or not to return only the number of non-embeddings parameters.

Returns

`int`

The number of parameters.

Get number of (optionally, trainable or non-embeddings) parameters in the module.
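
For example:

```
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")

print(model.num_parameters())                         # all parameters
print(model.num_parameters(only_trainable=True))      # trainable parameters only
print(model.num_parameters(exclude_embeddings=True))  # without the embedding parameters
```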

**reset\_memory\_hooks\_state**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L808)

( )

Reset the `mem_rss_diff` attribute of each module (see [add\_memory\_hooks()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.modeling_utils.ModuleUtilsMixin.add_memory_hooks)).

### TFPreTrainedModel

#### class transformers.TFPreTrainedModel

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1057)

( \*args, \*\*kwargs )

Base class for all TF models.

[TFPreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.TFPreTrainedModel) takes care of storing the configuration of the models and handles methods for loading, downloading and saving models as well as a few methods common to all models to:

* resize the input embeddings,
* prune heads in the self-attention layers.

Class attributes (overridden by derived classes):

* **config\_class** ([PretrainedConfig](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig)) — A subclass of [PretrainedConfig](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig) to use as configuration class for this model architecture.
* **base\_model\_prefix** (`str`) — A string indicating the attribute associated to the base model in derived classes of the same architecture adding modules on top of the base model.
* **main\_input\_name** (`str`) — The name of the principal input to the model (often `input_ids` for NLP models, `pixel_values` for vision models and `input_values` for speech models).

**push\_to\_hub**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L3050)

( repo\_id: str, use\_temp\_dir: Optional\[bool] = None, commit\_message: Optional\[str] = None, private: Optional\[bool] = None, max\_shard\_size: Optional\[Union\[int, str]] = '10GB', token: Optional\[Union\[bool, str]] = None, use\_auth\_token: Optional\[Union\[bool, str]] = None, create\_pr: bool = False, \*\*base\_model\_card\_args )

Parameters

* **repo\_id** (`str`) — The name of the repository you want to push your model to. It should contain your organization name when pushing to a given organization.
* **use\_temp\_dir** (`bool`, *optional*) — Whether or not to use a temporary directory to store the files saved before they are pushed to the Hub. Will default to `True` if there is no directory named like `repo_id`, `False` otherwise.
* **commit\_message** (`str`, *optional*) — Message to commit while pushing. Will default to `"Upload model"`.
* **private** (`bool`, *optional*) — Whether or not the repository created should be private.
* **token** (`bool` or `str`, *optional*) — The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `boincai-cli login` (stored in `~/.boincai`). Will default to `True` if `repo_url` is not specified.
* **max\_shard\_size** (`int` or `str`, *optional*, defaults to `"10GB"`) — Only applicable for models. The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like `"5MB"`).
* **create\_pr** (`bool`, *optional*, defaults to `False`) — Whether or not to create a PR with the uploaded files or directly commit.

Upload the model files to the BOINC AI Model Hub while synchronizing a local clone of the repo in `repo_path_or_name`.

Examples:

```
from transformers import TFAutoModel

model = TFAutoModel.from_pretrained("bert-base-cased")

# Push the model to your namespace with the name "my-finetuned-bert".
model.push_to_hub("my-finetuned-bert")

# Push the model to an organization with the name "my-finetuned-bert".
model.push_to_hub("boincai/my-finetuned-bert")
```

**can\_generate**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1302)

( ) → `bool`

Returns

`bool`

Whether this model can generate sequences with `.generate()`.

Returns whether this model can generate sequences with `.generate()`.

**compile**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1497)

( optimizer = 'rmsprop', loss = 'auto\_with\_warning', metrics = None, loss\_weights = None, weighted\_metrics = None, run\_eagerly = None, steps\_per\_execution = None, \*\*kwargs )

This is a thin wrapper that sets the model’s loss output head as the loss if the user does not specify a loss function themselves.

**create\_model\_card**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1792)

( output\_dir, model\_name: str, language: Optional\[str] = None, license: Optional\[str] = None, tags: Optional\[str] = None, finetuned\_from: Optional\[str] = None, tasks: Optional\[str] = None, dataset\_tags: Optional\[Union\[str, List\[str]]] = None, dataset: Optional\[Union\[str, List\[str]]] = None, dataset\_args: Optional\[Union\[str, List\[str]]] = None )

Parameters

* **output\_dir** (`str` or `os.PathLike`) — The folder in which to create the model card.
* **model\_name** (`str`, *optional*) — The name of the model.
* **language** (`str`, *optional*) — The language of the model (if applicable)
* **license** (`str`, *optional*) — The license of the model. Will default to the license of the pretrained model used, if the original model given to the `Trainer` comes from a repo on the Hub.
* **tags** (`str` or `List[str]`, *optional*) — Some tags to be included in the metadata of the model card.
* **finetuned\_from** (`str`, *optional*) — The name of the model used to fine-tune this one (if applicable). Will default to the name of the repo of the original model given to the `Trainer` (if it comes from the Hub).
* **tasks** (`str` or `List[str]`, *optional*) — One or several task identifiers, to be included in the metadata of the model card.
* **dataset\_tags** (`str` or `List[str]`, *optional*) — One or several dataset tags, to be included in the metadata of the model card.
* **dataset** (`str` or `List[str]`, *optional*) — One or several dataset identifiers, to be included in the metadata of the model card.
* **dataset\_args** (`str` or `List[str]`, *optional*) — One or several dataset arguments, to be included in the metadata of the model card.

Creates a draft of a model card using the information available to the `Trainer`.

**eager\_serving**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1214)

( inputs )

Parameters

* **inputs** (`Dict[str, tf.Tensor]`) — The input of the saved model as a dictionary of tensors.

Method used for serving the model. This method is deprecated, and will be removed.

**from\_pretrained**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L2499)

( pretrained\_model\_name\_or\_path: Optional\[Union\[str, os.PathLike]], \*model\_args, config: Optional\[Union\[PretrainedConfig, str, os.PathLike]] = None, cache\_dir: Optional\[Union\[str, os.PathLike]] = None, ignore\_mismatched\_sizes: bool = False, force\_download: bool = False, local\_files\_only: bool = False, token: Optional\[Union\[str, bool]] = None, revision: str = 'main', \*\*kwargs )

Parameters

* **pretrained\_model\_name\_or\_path** (`str`, *optional*) — Can be either:
  * A string, the *model id* of a pretrained model hosted inside a model repo on boincai.com. Valid model ids can be located at the root-level, like `bert-base-uncased`, or namespaced under a user or organization name, like `dbmdz/bert-base-german-cased`.
  * A path to a *directory* containing model weights saved using [save\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.TFPreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.
  * A path or url to a *PyTorch state\_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
  * `None` if you are both providing the configuration and state dictionary (resp. with keyword arguments `config` and `state_dict`).
* **model\_args** (sequence of positional arguments, *optional*) — All remaining positional arguments will be passed to the underlying model’s `__init__` method.
* **config** (`Union[PretrainedConfig, str]`, *optional*) — Can be either:

  * an instance of a class derived from [PretrainedConfig](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig),
  * a string valid as input to [from\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig.from_pretrained).

  Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  * The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
  * The model was saved using [save\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.TFPreTrainedModel.save_pretrained) and is reloaded by supplying the save directory.
  * The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.
* **from\_pt** (`bool`, *optional*, defaults to `False`) — Load the model weights from a PyTorch state\_dict save file (see docstring of `pretrained_model_name_or_path` argument).
* **ignore\_mismatched\_sizes** (`bool`, *optional*, defaults to `False`) — Whether or not to ignore mismatches between the sizes of some checkpoint weights and the corresponding model weights (for instance, when instantiating a model with 10 labels from a checkpoint with 3 labels). When `True`, the mismatched weights are left newly initialized instead of an error being raised.
* **cache\_dir** (`str`, *optional*) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
* **force\_download** (`bool`, *optional*, defaults to `False`) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
* **resume\_download** (`bool`, *optional*, defaults to `False`) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
* **proxies** (`Dict[str, str]`, *optional*) — A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
* **output\_loading\_info** (`bool`, *optional*, defaults to `False`) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
* **local\_files\_only** (`bool`, *optional*, defaults to `False`) — Whether or not to only look at local files (e.g., not try downloading the model).
* **token** (`str` or `bool`, *optional*) — The token to use as HTTP bearer authorization for remote files. If `True`, or not specified, will use the token generated when running `boincai-cli login` (stored in `~/.boincai`).
* **revision** (`str`, *optional*, defaults to `"main"`) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on boincai.com, so `revision` can be any identifier allowed by git.

Instantiate a pretrained TF 2.0 model from a pre-trained model configuration.

The warning *Weights from XXX not initialized from pretrained model* means that the weights of XXX do not come pretrained with the rest of the model. It is up to you to train those weights with a downstream fine-tuning task.

The warning *Weights from XXX not used in YYY* means that the layer XXX is not used by YYY, therefore those weights are discarded.

Examples:

```
>>> from transformers import BertConfig, TFBertModel

>>> # Download model and configuration from boincai.com and cache.
>>> model = TFBertModel.from_pretrained("bert-base-uncased")
>>> # Model was saved using *save_pretrained('./test/saved_model/')* (for example purposes, not runnable).
>>> model = TFBertModel.from_pretrained("./test/saved_model/")
>>> # Update configuration during loading.
>>> model = TFBertModel.from_pretrained("bert-base-uncased", output_attentions=True)
>>> assert model.config.output_attentions == True
>>> # Loading from a Pytorch model file instead of a TensorFlow checkpoint (slower, for example purposes, not runnable).
>>> config = BertConfig.from_json_file("./pt_model/my_pt_model_config.json")
>>> model = TFBertModel.from_pretrained("./pt_model/my_pytorch_model.bin", from_pt=True, config=config)
```

**get\_bias**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1932)

( ) → `tf.Variable`

Returns

`tf.Variable`

The weights representing the bias, None if not an LM model.

Dict of bias attached to an LM head. The key represents the name of the bias attribute.

**get\_head\_mask**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1169)

( head\_mask: tf.Tensor | None, num\_hidden\_layers: int )

Parameters

* **head\_mask** (`tf.Tensor` with shape `[num_heads]` or `[num_hidden_layers x num_heads]`, *optional*) — The mask indicating if we should keep the heads or not (1.0 for keep, 0.0 for discard).
* **num\_hidden\_layers** (`int`) — The number of hidden layers in the model.

Prepare the head mask if needed.

**get\_input\_embeddings**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1316)

( ) → `tf.Variable`

Returns

`tf.Variable`

The embeddings layer mapping vocabulary to hidden states.

Returns the model’s input embeddings layer.

**get\_lm\_head**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1965)

( ) → `tf.keras.layers.Layer`

Returns

`tf.keras.layers.Layer`

The LM head layer if the model has one, None if not.

The LM Head layer. This method must be overwritten by all the models that have an LM head.

**get\_output\_embeddings**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1872)

( ) → `tf.Variable`

Returns

`tf.Variable`

The new weights mapping vocabulary to hidden states.

Returns the model’s output embeddings.

**get\_output\_layer\_with\_bias**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1909)

( ) → `tf.keras.layers.Layer`

Returns

`tf.keras.layers.Layer`

The layer that handles the bias, None if not an LM model.

Get the layer that handles a bias attribute in case the model has an LM head with weights tied to the embeddings.

**get\_prefix\_bias\_name**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1922)

( ) → `str`

Returns

`str`

The \_prefix name of the bias.

Get the concatenated \_prefix name of the bias from the model name to the parent layer.

**load\_repo\_checkpoint**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1343)

( repo\_path\_or\_name ) → `dict`

Parameters

* **repo\_path\_or\_name** (`str`) — Can either be a repository name for your model in the Hub or a path to a local folder (in which case the repository will have the name of that local folder).

Returns

`dict`

A dictionary of extra metadata from the checkpoint, most commonly an “epoch” count.

Loads a saved checkpoint (model weights and optimizer state) from a repo. Returns the current epoch count when the checkpoint was made.
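
A minimal sketch of resuming from a repo checkpoint, assuming a hypothetical repository `my-username/my-finetuned-bert` that contains a checkpoint previously written by `PushToHubCallback`; the model must be compiled first so the optimizer state can be restored:

```
>>> from transformers import TFBertForSequenceClassification

>>> model = TFBertForSequenceClassification.from_pretrained("my-username/my-finetuned-bert")
>>> model.compile(optimizer="adam")
>>> # Restores the model weights and optimizer state from the checkpoint in the repo.
>>> metadata = model.load_repo_checkpoint("my-username/my-finetuned-bert")
>>> print(metadata["epoch"])
```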

**prepare\_tf\_dataset**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1392)

( dataset: 'datasets.Dataset', batch\_size: int = 8, shuffle: bool = True, tokenizer: Optional\['PreTrainedTokenizerBase'] = None, collate\_fn: Optional\[Callable] = None, collate\_fn\_args: Optional\[Dict\[str, Any]] = None, drop\_remainder: Optional\[bool] = None, prefetch: bool = True ) → `Dataset`

Parameters

* **dataset** (`datasets.Dataset`) — A `datasets.Dataset` to be wrapped as a `tf.data.Dataset`.
* **batch\_size** (`int`, defaults to 8) — The size of batches to return.
* **shuffle** (`bool`, defaults to `True`) — Whether to return samples from the dataset in random order. Usually `True` for training datasets and `False` for validation/test datasets.
* **tokenizer** ([PreTrainedTokenizerBase](https://huggingface.co/docs/transformers/v4.34.1/en/internal/tokenization_utils#transformers.PreTrainedTokenizerBase), *optional*) — A `PreTrainedTokenizer` that will be used to pad samples to create batches. Has no effect if a specific `collate_fn` is passed instead.
* **collate\_fn** (`Callable`, *optional*) — A function that collates samples from the dataset into a single batch. Defaults to `DefaultDataCollator` if no `tokenizer` is supplied or `DataCollatorWithPadding` if a `tokenizer` is passed.
* **collate\_fn\_args** (`Dict[str, Any]`, *optional*) — A dict of arguments to pass to the `collate_fn` alongside the list of samples.
* **drop\_remainder** (`bool`, *optional*) — Whether to drop the final batch, if the batch\_size does not evenly divide the dataset length. Defaults to the same setting as `shuffle`.
* **prefetch** (`bool`, defaults to `True`) — Whether to add prefetching to the end of the `tf.data` pipeline. This is almost always beneficial for performance, but can be disabled in edge cases.

Returns

`Dataset`

A `tf.data.Dataset` which is ready to pass to the Keras API.

Wraps a BOINC AI [Dataset](https://huggingface.co/docs/datasets/v2.14.5/en/package_reference/main_classes#datasets.Dataset) as a `tf.data.Dataset` with collation and batching. This method is designed to create a “ready-to-use” dataset that can be passed directly to Keras methods like `fit()` without further modification. The method will drop columns from the dataset if they don’t match input names for the model. If you want to specify the column names to return rather than using the names that match this model, we recommend using `Dataset.to_tf_dataset()` instead.
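
A minimal sketch of typical usage, assuming a tokenized `datasets.Dataset` (the GLUE/MRPC dataset and the column names used here are only illustrative):

```
>>> from datasets import load_dataset
>>> from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
>>> dataset = load_dataset("glue", "mrpc", split="train")
>>> dataset = dataset.map(lambda ex: tokenizer(ex["sentence1"], ex["sentence2"], truncation=True))

>>> model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)
>>> # Columns that do not match the model's input names are dropped automatically.
>>> tf_dataset = model.prepare_tf_dataset(dataset, batch_size=16, shuffle=True, tokenizer=tokenizer)
>>> model.compile(optimizer="adam")
>>> model.fit(tf_dataset, epochs=1)
```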

**prune\_heads**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L2312)

( heads\_to\_prune )

Parameters

* **heads\_to\_prune** (`Dict[int, List[int]]`) — Dictionary with keys being selected layer indices (`int`) and associated values being the list of heads to prune in said layer (list of `int`). For instance, `{1: [0, 2], 2: [2, 3]}` will prune heads 0 and 2 on layer 1 and heads 2 and 3 on layer 2.

Prunes heads of the base model.

**register\_for\_auto\_class**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L3158)

( auto\_class = 'TFAutoModel' )

Parameters

* **auto\_class** (`str` or `type`, *optional*, defaults to `"TFAutoModel"`) — The auto class to register this new model with.

Register this class with a given auto class. This should only be used for custom models as the ones in the library are already mapped with an auto class.

This API is experimental and may have some slight breaking changes in the next releases.
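
A minimal sketch, assuming a hypothetical custom model class `MyCustomTFModel` defined in your own code:

```
>>> from transformers import TFBertModel

>>> class MyCustomTFModel(TFBertModel):
...     pass

>>> # Map the custom class to TFAutoModel so it is used when the model's custom code is loaded from the Hub.
>>> MyCustomTFModel.register_for_auto_class("TFAutoModel")
```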

**resize\_token\_embeddings**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1974)

( new\_num\_tokens: Optional\[int] = None ) → `tf.Variable` or `tf.keras.layers.Embedding`

Parameters

* **new\_num\_tokens** (`int`, *optional*) — The number of new tokens in the embedding matrix. Increasing the size will add newly initialized vectors at the end. Reducing the size will remove vectors from the end. If not provided or `None`, just returns a pointer to the input tokens without doing anything.

Returns

`tf.Variable` or `tf.keras.layers.Embedding`

Pointer to the input tokens of the model.

Resizes input token embeddings matrix of the model if `new_num_tokens != config.vocab_size`.

Takes care of tying weights embeddings afterwards if the model class has a `tie_weights()` method.
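
A common pattern is resizing after adding new tokens to the tokenizer (a minimal sketch; the checkpoint and tokens are only illustrative):

```
>>> from transformers import AutoTokenizer, TFBertModel

>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
>>> model = TFBertModel.from_pretrained("bert-base-uncased")
>>> tokenizer.add_tokens(["<new_token_1>", "<new_token_2>"])
>>> # Grow the embedding matrix so the new token ids get (newly initialized) vectors.
>>> model.resize_token_embeddings(len(tokenizer))
```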

**save\_pretrained**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L2324)

( save\_directory, saved\_model = False, version = 1, push\_to\_hub = False, signatures = None, max\_shard\_size: Union\[int, str] = '10GB', create\_pr: bool = False, safe\_serialization: bool = False, token: Optional\[Union\[str, bool]] = None, \*\*kwargs )

Parameters

* **save\_directory** (`str`) — Directory to which to save. Will be created if it doesn’t exist.
* **saved\_model** (`bool`, *optional*, defaults to `False`) — If the model has to be saved in saved model format as well or not.
* **version** (`int`, *optional*, defaults to 1) — The version of the saved model. A saved model needs to be versioned in order to be properly loaded by TensorFlow Serving, as detailed in the [official documentation](https://www.tensorflow.org/tfx/serving/serving_basic).
* **push\_to\_hub** (`bool`, *optional*, defaults to `False`) — Whether or not to push your model to the BOINC AI model hub after saving it. You can specify the repository you want to push to with `repo_id` (will default to the name of `save_directory` in your namespace).
* **signatures** (`dict` or `tf.function`, *optional*) — Model’s signature used for serving. This will be passed to the `signatures` argument of model.save().
* **max\_shard\_size** (`int` or `str`, *optional*, defaults to `"10GB"`) — The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, needs to be digits followed by a unit (like `"5MB"`).

  If a single weight of the model is bigger than `max_shard_size`, it will be in its own checkpoint shard which will be bigger than `max_shard_size`.
* **create\_pr** (`bool`, *optional*, defaults to `False`) — Whether or not to create a PR with the uploaded files or directly commit.
* **safe\_serialization** (`bool`, *optional*, defaults to `False`) — Whether to save the model using `safetensors` or the traditional TensorFlow way (that uses `h5`).
* **token** (`str` or `bool`, *optional*) — The token to use as HTTP bearer authorization for remote files. If `True`, or not specified, will use the token generated when running `boincai-cli login` (stored in `~/.boincai`).
* **kwargs** (`Dict[str, Any]`, *optional*) — Additional key word arguments passed along to the [push\_to\_hub()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/processors#transformers.ProcessorMixin.push_to_hub) method.

Save a model and its configuration file to a directory, so that it can be re-loaded using the [from\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.TFPreTrainedModel.from_pretrained) class method.
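
For example (a minimal sketch; the directory path is only illustrative):

```
>>> from transformers import TFBertModel

>>> model = TFBertModel.from_pretrained("bert-base-uncased")
>>> model.save_pretrained("./my_model_directory/")
>>> # The saved directory can be reloaded with from_pretrained.
>>> reloaded = TFBertModel.from_pretrained("./my_model_directory/")
```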

**serving**

( inputs )

Parameters

* **inputs** (`Dict[str, tf.Tensor]`) — The input of the saved model as a dictionary of tensors.

Method used for serving the model. Does not have a specific signature, but will be specialized as concrete functions when saving with `save_pretrained`.

**serving\_output**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1278)

( output )

Prepare the output of the saved model. Can be overridden if specific serving modifications are required.

**set\_bias**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1949)

( value )

Parameters

* **value** (`Dict[tf.Variable]`) — All the new bias attached to an LM head.

Set all the bias in the LM head.

**set\_input\_embeddings**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1852)

( value )

Parameters

* **value** (`tf.Variable`) — The new weights mapping hidden states to vocabulary.

Set model’s input embeddings.

**set\_output\_embeddings**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1892)

( value )

Parameters

* **value** (`tf.Variable`) — The new weights mapping hidden states to vocabulary.

Set model’s output embeddings.

**test\_step**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1688)

( data )

A modification of Keras’s default `test_step` that correctly handles matching outputs to labels for our models and supports directly training on the loss output head. In addition, it ensures input keys are copied to the labels where appropriate. It will also copy label keys into the input dict when using the dummy loss, to ensure that they are available to the model during the forward pass.

**train\_step**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L1580)

( data )

A modification of Keras’s default `train_step` that correctly handles matching outputs to labels for our models and supports directly training on the loss output head. In addition, it ensures input keys are copied to the labels where appropriate. It will also copy label keys into the input dict when using the dummy loss, to ensure that they are available to the model during the forward pass.

### TFModelUtilsMixin

#### class transformers.modeling\_tf\_utils.TFModelUtilsMixin

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L105)

( )

A few utilities for `tf.keras.Model`, to be used as a mixin.

**num\_parameters**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_tf_utils.py#L110)

( only\_trainable: bool = False ) → `int`

Parameters

* **only\_trainable** (`bool`, *optional*, defaults to `False`) — Whether or not to return only the number of trainable parameters

Returns

`int`

The number of parameters.

Get the number of (optionally, trainable) parameters in the model.
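
For example (a minimal sketch; the checkpoint name is only illustrative):

```
>>> from transformers import TFBertModel

>>> model = TFBertModel.from_pretrained("bert-base-uncased")
>>> model.num_parameters()  # total number of parameters
>>> model.num_parameters(only_trainable=True)  # trainable parameters only
```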

### FlaxPreTrainedModel

#### class transformers.FlaxPreTrainedModel

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_flax_utils.py#L158)

( config: PretrainedConfig, module: Module, input\_shape: typing.Tuple = (1, 1), seed: int = 0, dtype: dtype = \<class 'jax.numpy.float32'>, \_do\_init: bool = True )

Base class for all models.

[FlaxPreTrainedModel](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.FlaxPreTrainedModel) takes care of storing the configuration of the models and handles methods for loading, downloading and saving models.

Class attributes (overridden by derived classes):

* **config\_class** ([PretrainedConfig](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig)) — A subclass of [PretrainedConfig](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig) to use as configuration class for this model architecture.
* **base\_model\_prefix** (`str`) — A string indicating the attribute associated to the base model in derived classes of the same architecture adding modules on top of the base model.
* **main\_input\_name** (`str`) — The name of the principal input to the model (often `input_ids` for NLP models, `pixel_values` for vision models and `input_values` for speech models).

**push\_to\_hub**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/utils/hub.py#L786)

( repo\_id: str, use\_temp\_dir: typing.Optional\[bool] = None, commit\_message: typing.Optional\[str] = None, private: typing.Optional\[bool] = None, token: typing.Union\[bool, str, NoneType] = None, max\_shard\_size: typing.Union\[int, str, NoneType] = '10GB', create\_pr: bool = False, safe\_serialization: bool = False, revision: str = None, \*\*deprecated\_kwargs )

Parameters

* **repo\_id** (`str`) — The name of the repository you want to push your model to. It should contain your organization name when pushing to a given organization.
* **use\_temp\_dir** (`bool`, *optional*) — Whether or not to use a temporary directory to store the files saved before they are pushed to the Hub. Will default to `True` if there is no directory named like `repo_id`, `False` otherwise.
* **commit\_message** (`str`, *optional*) — Message to commit while pushing. Will default to `"Upload model"`.
* **private** (`bool`, *optional*) — Whether or not the repository created should be private.
* **token** (`bool` or `str`, *optional*) — The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `boincai-cli login` (stored in `~/.boincai`). Will default to `True` if `repo_url` is not specified.
* **max\_shard\_size** (`int` or `str`, *optional*, defaults to `"10GB"`) — Only applicable for models. The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, needs to be digits followed by a unit (like `"5MB"`).
* **create\_pr** (`bool`, *optional*, defaults to `False`) — Whether or not to create a PR with the uploaded files or directly commit.
* **safe\_serialization** (`bool`, *optional*, defaults to `False`) — Whether or not to convert the model weights in safetensors format for safer serialization.
* **revision** (`str`, *optional*) — Branch to push the uploaded files to.

Upload the model checkpoint to the BOINC AI Model Hub.

Examples:

```
from transformers import FlaxAutoModel

model = FlaxAutoModel.from_pretrained("bert-base-cased")

# Push the model to your namespace with the name "my-finetuned-bert".
model.push_to_hub("my-finetuned-bert")

# Push the model to an organization with the name "my-finetuned-bert".
model.push_to_hub("boincai/my-finetuned-bert")
```

**can\_generate**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_flax_utils.py#L472)

( )

Returns a `bool` indicating whether this model can generate sequences with `.generate()`.

**from\_pretrained**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_flax_utils.py#L484)

( pretrained\_model\_name\_or\_path: typing.Union\[str, os.PathLike], dtype: dtype = \<class 'jax.numpy.float32'>, \*model\_args, config: typing.Union\[transformers.configuration\_utils.PretrainedConfig, str, os.PathLike, NoneType] = None, cache\_dir: typing.Union\[str, os.PathLike, NoneType] = None, ignore\_mismatched\_sizes: bool = False, force\_download: bool = False, local\_files\_only: bool = False, token: typing.Union\[bool, str, NoneType] = None, revision: str = 'main', \*\*kwargs )

Parameters

* **pretrained\_model\_name\_or\_path** (`str` or `os.PathLike`) — Can be either:
  * A string, the *model id* of a pretrained model hosted inside a model repo on boincai.com. Valid model ids can be located at the root-level, like `bert-base-uncased`, or namespaced under a user or organization name, like `dbmdz/bert-base-german-cased`.
  * A path to a *directory* containing model weights saved using [save\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.FlaxPreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.
  * A path or url to a *PyTorch checkpoint file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True`.
* **dtype** (`jax.numpy.dtype`, *optional*, defaults to `jax.numpy.float32`) — The data type of the computation. Can be one of `jax.numpy.float32`, `jax.numpy.float16` (on GPUs) and `jax.numpy.bfloat16` (on TPUs).

  This can be used to enable mixed-precision training or half-precision inference on GPUs or TPUs. If specified all the computation will be performed with the given `dtype`.

  **Note that this only specifies the dtype of the computation and does not influence the dtype of model parameters.**

  If you wish to change the dtype of the model parameters, see [to\_fp16()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.FlaxPreTrainedModel.to_fp16) and [to\_bf16()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.FlaxPreTrainedModel.to_bf16).
* **model\_args** (sequence of positional arguments, *optional*) — All remaining positional arguments will be passed to the underlying model’s `__init__` method.
* **config** (`Union[PretrainedConfig, str, os.PathLike]`, *optional*) — Can be either:

  * an instance of a class derived from [PretrainedConfig](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig),
  * a string or path valid as input to [from\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/configuration#transformers.PretrainedConfig.from_pretrained).

  Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  * The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
  * The model was saved using [save\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory.
  * The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.
* **cache\_dir** (`Union[str, os.PathLike]`, *optional*) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
* **from\_pt** (`bool`, *optional*, defaults to `False`) — Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).
* **ignore\_mismatched\_sizes** (`bool`, *optional*, defaults to `False`) — Whether or not to ignore size mismatches between some weights in the checkpoint and the corresponding weights of the model (for instance, if you are instantiating a model with 10 labels from a checkpoint with 3 labels). Mismatched weights are newly initialized instead of raising an error.
* **force\_download** (`bool`, *optional*, defaults to `False`) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
* **resume\_download** (`bool`, *optional*, defaults to `False`) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
* **proxies** (`Dict[str, str]`, *optional*) — A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
* **local\_files\_only** (`bool`, *optional*, defaults to `False`) — Whether or not to only look at local files (i.e., do not try to download the model).
* **token** (`str` or `bool`, *optional*) — The token to use as HTTP bearer authorization for remote files. If `True`, or not specified, will use the token generated when running `boincai-cli login` (stored in `~/.boincai`).
* **revision** (`str`, *optional*, defaults to `"main"`) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on boincai.com, so `revision` can be any identifier allowed by git.

Instantiate a pretrained flax model from a pre-trained model configuration.

The warning *Weights from XXX not initialized from pretrained model* means that the weights of XXX do not come pretrained with the rest of the model. It is up to you to train those weights with a downstream fine-tuning task.

The warning *Weights from XXX not used in YYY* means that the layer XXX is not used by YYY, therefore those weights are discarded.

Examples:

```
>>> from transformers import BertConfig, FlaxBertModel

>>> # Download model and configuration from boincai.com and cache.
>>> model = FlaxBertModel.from_pretrained("bert-base-cased")
>>> # Model was saved using *save_pretrained('./test/saved_model/')* (for example purposes, not runnable).
>>> model = FlaxBertModel.from_pretrained("./test/saved_model/")
>>> # Loading from a PyTorch checkpoint file instead of a PyTorch model (slower, for example purposes, not runnable).
>>> config = BertConfig.from_json_file("./pt_model/config.json")
>>> model = FlaxBertModel.from_pretrained("./pt_model/pytorch_model.bin", from_pt=True, config=config)
```

**load\_flax\_sharded\_weights**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_flax_utils.py#L425)

( shard\_files ) → `Dict`

Parameters

* **shard\_files** (`List[str]`) — The list of shard files to load.

Returns

`Dict`

A nested dictionary of the model parameters, in the expected format for flax models : `{'model': {'params': {'...'}}}`.

This is the same as [`flax.serialization.from_bytes`](https://flax.readthedocs.io/en/latest/_modules/flax/serialization.html#from_bytes) but for a sharded checkpoint.

This load is performed efficiently: each checkpoint shard is loaded one by one in RAM and deleted after being loaded in the model.

**register\_for\_auto\_class**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_flax_utils.py#L1152)

( auto\_class = 'FlaxAutoModel' )

Parameters

* **auto\_class** (`str` or `type`, *optional*, defaults to `"FlaxAutoModel"`) — The auto class to register this new model with.

Register this class with a given auto class. This should only be used for custom models as the ones in the library are already mapped with an auto class.

This API is experimental and may have some slight breaking changes in the next releases.

**save\_pretrained**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_flax_utils.py#L1025)

( save\_directory: typing.Union\[str, os.PathLike], params = None, push\_to\_hub = False, max\_shard\_size = '10GB', token: typing.Union\[bool, str, NoneType] = None, \*\*kwargs )

Parameters

* **save\_directory** (`str` or `os.PathLike`) — Directory to which to save. Will be created if it doesn’t exist.
* **push\_to\_hub** (`bool`, *optional*, defaults to `False`) — Whether or not to push your model to the BOINC AI model hub after saving it. You can specify the repository you want to push to with `repo_id` (will default to the name of `save_directory` in your namespace).
* **max\_shard\_size** (`int` or `str`, *optional*, defaults to `"10GB"`) — The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, needs to be digits followed by a unit (like `"5MB"`).

  If a single weight of the model is bigger than `max_shard_size`, it will be in its own checkpoint shard which will be bigger than `max_shard_size`.
* **token** (`str` or `bool`, *optional*) — The token to use as HTTP bearer authorization for remote files. If `True`, or not specified, will use the token generated when running `boincai-cli login` (stored in `~/.boincai`).
* **kwargs** (`Dict[str, Any]`, *optional*) — Additional key word arguments passed along to the [push\_to\_hub()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/processors#transformers.ProcessorMixin.push_to_hub) method.

Save a model and its configuration file to a directory, so that it can be re-loaded using the [from\_pretrained()](https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/model#transformers.FlaxPreTrainedModel.from_pretrained) class method.
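
For example (a minimal sketch; the directory path is only illustrative):

```
>>> from transformers import FlaxBertModel

>>> model = FlaxBertModel.from_pretrained("bert-base-cased")
>>> model.save_pretrained("./my_flax_model/")
>>> # The saved directory can be reloaded with from_pretrained.
>>> reloaded = FlaxBertModel.from_pretrained("./my_flax_model/")
```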

**to\_bf16**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_flax_utils.py#L320)

( params: typing.Union\[typing.Dict, flax.core.frozen\_dict.FrozenDict], mask: typing.Any = None )

Parameters

* **params** (`Union[Dict, FrozenDict]`) — A `PyTree` of model parameters.
* **mask** (`Union[Dict, FrozenDict]`) — A `PyTree` with same structure as the `params` tree. The leaves should be booleans, `True` for params you want to cast, and should be `False` for those you want to skip.

Cast the floating-point `params` to `jax.numpy.bfloat16`. This returns a new `params` tree and does not cast the `params` in place.

This method can be used on TPU to explicitly convert the model parameters to bfloat16 precision to do full half-precision training or to save weights in bfloat16 for inference in order to save memory and improve speed.

Examples:

```
>>> from transformers import FlaxBertModel

>>> # load model
>>> model = FlaxBertModel.from_pretrained("bert-base-cased")
>>> # By default, the model parameters will be in fp32 precision, to cast these to bfloat16 precision
>>> model.params = model.to_bf16(model.params)
>>> # If you don't want to cast certain parameters (for example layer norm bias and scale),
>>> # then pass the mask as follows
>>> from flax import traverse_util

>>> model = FlaxBertModel.from_pretrained("bert-base-cased")
>>> flat_params = traverse_util.flatten_dict(model.params)
>>> mask = {
...     path: (path[-2:] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale"))
...     for path in flat_params
... }
>>> mask = traverse_util.unflatten_dict(mask)
>>> model.params = model.to_bf16(model.params, mask)
```

**to\_fp16**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_flax_utils.py#L386)

( params: typing.Union\[typing.Dict, flax.core.frozen\_dict.FrozenDict], mask: typing.Any = None )

Parameters

* **params** (`Union[Dict, FrozenDict]`) — A `PyTree` of model parameters.
* **mask** (`Union[Dict, FrozenDict]`) — A `PyTree` with same structure as the `params` tree. The leaves should be booleans, `True` for params you want to cast, and should be `False` for those you want to skip

Cast the floating-point `params` to `jax.numpy.float16`. This returns a new `params` tree and does not cast the `params` in place.

This method can be used on GPU to explicitly convert the model parameters to float16 precision to do full half-precision training or to save weights in float16 for inference in order to save memory and improve speed.

Examples:

```
>>> from transformers import FlaxBertModel

>>> # load model
>>> model = FlaxBertModel.from_pretrained("bert-base-cased")
>>> # By default, the model params will be in fp32, to cast these to float16
>>> model.params = model.to_fp16(model.params)
>>> # If you don't want to cast certain parameters (for example layer norm bias and scale),
>>> # then pass the mask as follows
>>> from flax import traverse_util

>>> model = FlaxBertModel.from_pretrained("bert-base-cased")
>>> flat_params = traverse_util.flatten_dict(model.params)
>>> mask = {
...     path: (path[-2:] != ("LayerNorm", "bias") and path[-2:] != ("LayerNorm", "scale"))
...     for path in flat_params
... }
>>> mask = traverse_util.unflatten_dict(mask)
>>> model.params = model.to_fp16(model.params, mask)
```

**to\_fp32**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_flax_utils.py#L359)

( params: typing.Union\[typing.Dict, flax.core.frozen\_dict.FrozenDict], mask: typing.Any = None )

Parameters

* **params** (`Union[Dict, FrozenDict]`) — A `PyTree` of model parameters.
* **mask** (`Union[Dict, FrozenDict]`) — A `PyTree` with same structure as the `params` tree. The leaves should be booleans, `True` for params you want to cast, and should be `False` for those you want to skip

Cast the floating-point `params` to `jax.numpy.float32`. This method can be used to explicitly convert the model parameters to fp32 precision. This returns a new `params` tree and does not cast the `params` in place.

Examples:

```
>>> from transformers import FlaxBertModel

>>> # Download model and configuration from boincai.com
>>> model = FlaxBertModel.from_pretrained("bert-base-cased")
>>> # By default, the model params will be in fp32, to illustrate the use of this method,
>>> # we'll first cast to fp16 and back to fp32
>>> model.params = model.to_fp16(model.params)
>>> # now cast back to fp32
>>> model.params = model.to_fp32(model.params)
```

### Pushing to the Hub

#### class transformers.utils.PushToHubMixin

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/utils/hub.py#L672)

( )

A Mixin containing the functionality to push a model or tokenizer to the hub.

**push\_to\_hub**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/utils/hub.py#L786)

( repo\_id: str, use\_temp\_dir: typing.Optional\[bool] = None, commit\_message: typing.Optional\[str] = None, private: typing.Optional\[bool] = None, token: typing.Union\[bool, str, NoneType] = None, max\_shard\_size: typing.Union\[int, str, NoneType] = '10GB', create\_pr: bool = False, safe\_serialization: bool = False, revision: str = None, \*\*deprecated\_kwargs )

Parameters

* **repo\_id** (`str`) — The name of the repository you want to push your {object} to. It should contain your organization name when pushing to a given organization.
* **use\_temp\_dir** (`bool`, *optional*) — Whether or not to use a temporary directory to store the files saved before they are pushed to the Hub. Will default to `True` if there is no directory named like `repo_id`, `False` otherwise.
* **commit\_message** (`str`, *optional*) — Message to commit while pushing. Will default to `"Upload {object}"`.
* **private** (`bool`, *optional*) — Whether or not the repository created should be private.
* **token** (`bool` or `str`, *optional*) — The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `boincai-cli login` (stored in `~/.boincai`). Will default to `True` if `repo_url` is not specified.
* **max\_shard\_size** (`int` or `str`, *optional*, defaults to `"10GB"`) — Only applicable for models. The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, needs to be digits followed by a unit (like `"5MB"`).
* **create\_pr** (`bool`, *optional*, defaults to `False`) — Whether or not to create a PR with the uploaded files or directly commit.
* **safe\_serialization** (`bool`, *optional*, defaults to `False`) — Whether or not to convert the model weights in safetensors format for safer serialization.
* **revision** (`str`, *optional*) — Branch to push the uploaded files to.

Upload the {object\_files} to the BOINC AI Model Hub.

Examples:

```
from transformers import {object_class}

{object} = {object_class}.from_pretrained("bert-base-cased")

# Push the {object} to your namespace with the name "my-finetuned-bert".
{object}.push_to_hub("my-finetuned-bert")

# Push the {object} to an organization with the name "my-finetuned-bert".
{object}.push_to_hub("boincai/my-finetuned-bert")
```

### Sharded checkpoints

**transformers.modeling\_utils.load\_sharded\_checkpoint**

[\<source>](https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/modeling_utils.py#L373)

( model, folder, strict = True, prefer\_safe = True ) → `NamedTuple`

Parameters

* **model** (`torch.nn.Module`) — The model in which to load the checkpoint.
* **folder** (`str` or `os.PathLike`) — A path to a folder containing the sharded checkpoint.
* **strict** (`bool`, *optional*, defaults to `True`) — Whether to strictly enforce that the keys in the model state dict match the keys in the sharded checkpoint.
* **prefer\_safe** (`bool`, *optional*, defaults to `True`) — If both safetensors and PyTorch save files are present in the checkpoint and `prefer_safe` is `True`, the safetensors files will be loaded. Otherwise, PyTorch files are always loaded when possible.

Returns

`NamedTuple`

A named tuple with `missing_keys` and `unexpected_keys` fields

* `missing_keys` is a list of str containing the missing keys
* `unexpected_keys` is a list of str containing the unexpected keys

This is the same as [`torch.nn.Module.load_state_dict`](https://pytorch.org/docs/stable/generated/torch.nn.Module.html?highlight=load_state_dict#torch.nn.Module.load_state_dict) but for a sharded checkpoint.

This load is performed efficiently: each checkpoint shard is loaded one by one in RAM and deleted after being loaded in the model.
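
A minimal sketch, assuming a hypothetical local folder `./sharded_checkpoint/` that contains an index file and shards written by `save_pretrained` with a small `max_shard_size`:

```
>>> from transformers import BertModel
>>> from transformers.modeling_utils import load_sharded_checkpoint

>>> model = BertModel.from_pretrained("bert-base-uncased")
>>> # Load the shards one by one into the model, freeing each from RAM once applied.
>>> result = load_sharded_checkpoint(model, "./sharded_checkpoint/")
>>> print(result.missing_keys, result.unexpected_keys)
```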


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://boinc-ai.gitbook.io/transformers/api/main-classes/models.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
