Mixins & serialization methods
The huggingface_hub library offers a range of mixins that can be used as a parent class for your objects, in order to provide simple uploading and downloading functions. Check out our integration guide to learn how to integrate any ML framework with the Hub.
class huggingface_hub.ModelHubMixin
( )
A generic mixin to integrate ANY machine learning framework with the Hub.
To integrate your framework, your model class must inherit from this class. Custom logic for saving/loading models has to be overwritten in _from_pretrained and _save_pretrained. PyTorchModelHubMixin is a good example of mixin integration with the Hub. Check out our integration guide for more instructions.
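As a rough illustration of the workflow described above, here is a minimal sketch of a custom integration. MyModel and its weights.json persistence format are hypothetical choices made for the example, not part of the library:

```python
import json
import os
from pathlib import Path

from huggingface_hub import ModelHubMixin, hf_hub_download


class MyModel(ModelHubMixin):
    """Hypothetical framework-agnostic model used only to illustrate the mixin."""

    def __init__(self, weights=None, **kwargs):
        self.weights = weights or {}

    def _save_pretrained(self, save_directory: Path):
        # Called by save_pretrained()/push_to_hub(): persist the model to `save_directory`.
        with open(os.path.join(save_directory, "weights.json"), "w") as f:
            json.dump(self.weights, f)

    @classmethod
    def _from_pretrained(cls, *, model_id, revision, cache_dir, force_download,
                         proxies, resume_download, local_files_only, token,
                         **model_kwargs):
        # Called by from_pretrained(): fetch the file from the Hub and rebuild the model.
        # (Handling of local directories is omitted in this sketch.)
        weights_file = hf_hub_download(
            repo_id=model_id, filename="weights.json", revision=revision,
            cache_dir=cache_dir, force_download=force_download, proxies=proxies,
            resume_download=resume_download, local_files_only=local_files_only,
            token=token,
        )
        with open(weights_file) as f:
            return cls(weights=json.load(f), **model_kwargs)
```

Once the two private methods are implemented, the subclass gains from_pretrained(), save_pretrained() and push_to_hub() from the mixin.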
_save_pretrained
( save_directory: Path )
Parameters
save_directory (str or Path) — Path to directory in which the model weights and configuration will be saved.
Overwrite this method in subclass to define how to save your model. Check out our integration guide for instructions.
_from_pretrained
( model_id: str, revision: typing.Optional[str], cache_dir: typing.Union[str, pathlib.Path, NoneType], force_download: bool, proxies: typing.Optional[typing.Dict], resume_download: bool, local_files_only: bool, token: typing.Union[str, bool, NoneType], **model_kwargs )
Parameters
model_id (str) — ID of the model to load from the Huggingface Hub (e.g. bigscience/bloom).
revision (str, optional) — Revision of the model on the Hub. Can be a branch name, a git tag or any commit id. Defaults to the latest commit on the main branch.
force_download (bool, optional, defaults to False) — Whether to force (re-)downloading the model weights and configuration files from the Hub, overriding the existing cache.
resume_download (bool, optional, defaults to False) — Whether to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint (e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}).
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. By default, it will use the token cached when running huggingface-cli login.
cache_dir (str, Path, optional) — Path to the folder where cached files are stored.
local_files_only (bool, optional, defaults to False) — If True, avoid downloading the file and return the path to the local cached file if it exists.
model_kwargs — Additional keyword arguments passed along to the method.
Overwrite this method in subclass to define how to load your model from pretrained.
Use hf_hub_download() or snapshot_download() to download files from the Hub before loading them. Most args taken as input can be directly passed to those 2 methods. If needed, you can add more arguments to this method using model_kwargs. For example, PyTorchModelHubMixin._from_pretrained() takes as input a map_location parameter to set on which device the model should be loaded.
Check out our integration guide for more instructions.
from_pretrained
( pretrained_model_name_or_path: typing.Union[str, pathlib.Path], force_download: bool = False, resume_download: bool = False, proxies: typing.Optional[typing.Dict] = None, token: typing.Union[str, bool, NoneType] = None, cache_dir: typing.Union[str, pathlib.Path, NoneType] = None, local_files_only: bool = False, revision: typing.Optional[str] = None, **model_kwargs )
Parameters
pretrained_model_name_or_path (str, Path) — Either the model_id (string) of a model hosted on the Hub, e.g. bigscience/bloom. Or a path to a directory containing model weights saved using save_pretrained, e.g., ../path/to/my_model_directory/.
revision (str, optional) — Revision of the model on the Hub. Can be a branch name, a git tag or any commit id. Defaults to the latest commit on the main branch.
force_download (bool, optional, defaults to False) — Whether to force (re-)downloading the model weights and configuration files from the Hub, overriding the existing cache.
resume_download (bool, optional, defaults to False) — Whether to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on every request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. By default, it will use the token cached when running huggingface-cli login.
cache_dir (str, Path, optional) — Path to the folder where cached files are stored.
local_files_only (bool, optional, defaults to False) — If True, avoid downloading the file and return the path to the local cached file if it exists.
model_kwargs (Dict, optional) — Additional kwargs to pass to the model during initialization.
Download a model from the Huggingface Hub and instantiate it.
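For instance, with the hypothetical MyModel subclass sketched earlier (the repo id below is a placeholder):

```python
# Download "username/my-model" from the Hub (or reuse the local cache) and instantiate it;
# extra keyword arguments are forwarded to _from_pretrained().
model = MyModel.from_pretrained("username/my-model", revision="main")
```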
push_to_hub
( repo_id: str, config: typing.Optional[dict] = None, commit_message: str = 'Push model using huggingface_hub.', private: bool = False, api_endpoint: typing.Optional[str] = None, token: typing.Optional[str] = None, branch: typing.Optional[str] = None, create_pr: typing.Optional[bool] = None, allow_patterns: typing.Union[typing.List[str], str, NoneType] = None, ignore_patterns: typing.Union[typing.List[str], str, NoneType] = None, delete_patterns: typing.Union[typing.List[str], str, NoneType] = None )
Parameters
repo_id (str) — ID of the repository to push to (example: "username/my-model").
config (dict, optional) — Configuration object to be saved alongside the model weights.
commit_message (str, optional) — Message to commit while pushing.
private (bool, optional, defaults to False) — Whether the repository created should be private.
api_endpoint (str, optional) — The API endpoint to use when pushing the model to the hub.
token (str, optional) — The token to use as HTTP bearer authorization for remote files. By default, it will use the token cached when running huggingface-cli login.
branch (str, optional) — The git branch on which to push the model. This defaults to "main".
create_pr (boolean, optional) — Whether or not to create a Pull Request from branch with that commit. Defaults to False.
allow_patterns (List[str] or str, optional) — If provided, only files matching at least one pattern are pushed.
ignore_patterns (List[str] or str, optional) — If provided, files matching any of the patterns are not pushed.
delete_patterns (List[str] or str, optional) — If provided, remote files matching any of the patterns will be deleted from the repo.
Upload model checkpoint to the Hub.
Use allow_patterns and ignore_patterns to precisely filter which files should be pushed to the hub. Use delete_patterns to delete existing remote files in the same commit. See the upload_folder() reference for more details.
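Continuing with the hypothetical MyModel instance, a typical call could look like the following (repo id is a placeholder):

```python
# Upload the model files to the Hub, creating the repository if it does not exist yet.
model.push_to_hub(
    repo_id="username/my-model",
    commit_message="Add first version of the model",
    # allow_patterns="*.json",  # optionally restrict which files get uploaded
)
```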
save_pretrained
( save_directory: typing.Union[str, pathlib.Path], config: typing.Optional[dict] = None, repo_id: typing.Optional[str] = None, push_to_hub: bool = False, **kwargs )
Parameters
save_directory (str or Path) — Path to directory in which the model weights and configuration will be saved.
config (dict, optional) — Model configuration specified as a key/value dictionary.
push_to_hub (bool, optional, defaults to False) — Whether or not to push your model to the Huggingface Hub after saving it.
repo_id (str, optional) — ID of your repository on the Hub. Used only if push_to_hub=True. Will default to the folder name if not provided.
kwargs — Additional keyword arguments passed along to the push_to_hub() method.
Save weights in local directory.
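A short sketch of saving locally, with an optional push in the same call (paths and repo id are placeholders):

```python
# Save weights and an optional config dict to a local directory.
model.save_pretrained("path/to/local/dir", config={"num_layers": 2})

# Save and push to the Hub in one call.
model.save_pretrained("path/to/local/dir", push_to_hub=True, repo_id="username/my-model")
```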
class huggingface_hub.PyTorchModelHubMixin
( )
Implementation of ModelHubMixin to provide model Hub upload/download capabilities to PyTorch models. The model is set in evaluation mode by default using model.eval() (dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Example:
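A minimal sketch of how the mixin is typically combined with a torch.nn.Module (class and repo names are illustrative):

```python
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


class MyTorchModel(nn.Module, PyTorchModelHubMixin):
    def __init__(self, hidden_size: int = 16):
        super().__init__()
        self.linear = nn.Linear(hidden_size, 1)

    def forward(self, x):
        return self.linear(x)


model = MyTorchModel()
model.save_pretrained("my-torch-model")           # save weights locally
# model.push_to_hub("username/my-torch-model")    # or upload to the Hub
# model = MyTorchModel.from_pretrained("username/my-torch-model")  # reload later
```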
class huggingface_hub.KerasModelHubMixin
( )
Implementation of ModelHubMixin to provide model Hub upload/download capabilities to Keras models.
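A comparable sketch for a tf.keras.Model subclass (names are illustrative; the model must be built, e.g. by calling it once, before saving):

```python
import tensorflow as tf
from huggingface_hub import KerasModelHubMixin


class MyKerasModel(tf.keras.Model, KerasModelHubMixin):
    def __init__(self, hidden_size: int = 16, **kwargs):
        super().__init__()
        self.dense = tf.keras.layers.Dense(hidden_size)

    def call(self, inputs):
        return self.dense(inputs)


model = MyKerasModel()
model(tf.random.uniform((1, 4)))              # build the model before saving
model.save_pretrained("my-keras-model")       # save locally in SavedModel format
# model.push_to_hub("username/my-keras-model")  # or upload to the Hub
```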
huggingface_hub.from_pretrained_keras
( *args, **kwargs )
Parameters
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased. You can add a revision by appending @ at the end of model_id, like this: dbmdz/bert-base-german-cased@main. Revision is the specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
A path to a directory containing model weights saved using save_pretrained_keras, e.g., ./my_model_directory/.
None if you are both providing the configuration and state dictionary (resp. with keyword arguments config and state_dict).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
cache_dir (Union[str, os.PathLike], optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (i.e., do not try to download the model).
model_kwargs (Dict, optional) — model_kwargs will be passed to the model during initialization.
Instantiate a pretrained Keras model from a pre-trained model from the Hub. The model is expected to be in SavedModel format.
Passing token=True is required when you want to use a private model.
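A short usage sketch; the repo id is a placeholder for a repository that contains a Keras model in SavedModel format:

```python
from huggingface_hub import from_pretrained_keras

# Download the repository and load the SavedModel; pass token=True for a private repo.
model = from_pretrained_keras("username/my-keras-model")
```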
huggingface_hub.push_to_hub_keras
( model, repo_id: str, config: typing.Optional[dict] = None, commit_message: str = 'Push Keras model using huggingface_hub.', private: bool = False, api_endpoint: typing.Optional[str] = None, token: typing.Optional[str] = None, branch: typing.Optional[str] = None, create_pr: typing.Optional[bool] = None, allow_patterns: typing.Union[typing.List[str], str, NoneType] = None, ignore_patterns: typing.Union[typing.List[str], str, NoneType] = None, delete_patterns: typing.Union[typing.List[str], str, NoneType] = None, log_dir: typing.Optional[str] = None, include_optimizer: bool = False, tags: typing.Union[list, str, NoneType] = None, plot_model: bool = True, **model_save_kwargs )
Parameters
model (Keras.Model) — The Keras model you’d like to push to the Hub. The model must be compiled and built.
repo_id (str) — ID of the repository to push to (example: "username/my-model").
commit_message (str, optional, defaults to “Add Keras model”) — Message to commit while pushing.
private (bool, optional, defaults to False) — Whether the repository created should be private.
api_endpoint (str, optional) — The API endpoint to use when pushing the model to the hub.
token (str, optional) — The token to use as HTTP bearer authorization for remote files. If not set, will use the token set when logging in with huggingface-cli login (stored in ~/.huggingface).
branch (str, optional) — The git branch on which to push the model. This defaults to the default branch as specified in your repository, which defaults to "main".
create_pr (boolean, optional) — Whether or not to create a Pull Request from branch with that commit. Defaults to False.
config (dict, optional) — Configuration object to be saved alongside the model weights.
allow_patterns (List[str] or str, optional) — If provided, only files matching at least one pattern are pushed.
ignore_patterns (List[str] or str, optional) — If provided, files matching any of the patterns are not pushed.
delete_patterns (List[str] or str, optional) — If provided, remote files matching any of the patterns will be deleted from the repo.
log_dir (str, optional) — TensorBoard logging directory to be pushed. The Hub automatically hosts and displays a TensorBoard instance if log files are included in the repository.
include_optimizer (bool, optional, defaults to False) — Whether or not to include optimizer during serialization.
tags (Union[list, str], optional) — List of tags that are related to the model, or a string of a single tag.
plot_model (bool, optional, defaults to True) — Setting this to True will plot the model and put it in the model card. Requires graphviz and pydot to be installed.
model_save_kwargs (dict, optional) — model_save_kwargs will be passed to tf.keras.models.save_model().
Upload model checkpoint to the Hub.
Use allow_patterns and ignore_patterns to precisely filter which files should be pushed to the hub. Use delete_patterns to delete existing remote files in the same commit. See the upload_folder() reference for more details.
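Reusing the MyKerasModel instance built in the KerasModelHubMixin example above (repo id is a placeholder):

```python
from huggingface_hub import push_to_hub_keras

# Serialize the built, compiled Keras model and upload it; the repo is created if missing.
push_to_hub_keras(
    model,
    repo_id="username/my-keras-model",
    include_optimizer=False,
    # log_dir="./logs",  # optionally push TensorBoard logs alongside the model
)
```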
huggingface_hub.save_pretrained_keras
( model, save_directory: typing.Union[str, pathlib.Path], config: typing.Union[typing.Dict[str, typing.Any], NoneType] = None, include_optimizer: bool = False, plot_model: bool = True, tags: typing.Union[list, str, NoneType] = None, **model_save_kwargs )
Parameters
model (Keras.Model) — The Keras model you’d like to save. The model must be compiled and built.
save_directory (str or Path) — Specify directory in which you want to save the Keras model.
config (dict, optional) — Configuration object to be saved alongside the model weights.
include_optimizer (bool, optional, defaults to False) — Whether or not to include optimizer in serialization.
plot_model (bool, optional, defaults to True) — Setting this to True will plot the model and put it in the model card. Requires graphviz and pydot to be installed.
tags (Union[str, list], optional) — List of tags that are related to the model, or a string of a single tag.
model_save_kwargs (dict, optional) — model_save_kwargs will be passed to tf.keras.models.save_model().
Saves a Keras model to save_directory in SavedModel format. Use this if you’re using the Functional or Sequential APIs.
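A short sketch of saving the same Keras model locally (directory name and config values are placeholders):

```python
from huggingface_hub import save_pretrained_keras

# Write the model to ./my-keras-model/ in SavedModel format, with an optional config.
save_pretrained_keras(model, "my-keras-model", config={"hidden_size": 16})
```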
huggingface_hub.from_pretrained_fastai
( repo_id: str, revision: typing.Optional[str] = None )
Parameters
repo_id (str) — The location where the pickled fastai.Learner is. It can be either of the two:
Hosted on the Hugging Face Hub. E.g.: ‘espejelomar/fatai-pet-breeds-classification’ or ‘distilgpt2’. You can add a revision by appending @ at the end of repo_id, e.g.: dbmdz/bert-base-german-cased@main. Revision is the specific model version to use. Since we use a git-based system for storing models and other artifacts on the Hugging Face Hub, it can be a branch name, a tag name, or a commit id.
Hosted locally. repo_id would be a directory containing the pickle and a pyproject.toml indicating the fastai and fastcore versions used to build the fastai.Learner. E.g.: ./my_model_directory/.
revision (str, optional) — Revision at which the repo’s files are downloaded. See documentation of snapshot_download.
Load pretrained fastai model from the Hub or from a local directory.
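For example, assuming a hypothetical Hub repository containing a pickled fastai Learner:

```python
from huggingface_hub import from_pretrained_fastai

# Load the pickled Learner from the Hub (a local directory path works as well).
learner = from_pretrained_fastai("username/my-fastai-model")
```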
huggingface_hub.push_to_hub_fastai
( learner, repo_id: str, commit_message: str = 'Push FastAI model using huggingface_hub.', private: bool = False, token: typing.Optional[str] = None, config: typing.Optional[dict] = None, branch: typing.Optional[str] = None, create_pr: typing.Optional[bool] = None, allow_patterns: typing.Union[typing.List[str], str, NoneType] = None, ignore_patterns: typing.Union[typing.List[str], str, NoneType] = None, delete_patterns: typing.Union[typing.List[str], str, NoneType] = None, api_endpoint: typing.Optional[str] = None )
Parameters
learner (Learner) — The fastai.Learner you’d like to push to the Hub.
repo_id (str) — The repository id for your model in Hub in the format of “namespace/repo_name”. The namespace can be your individual account or an organization to which you have write access (for example, ‘stanfordnlp/stanza-de’).
commit_message (str, optional) — Message to commit while pushing. Will default to "add model".
private (bool, optional, defaults to False) — Whether or not the repository created should be private.
token (str, optional) — The Hugging Face account token to use as HTTP bearer authorization for remote files. If None, the token will be asked by a prompt.
config (dict, optional) — Configuration object to be saved alongside the model weights.
branch (str, optional) — The git branch on which to push the model. This defaults to the default branch as specified in your repository, which defaults to “main”.
create_pr (boolean, optional) — Whether or not to create a Pull Request from branch with that commit. Defaults to False.
api_endpoint (str, optional) — The API endpoint to use when pushing the model to the hub.
allow_patterns (List[str] or str, optional) — If provided, only files matching at least one pattern are pushed.
ignore_patterns (List[str] or str, optional) — If provided, files matching any of the patterns are not pushed.
delete_patterns (List[str] or str, optional) — If provided, remote files matching any of the patterns will be deleted from the repo.
Upload learner checkpoint files to the Hub.
Use allow_patterns and ignore_patterns to precisely filter which files should be pushed to the hub. Use delete_patterns to delete existing remote files in the same commit. See the upload_folder() reference for more details.
Raises an error if the user is not logged in to the Hugging Face Hub.
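Reusing the learner loaded in the previous example (repo id is a placeholder):

```python
from huggingface_hub import push_to_hub_fastai

# Upload the Learner's pickle and metadata to the Hub; requires being logged in.
push_to_hub_fastai(
    learner=learner,
    repo_id="username/my-fastai-model",
    commit_message="Add fastai model",
)
```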