Repo Cards and Repo Card Data
The huggingface_hub library provides a Python interface to create, share, and update Model/Dataset Cards. Visit the Hub documentation on Model Cards for a deeper view of what Model Cards on the Hub are, and how they work under the hood. You can also check out the Model Cards guide to get a feel for how you would use these utilities in your own projects.
The RepoCard object is the parent class of ModelCard, DatasetCard, and SpaceCard.
class huggingface_hub.RepoCard
( content: str, ignore_metadata_errors: bool = False )
__init__
( content: str, ignore_metadata_errors: bool = False )
Parameters
content (str) — The content of the Markdown file.
Initialize a RepoCard from string content. The content should be a Markdown file with a YAML block at the beginning and a Markdown body.
Example:
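A minimal sketch of the constructor, assuming huggingface_hub is installed. The card text is made up for illustration; the YAML block becomes `card.data` and the Markdown body becomes `card.text`.

```python
from huggingface_hub import RepoCard

# Illustrative card content: a YAML block followed by a Markdown body.
text = """---
language: en
license: mit
---

# My repo
"""

card = RepoCard(text)
metadata = card.data.to_dict()  # parsed YAML header as a plain dict
body = card.text                # Markdown body without the YAML header

print(metadata)
print(body)
```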
Raises the following error:
from_template
Parameters
card_data (huggingface_hub.CardData) — A huggingface_hub.CardData instance containing the metadata you want to include in the YAML header of the repo card on the Hugging Face Hub.
template_path (str, optional) — A path to a markdown file with optional Jinja template variables that can be filled in with template_kwargs. Defaults to the default template.
Returns
A RepoCard instance with the specified card data and content from the template.
Initialize a RepoCard from a template. By default, it uses the default template.
Templates are Jinja2 templates that can be customized by passing keyword arguments.
load
Parameters
repo_id_or_path (Union[str, Path]) — The repo ID associated with a Hugging Face Hub repo or a local filepath.
repo_type (str, optional) — The type of Hugging Face repo to load the card from. Defaults to None, which will use “model”. Other options are “dataset” and “space”. Not used when loading from a local filepath. If this is called from a child class, the default value will be the child class’s repo_type.
token (str, optional) — Authentication token, obtained with the huggingface_hub.HfApi.login method. Will default to the stored token.
ignore_metadata_errors (bool) — If True, errors while parsing the metadata section will be ignored. Some information might be lost during the process. Use it at your own risk.
Returns
The RepoCard (or subclass) initialized from the repo’s README.md file or filepath.
Initialize a RepoCard from a Hugging Face Hub repo’s README.md or a local filepath.
Example:
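A minimal offline sketch, assuming huggingface_hub is installed. It loads a card from a local file through the ModelCard subclass; the README content is invented for the example. Passing a repo ID instead of a path would download README.md from the Hub and therefore needs network access.

```python
import tempfile
from pathlib import Path

from huggingface_hub import ModelCard

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write an illustrative README.md to load from disk.
    readme_path = Path(tmp_dir) / "README.md"
    readme_path.write_text("---\nlicense: mit\n---\n# Demo model\n", encoding="utf-8")

    # Loading through a child class returns an instance of that class.
    card = ModelCard.load(readme_path)

card_license = card.data.license
card_body = card.text
```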
push_to_hub
( repo_id: str, token: typing.Optional[str] = None, repo_type: typing.Optional[str] = None, commit_message: typing.Optional[str] = None, commit_description: typing.Optional[str] = None, revision: typing.Optional[str] = None, create_pr: typing.Optional[bool] = None, parent_commit: typing.Optional[str] = None ) → str
Parameters
repo_id (str) — The repo ID of the Hugging Face Hub repo to push to. Example: “nateraw/food”.
token (str, optional) — Authentication token, obtained with the huggingface_hub.HfApi.login method. Will default to the stored token.
repo_type (str, optional, defaults to “model”) — The type of Hugging Face repo to push to. Options are “model”, “dataset”, and “space”. If this function is called by a child class, it will default to the child class’s repo_type.
commit_message (str, optional) — The summary / title / first line of the generated commit.
commit_description (str, optional) — The description of the generated commit.
revision (str, optional) — The git revision to commit from. Defaults to the head of the "main" branch.
create_pr (bool, optional) — Whether or not to create a Pull Request with this commit. Defaults to False.
parent_commit (str, optional) — The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
Returns
str
URL of the commit which updated the card metadata.
Push a RepoCard to a Hugging Face Hub repo.
save
( filepath: typing.Union[pathlib.Path, str] )
Parameters
filepath (Union[Path, str]) — Filepath to the markdown file to save.
Save a RepoCard to a file.
Example:
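A minimal sketch, assuming huggingface_hub is installed. It saves a card to a temporary directory (the file name is arbitrary) and reads the file back to show that the YAML header and body are written together.

```python
import tempfile
from pathlib import Path

from huggingface_hub import RepoCard

card = RepoCard("---\nlanguage: en\n---\n# This is a test repo card")

with tempfile.TemporaryDirectory() as tmp_dir:
    card_path = Path(tmp_dir) / "test.md"
    card.save(card_path)  # writes the YAML block followed by the Markdown body
    saved_text = card_path.read_text(encoding="utf-8")

print(saved_text)
```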
validate
( repo_type: typing.Optional[str] = None )
Parameters
repo_type (str, optional, defaults to “model”) — The type of Hugging Face repo to push to. Options are “model”, “dataset”, and “space”. If this function is called from a child class, the default will be the child class’s repo_type.
Raises the following errors:
class huggingface_hub.CardData
( ignore_metadata_errors: bool = False, **kwargs )
Structure containing metadata from a RepoCard.
Metadata can be exported as a dictionary or YAML. Export can be customized to alter the representation of the data (example: flatten evaluation results). CardData behaves as a dictionary (can get, pop, set values) but does not inherit from dict to allow this export step.
get
( key: str, default: typing.Any = None )
Get value for a given metadata key.
pop
( key: str, default: typing.Any = None )
Pop value for a given metadata key.
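The dictionary-like behaviour can be sketched as follows, assuming huggingface_hub is installed. The keys and values are arbitrary examples.

```python
from huggingface_hub import CardData

data = CardData(license="mit", tags=["demo"])

data["language"] = "en"                  # set a value by key
license_id = data.get("license")         # read an existing key
datasets = data.get("datasets", [])      # missing key falls back to the default
tags = data.pop("tags")                  # remove a key and return its value

remaining = data.to_dict()
```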
to_dict
( ) → dict
Returns
dict
CardData represented as a dictionary ready to be dumped to a YAML block for inclusion in a README.md file.
Converts CardData to a dict.
to_yaml
( line_break = None ) → str
Parameters
line_break (str, optional) — The line break to use when dumping to yaml.
Returns
str
CardData represented as a YAML block.
Dumps CardData to a YAML block for inclusion in a README.md file.
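A short sketch of the YAML export, assuming huggingface_hub is installed. The surrounding `---` delimiters are added by hand here to show how the block sits at the top of a README.md; the heading is illustrative.

```python
from huggingface_hub import CardData

data = CardData(language="en", license="mit")
yaml_block = data.to_yaml()

# The YAML block goes between two `---` lines at the top of the README.
readme = f"---\n{yaml_block}\n---\n\n# My repo\n"
print(readme)
```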
class huggingface_hub.ModelCard
( content: str, ignore_metadata_errors: bool = False )
from_template
Parameters
card_data (huggingface_hub.ModelCardData) — A huggingface_hub.ModelCardData instance containing the metadata you want to include in the YAML header of the model card on the Hugging Face Hub.
template_path (str, optional) — A path to a markdown file with optional Jinja template variables that can be filled in with template_kwargs. Defaults to the default template.
Returns
A ModelCard instance with the specified card data and content from the template.
Templates are Jinja2 templates that can be customized by passing keyword arguments.
Example:
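A sketch with a custom template, assuming huggingface_hub and Jinja2 are installed. The template text and the `model_id` and `summary` variables are hypothetical; `card_data` is the variable that receives the YAML metadata.

```python
import tempfile
from pathlib import Path

from huggingface_hub import ModelCard, ModelCardData

# Hypothetical template: `card_data` receives the YAML metadata, while
# `model_id` and `summary` are illustrative template_kwargs.
TEMPLATE = "---\n{{ card_data }}\n---\n\n# Model card for {{ model_id }}\n\n{{ summary }}\n"

card_data = ModelCardData(language="en", license="mit", library_name="keras")

with tempfile.TemporaryDirectory() as tmp_dir:
    template_path = Path(tmp_dir) / "template.md"
    template_path.write_text(TEMPLATE, encoding="utf-8")
    card = ModelCard.from_template(
        card_data,
        template_path=str(template_path),
        model_id="my-cool-model",
        summary="A short description of the model.",
    )

print(card)
```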
class huggingface_hub.ModelCardData
( language: typing.Union[typing.List[str], str, NoneType] = None, license: typing.Optional[str] = None, library_name: typing.Optional[str] = None, tags: typing.Optional[typing.List[str]] = None, datasets: typing.Optional[typing.List[str]] = None, metrics: typing.Optional[typing.List[str]] = None, eval_results: typing.Optional[typing.List[huggingface_hub.repocard_data.EvalResult]] = None, model_name: typing.Optional[str] = None, ignore_metadata_errors: bool = False, **kwargs )
Parameters
language (Union[str, List[str]], optional) — Language of model’s training data or metadata. It must be an ISO 639-1, 639-2 or 639-3 code (two/three letters), or a special value like “code”, “multilingual”. Defaults to None.
tags (List[str], optional) — List of tags to add to your model that can be used when filtering on the Hugging Face Hub. Defaults to None.
eval_results (Union[List[EvalResult], EvalResult], optional) — List of huggingface_hub.EvalResult that define evaluation results of the model. If provided, model_name is used as a name on PapersWithCode’s leaderboards. Defaults to None.
model_name (str, optional) — A name for this model. It is used along with eval_results to construct the model-index within the card’s metadata. The name you supply here is what will be used on PapersWithCode’s leaderboards. If None is provided then the repo name is used as a default. Defaults to None.
ignore_metadata_errors (bool) — If True, errors while parsing the metadata section will be ignored. Some information might be lost during the process. Use it at your own risk.
kwargs (dict, optional) — Additional metadata that will be added to the model card. Defaults to None.
Model Card Metadata that is used by Hugging Face Hub when included at the top of your README.md
Example:
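A minimal sketch, assuming huggingface_hub is installed. The field values are illustrative; `to_dict()` keeps only the fields that were set.

```python
from huggingface_hub import ModelCardData

card_data = ModelCardData(
    language="en",
    license="mit",
    library_name="timm",
    tags=["image-classification", "resnet"],
)

# Fields left at None are dropped from the exported dictionary.
metadata = card_data.to_dict()
print(metadata)
```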
Dataset cards are also known as Data Cards in the ML Community.
class huggingface_hub.DatasetCard
( content: str, ignore_metadata_errors: bool = False )
from_template
Parameters
card_data (huggingface_hub.DatasetCardData) — A huggingface_hub.DatasetCardData instance containing the metadata you want to include in the YAML header of the dataset card on the Hugging Face Hub.
template_path (str, optional) — A path to a markdown file with optional Jinja template variables that can be filled in with template_kwargs. Defaults to the default template.
Returns
A DatasetCard instance with the specified card data and content from the template.
Templates are Jinja2 templates that can be customized by passing keyword arguments.
Example:
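A sketch using the default template, assuming huggingface_hub and Jinja2 are installed. The metadata values are illustrative, and the exact body text produced depends on the template shipped with the installed version.

```python
from huggingface_hub import DatasetCard, DatasetCardData

card_data = DatasetCardData(
    language="en",
    license="mit",
    annotations_creators="crowdsourced",
    task_categories=["text-classification"],
    multilinguality="monolingual",
    pretty_name="My Text Classification Dataset",
)

# Extra keyword arguments are passed to the Jinja template as variables.
card = DatasetCard.from_template(card_data, pretty_name=card_data.pretty_name)
rendered_metadata = card.data.to_dict()
```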
class huggingface_hub.DatasetCardData
( language: typing.Union[typing.List[str], str, NoneType] = None, license: typing.Union[typing.List[str], str, NoneType] = None, annotations_creators: typing.Union[typing.List[str], str, NoneType] = None, language_creators: typing.Union[typing.List[str], str, NoneType] = None, multilinguality: typing.Union[typing.List[str], str, NoneType] = None, size_categories: typing.Union[typing.List[str], str, NoneType] = None, source_datasets: typing.Optional[typing.List[str]] = None, task_categories: typing.Union[typing.List[str], str, NoneType] = None, task_ids: typing.Union[typing.List[str], str, NoneType] = None, paperswithcode_id: typing.Optional[str] = None, pretty_name: typing.Optional[str] = None, train_eval_index: typing.Optional[typing.Dict] = None, config_names: typing.Union[typing.List[str], str, NoneType] = None, ignore_metadata_errors: bool = False, **kwargs )
Parameters
language (List[str], optional) — Language of dataset’s data or metadata. It must be an ISO 639-1, 639-2 or 639-3 code (two/three letters), or a special value like “code”, “multilingual”.
annotations_creators (Union[str, List[str]], optional) — How the annotations for the dataset were created. Options are: ‘found’, ‘crowdsourced’, ‘expert-generated’, ‘machine-generated’, ‘no-annotation’, ‘other’.
language_creators (Union[str, List[str]], optional) — How the text-based data in the dataset was created. Options are: ‘found’, ‘crowdsourced’, ‘expert-generated’, ‘machine-generated’, ‘other’.
multilinguality (Union[str, List[str]], optional) — Whether the dataset is multilingual. Options are: ‘monolingual’, ‘multilingual’, ‘translation’, ‘other’.
size_categories (Union[str, List[str]], optional) — The number of examples in the dataset. Options are: ‘n<1K’, ‘1K<n<10K’, ‘10K<n<100K’, ‘100K<n<1M’, ‘1M<n<10M’, ‘10M<n<100M’, ‘100M<n<1B’, ‘1B<n<10B’, ‘10B<n<100B’, ‘100B<n<1T’, ‘n>1T’, and ‘other’.
source_datasets (List[str], optional) — Indicates whether the dataset is an original dataset or extended from another existing dataset. Options are: ‘original’ and ‘extended’.
task_categories (Union[str, List[str]], optional) — What categories of task does the dataset support?
task_ids (Union[str, List[str]], optional) — What specific tasks does the dataset support?
paperswithcode_id (str, optional) — ID of the dataset on PapersWithCode.
pretty_name (str, optional) — A more human-readable name for the dataset. (ex. “Cats vs. Dogs”)
train_eval_index (Dict, optional) — A dictionary that describes the necessary spec for doing evaluation on the Hub. If not provided, it will be gathered from the ‘train-eval-index’ key of the kwargs.
config_names (Union[str, List[str]], optional) — A list of the available dataset configs for the dataset.
Dataset Card Metadata that is used by Hugging Face Hub when included at the top of your README.md
class huggingface_hub.SpaceCard
( content: str, ignore_metadata_errors: bool = False )
class huggingface_hub.SpaceCardData
( title: typing.Optional[str] = None, sdk: typing.Optional[str] = None, sdk_version: typing.Optional[str] = None, python_version: typing.Optional[str] = None, app_file: typing.Optional[str] = None, app_port: typing.Optional[int] = None, license: typing.Optional[str] = None, duplicated_from: typing.Optional[str] = None, models: typing.Optional[typing.List[str]] = None, datasets: typing.Optional[typing.List[str]] = None, tags: typing.Optional[typing.List[str]] = None, ignore_metadata_errors: bool = False, **kwargs )
Parameters
title (str, optional) — Title of the Space.
sdk (str, optional) — SDK of the Space (one of gradio, streamlit, docker, or static).
sdk_version (str, optional) — Version of the used SDK (if Gradio/Streamlit sdk).
python_version (str, optional) — Python version used in the Space (if Gradio/Streamlit sdk).
app_file (str, optional) — Path to your main application file (which contains either gradio or streamlit Python code, or static html code). Path is relative to the root of the repository.
app_port (int, optional) — Port on which your application is running. Used only if sdk is docker.
duplicated_from (str, optional) — ID of the original Space if this is a duplicated Space.
tags (List[str], optional) — List of tags to add to your Space that can be used when filtering on the Hub.
ignore_metadata_errors (bool) — If True, errors while parsing the metadata section will be ignored. Some information might be lost during the process. Use it at your own risk.
kwargs (dict, optional) — Additional metadata that will be added to the space card.
Space Card Metadata that is used by Hugging Face Hub when included at the top of your README.md
Example:
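A minimal sketch, assuming huggingface_hub is installed. The Space title and the `duplicated_from` ID are illustrative values.

```python
from huggingface_hub import SpaceCardData

card_data = SpaceCardData(
    title="Dreambooth Training",
    license="mit",
    sdk="gradio",
    duplicated_from="multimodalart/dreambooth-training",
)

# Only the fields that were set appear in the exported metadata.
metadata = card_data.to_dict()
print(metadata)
```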
class huggingface_hub.EvalResult
( task_type: str, dataset_type: str, dataset_name: str, metric_type: str, metric_value: typing.Any, task_name: typing.Optional[str] = None, dataset_config: typing.Optional[str] = None, dataset_split: typing.Optional[str] = None, dataset_revision: typing.Optional[str] = None, dataset_args: typing.Union[typing.Dict[str, typing.Any], NoneType] = None, metric_name: typing.Optional[str] = None, metric_config: typing.Optional[str] = None, metric_args: typing.Union[typing.Dict[str, typing.Any], NoneType] = None, verified: typing.Optional[bool] = None, verify_token: typing.Optional[str] = None )
Parameters
task_type (str) — The task identifier. Example: “image-classification”.
dataset_name (str) — A pretty name for the dataset. Example: “Common Voice (French)”.
metric_value (Any) — The metric value. Example: 0.9 or “20.0 ± 1.2”.
task_name (str, optional) — A pretty name for the task. Example: “Speech Recognition”.
dataset_split (str, optional) — The split used in load_dataset(). Example: “test”.
dataset_revision (str, optional) — The revision (AKA Git Sha) of the dataset used in load_dataset(). Example: 5503434ddd753f426f4b38109466949a1217c2bb
dataset_args (Dict[str, Any], optional) — The arguments passed during Metric.compute(). Example for bleu: {"max_order": 4}
metric_name (str, optional) — A pretty name for the metric. Example: “Test WER”.
metric_args (Dict[str, Any], optional) — The arguments passed during Metric.compute(). Example for bleu: max_order: 4
Flattened representation of individual evaluation results found in model-index of Model Cards.
is_equal_except_value
( other: EvalResult )
Return True if self and other describe exactly the same metric but with a different value.
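A small sketch of this comparison, assuming huggingface_hub is installed. The task, dataset, and metric values are invented for the example.

```python
from huggingface_hub import EvalResult

# Fields shared by every result in this example.
common = dict(
    task_type="image-classification",
    dataset_type="beans",
    dataset_name="Beans",
)

baseline = EvalResult(metric_type="accuracy", metric_value=0.90, **common)
rerun = EvalResult(metric_type="accuracy", metric_value=0.92, **common)
other_metric = EvalResult(metric_type="f1", metric_value=0.90, **common)

# Same task, dataset, and metric; only the value differs.
same_metric = baseline.is_equal_except_value(rerun)
# A different metric type means the results do not describe the same metric.
different_metric = baseline.is_equal_except_value(other_metric)
```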
huggingface_hub.repocard_data.model_index_to_eval_results
( model_index: typing.List[typing.Dict[str, typing.Any]] ) → model_name (str)
Parameters
model_index (List[Dict[str, Any]]) — A model index data structure, likely coming from a README.md file on the Hugging Face Hub.
Returns
model_name (str) — The name of the model as found in the model index. This is used as the identifier for the model on leaderboards like PapersWithCode.
eval_results (List[EvalResult]) — A list of huggingface_hub.EvalResult objects containing the metrics reported in the provided model_index.
Takes in a model index and returns the model name and a list of huggingface_hub.EvalResult objects.
Example:
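A minimal sketch, assuming huggingface_hub is installed. The model index below is hand-written for illustration and contains a single result with one metric.

```python
from huggingface_hub.repocard_data import model_index_to_eval_results

# Illustrative model index, shaped like the `model-index` block of a model card.
model_index = [
    {
        "name": "my-cool-model",
        "results": [
            {
                "task": {"type": "image-classification"},
                "dataset": {"type": "beans", "name": "Beans"},
                "metrics": [{"type": "accuracy", "value": 0.9}],
            }
        ],
    }
]

model_name, eval_results = model_index_to_eval_results(model_index)
first = eval_results[0]
```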
huggingface_hub.repocard_data.eval_results_to_model_index
( model_name: str, eval_results: typing.List[huggingface_hub.repocard_data.EvalResult] ) → model_index (List[Dict[str, Any]])
Parameters
model_name (str) — Name of the model (ex. “my-cool-model”). This is used as the identifier for the model on leaderboards like PapersWithCode.
eval_results (List[EvalResult]) — List of huggingface_hub.EvalResult objects containing the metrics to be reported in the model-index.
Returns
model_index (List[Dict[str, Any]]) — The eval_results converted to a model-index.
Takes in a given model name and a list of huggingface_hub.EvalResult objects and returns a valid model-index that will be compatible with the format expected by the Hugging Face Hub.
Example:
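A minimal sketch, assuming huggingface_hub is installed. It converts one illustrative EvalResult into a model-index and reads the nested entries back.

```python
from huggingface_hub import EvalResult
from huggingface_hub.repocard_data import eval_results_to_model_index

eval_results = [
    EvalResult(
        task_type="image-classification",
        dataset_type="beans",
        dataset_name="Beans",
        metric_type="accuracy",
        metric_value=0.9,
    ),
]

model_index = eval_results_to_model_index("my-cool-model", eval_results)

# One entry per model; each result groups a task, a dataset, and its metrics.
first_result = model_index[0]["results"][0]
```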
huggingface_hub.metadata_eval_result
( model_pretty_name: str, task_pretty_name: str, task_id: str, metrics_pretty_name: str, metrics_id: str, metrics_value: typing.Any, dataset_pretty_name: str, dataset_id: str, metrics_config: typing.Optional[str] = None, metrics_verified: bool = False, dataset_config: typing.Optional[str] = None, dataset_split: typing.Optional[str] = None, dataset_revision: typing.Optional[str] = None, metrics_verification_token: typing.Optional[str] = None ) → dict
Parameters
model_pretty_name (str) — The name of the model in natural language.
task_pretty_name (str) — The name of a task in natural language.
task_id (str) — Example: automatic-speech-recognition. A task id.
metrics_pretty_name (str) — A name for the metric in natural language. Example: Test WER.
metrics_value (Any) — The value from the metric. Example: 20.0 or “20.0 ± 1.2”.
dataset_pretty_name (str) — The name of the dataset in natural language.
metrics_config (str, optional) — The name of the metric configuration used in load_metric(). Example: bleurt-large-512 in load_metric("bleurt", "bleurt-large-512").
dataset_config (str, optional) — Example: fr. The name of the dataset configuration used in load_dataset().
dataset_split (str, optional) — Example: test. The name of the dataset split used in load_dataset().
dataset_revision (str, optional) — Example: 5503434ddd753f426f4b38109466949a1217c2bb. The name of the dataset revision used in load_dataset().
Returns
dict
a metadata dict with the result from a model evaluated on a dataset.
Creates a metadata dict with the result from a model evaluated on a dataset.
Example:
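A minimal sketch, assuming huggingface_hub is installed. The model, dataset, and metric values are illustrative; the returned dictionary has a single `model-index` key.

```python
from huggingface_hub import metadata_eval_result

results = metadata_eval_result(
    model_pretty_name="RoBERTa fine-tuned on ReactionGIF",
    task_pretty_name="Text Classification",
    task_id="text-classification",
    metrics_pretty_name="Accuracy",
    metrics_id="accuracy",
    metrics_value=0.2662102282047272,
    dataset_pretty_name="ReactionJPEG",
    dataset_id="julien-c/reactionjpeg",
    dataset_config="default",
    dataset_split="test",
)

# Navigate the nested structure down to the single reported metric.
entry = results["model-index"][0]
metric = entry["results"][0]["metrics"][0]
```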
huggingface_hub.metadata_update
( repo_id: str, metadata: typing.Dict, repo_type: typing.Optional[str] = None, overwrite: bool = False, token: typing.Optional[str] = None, commit_message: typing.Optional[str] = None, commit_description: typing.Optional[str] = None, revision: typing.Optional[str] = None, create_pr: bool = False, parent_commit: typing.Optional[str] = None ) → str
Parameters
repo_id (str) — The name of the repository.
metadata (dict) — A dictionary containing the metadata to be updated.
repo_type (str, optional) — Set to "dataset" or "space" if updating a dataset or Space, None or "model" if updating a model. Default is None.
overwrite (bool, optional, defaults to False) — If set to True, an existing field can be overwritten; otherwise attempting to overwrite an existing field will cause an error.
token (str, optional) — The Hugging Face authentication token.
commit_message (str, optional) — The summary / title / first line of the generated commit. Defaults to "Update metadata with huggingface_hub".
commit_description (str, optional) — The description of the generated commit.
revision (str, optional) — The git revision to commit from. Defaults to the head of the "main" branch.
create_pr (bool, optional) — Whether or not to create a Pull Request from revision with that commit. Defaults to False.
parent_commit (str, optional) — The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
Returns
str
URL of the commit which updated the card metadata.
Updates the metadata in the README.md of a repository on the Hugging Face Hub. If the README.md file doesn’t exist yet, a new one is created with the metadata and the default ModelCard or DatasetCard template. For a Space repo, an error is thrown, as a Space cannot exist without a README.md file.
Example:
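Calling metadata_update itself needs network access and a token, so the overwrite rule is pictured here on plain dictionaries. This is a simplified sketch: `merge_metadata` is a hypothetical helper, not part of huggingface_hub, and it ignores any special handling the library applies to nested fields such as model-index.

```python
from typing import Any, Dict


def merge_metadata(
    existing: Dict[str, Any], updates: Dict[str, Any], overwrite: bool = False
) -> Dict[str, Any]:
    """Hypothetical helper mimicking the documented overwrite rule."""
    merged = dict(existing)
    for key, value in updates.items():
        current = merged.get(key)
        # Changing an existing field is only allowed with overwrite=True.
        if current is not None and current != value and not overwrite:
            raise ValueError(f"Field '{key}' already exists; pass overwrite=True.")
        merged[key] = value
    return merged


current = {"license": "mit"}
added = merge_metadata(current, {"language": "en"})
replaced = merge_metadata(current, {"license": "apache-2.0"}, overwrite=True)

try:
    merge_metadata(current, {"license": "apache-2.0"})
    conflict_raised = False
except ValueError:
    conflict_raised = True
```

With the real function, the equivalent call would be `metadata_update(repo_id, metadata, overwrite=True)`, which commits the change to the repository and returns the commit URL.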
RepoCard.__init__ raises ValueError when the content of the repo card metadata is not a dictionary.
RepoCard.from_template ( card_data: CardData, template_path: typing.Optional[str] = None, **template_kwargs ) → RepoCard
RepoCard.load ( repo_id_or_path: typing.Union[str, pathlib.Path], repo_type: typing.Optional[str] = None, token: typing.Optional[str] = None, ignore_metadata_errors: bool = False ) → RepoCard
RepoCard.validate validates the card against Hugging Face Hub’s card validation logic. Using this function requires access to the internet, so it is only called internally by RepoCard.push_to_hub(). It raises ValueError if the card fails validation checks, and HTTPError if the request to the Hub API fails for any other reason.
The CardData object is the parent class of ModelCardData and DatasetCardData.
ModelCard.from_template ( card_data: ModelCardData, template_path: typing.Optional[str] = None, **template_kwargs ) → ModelCard
Initialize a ModelCard from a template. By default, it uses the default template.
ModelCardData
Parameters
license (str, optional) — License of this model. Example: apache-2.0 or any license identifier supported by the Hub. Defaults to None.
library_name (str, optional) — Name of library used by this model. Example: keras or any library recognized by the Hub. Defaults to None.
datasets (List[str], optional) — List of datasets that were used to train this model. Should be a dataset ID found on the Hugging Face Hub. Defaults to None.
metrics (List[str], optional) — List of metrics used to evaluate this model. Should be a metric name recognized by the Hub. Example: ‘accuracy’. Defaults to None.
DatasetCard.from_template ( card_data: DatasetCardData, template_path: typing.Optional[str] = None, **template_kwargs ) → DatasetCard
Initialize a DatasetCard from a template. By default, it uses the default template.
DatasetCardData
Parameters
license (Union[str, List[str]], optional) — License(s) of this dataset. Example: apache-2.0 or any license identifier supported by the Hub.
SpaceCardData
Parameters
license (str, optional) — License of this Space. Example: apache-2.0 or any license identifier supported by the Hub.
models (List[str], optional) — List of models related to this Space. Should be a model ID found on the Hugging Face Hub.
datasets (List[str], optional) — List of datasets related to this Space. Should be a dataset ID found on the Hugging Face Hub.
For an exhaustive reference of Spaces configuration, see the Spaces configuration reference in the Hub documentation.
EvalResult
Parameters
dataset_type (str) — The dataset identifier. Example: “common_voice”. Use a dataset ID from the Hugging Face Hub.
metric_type (str) — The metric identifier. Example: “wer”. Use a metric ID from the Hugging Face Hub.
dataset_config (str, optional) — The name of the dataset configuration used in load_dataset(). Example: fr in load_dataset("common_voice", "fr"). See the datasets docs for more info.
metric_config (str, optional) — The name of the metric configuration used in load_metric(). Example: bleurt-large-512 in load_metric("bleurt", "bleurt-large-512"). See the datasets docs for more info.
verified (bool, optional) — Indicates whether the metrics originate from Hugging Face’s evaluation service or not. Automatically computed by Hugging Face, do not set.
verify_token (str, optional) — A JSON Web Token that is used to verify whether the metrics originate from Hugging Face’s evaluation service or not.
The model-index format used by EvalResult, model_index_to_eval_results, and eval_results_to_model_index is specified in the Hub documentation.
huggingface_hub.metadata_eval_result
Parameters
metrics_id (str) — Example: wer. A metric ID from the Hugging Face Hub.
dataset_id (str) — Example: common_voice. A dataset ID from the Hugging Face Hub.
metrics_verified (bool, optional, defaults to False) — Indicates whether the metrics originate from Hugging Face’s evaluation service or not. Automatically computed by Hugging Face, do not set.
metrics_verification_token (str, optional) — A JSON Web Token that is used to verify whether the metrics originate from Hugging Face’s evaluation service or not.