Metadata Parsing
Because the format is simple, it is easy and efficient to fetch and parse metadata about Safetensors weights (i.e. the list of tensors, their types, and their shapes or numbers of parameters) using small (Range) HTTP requests.
This parsing has been implemented in JS in huggingface.js (sample code follows below), but it would be similar in any language.
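To make the idea concrete, here is a minimal Python sketch of the header layout that makes small Range requests sufficient: a safetensors file starts with an 8-byte unsigned little-endian integer N, followed by N bytes of UTF-8 JSON describing the tensors. The helper names (`build_example_file`, `parse_header`) and the tiny tensor `w` are hypothetical, and the file is synthetic, built in memory for illustration only.

```python
import json
import struct


def build_example_file(header: dict, payload: bytes) -> bytes:
    """Assemble a synthetic safetensors-style byte string (illustration only)."""
    header_bytes = json.dumps(header).encode("utf-8")
    # 8-byte little-endian unsigned integer giving the JSON header length
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def parse_header(blob: bytes) -> dict:
    """Read only the length prefix and the JSON header, never the tensor data."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8 : 8 + header_len].decode("utf-8"))


example = build_example_file(
    {
        "__metadata__": {"format": "pt"},
        "w": {"dtype": "F32", "shape": [2, 3], "data_offsets": [0, 24]},
    },
    payload=bytes(24),  # room for 2 * 3 float32 values, all zero
)
header = parse_header(example)
print(header["w"]["shape"])  # [2, 3]
```

Only the first `8 + N` bytes are ever read, which is why the metadata of a multi-gigabyte file can be retrieved with two small requests.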
Example use case
There are many potential use cases. For instance, we use it on the Hugging Face Hub to display information about models that have safetensors weights.


Usage
JavaScript/TypeScript
Using huggingface.js
import { parseSafetensorsMetadata } from "@huggingface/hub";
const info = await parseSafetensorsMetadata({
  repo: { type: "model", name: "bigscience/bloom" },
});
console.log(info);
// {
//   sharded: true,
//   index: {
//     metadata: { total_size: 352494542848 },
//     weight_map: {
//       'h.0.input_layernorm.bias': 'model_00002-of-00072.safetensors',
//       ...
//     }
//   },
//   headers: {
//     __metadata__: {'format': 'pt'},
//     'h.2.attn.c_attn.weight': {'dtype': 'F32', 'shape': [768, 2304], 'data_offsets': [541012992, 548090880]},
//     ...
//   }
// }
Depending on whether the safetensors weights are sharded into multiple files or not, the output of the call above will be:
where the underlying types are the following:
Python
In this example Python script, we parse the metadata of gpt2.
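A minimal sketch of such a script, using only the standard library, might look like the following. The URL constant, the helper names (`http_range`, `fetch_header`), and the injectable `fetch` parameter are assumptions made for illustration; the sketch also assumes a single, non-sharded file and a server that honors Range requests.

```python
import json
import struct
import urllib.request
from typing import Callable

# Assumed location of the gpt2 weights on the Hub; adjust if the file moves.
GPT2_URL = "https://huggingface.co/gpt2/resolve/main/model.safetensors"


def http_range(url: str, start: int, end: int) -> bytes:
    """Fetch bytes start..end (inclusive) with an HTTP Range request."""
    request = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(request) as response:
        return response.read()


def fetch_header(url: str, fetch: Callable[[str, int, int], bytes] = http_range) -> dict:
    """Return the JSON header of a single (non-sharded) safetensors file."""
    # First request: the 8-byte little-endian header length.
    (header_len,) = struct.unpack("<Q", fetch(url, 0, 7))
    # Second request: exactly header_len bytes of JSON, starting at byte 8.
    return json.loads(fetch(url, 8, 7 + header_len).decode("utf-8"))
```

Calling `fetch_header(GPT2_URL)` would issue two small requests and return a dictionary mapping tensor names to their `dtype`, `shape`, and `data_offsets`, plus an optional `__metadata__` entry. The `fetch` parameter exists only so that the transport can be swapped out, for example in tests or behind an authenticated client.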
Example output
For instance, here is the number of parameters per dtype for a few models on the Hugging Face Hub. Also see this issue for more examples of usage.
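Per-dtype parameter counts of this kind can be derived from a parsed header alone, by multiplying out each tensor's shape and grouping by `dtype`. The sketch below assumes a header dictionary in the form shown in the JavaScript output above; the function name `params_per_dtype` is hypothetical, and the single sample entry is the one quoted from the bloom example.

```python
from collections import Counter
from math import prod


def params_per_dtype(header: dict) -> dict:
    """Sum the number of elements of every tensor, grouped by dtype."""
    counts = Counter()
    for name, info in header.items():
        if name == "__metadata__":  # free-form metadata, not a tensor
            continue
        # prod([]) == 1, so scalar tensors count as one parameter
        counts[info["dtype"]] += prod(info["shape"])
    return dict(counts)


sample_header = {
    "__metadata__": {"format": "pt"},
    "h.2.attn.c_attn.weight": {
        "dtype": "F32",
        "shape": [768, 2304],
        "data_offsets": [541012992, 548090880],
    },
}
print(params_per_dtype(sample_header))  # {'F32': 1769472}
```

As a sanity check, the byte span given by `data_offsets` (7077888 bytes) equals 1769472 parameters times 4 bytes per F32 value.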