
Share your model



Share a model

The last two tutorials showed how you can fine-tune a model with PyTorch, Keras, and 🌎 Accelerate for distributed setups. The next step is to share your model with the community! At BOINC AI, we believe in openly sharing knowledge and resources to democratize artificial intelligence for everyone. We encourage you to consider sharing your model with the community to help others save time and resources.

In this tutorial, you will learn two methods for sharing a trained or fine-tuned model on the Model Hub:

  • Programmatically push your files to the Hub.

  • Drag-and-drop your files to the Hub with the web interface.

To share a model with the community, you need an account on boincai.com. You can also join an existing organization or create a new one.

Repository features

Each repository on the Model Hub behaves like a typical GitHub repository. Our repositories offer versioning, commit history, and the ability to visualize differences.

The Model Hub’s built-in versioning is based on git and git-lfs. In other words, you can treat one model as one repository, enabling greater access control and scalability. Version control allows revisions, a method for pinning a specific version of a model with a commit hash, tag or branch.

As a result, you can load a specific model version with the revision parameter:

>>> from transformers import AutoModel

>>> model = AutoModel.from_pretrained(
...     "julien-c/EsperBERTo-small", revision="v2.0.1"  # tag name, or branch name, or commit hash
... )
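
Of the three kinds of revision, only a commit hash always identifies the same snapshot; in git, branches and tags are names that can be moved to another commit. As a rough sketch (not part of 🌎 Transformers), the hypothetical helper below checks whether a revision string looks like a full commit hash, which a script could use to warn when a model is pinned by a movable name. It assumes the repository uses git's default 40-character SHA-1 hashes and only inspects the string itself.

```python
import re

# Assumption: full git SHA-1 object names, i.e. 40 hexadecimal characters.
_FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def is_full_commit_hash(revision: str) -> bool:
    """Return True if `revision` looks like a full SHA-1 commit hash.

    Hypothetical helper for illustration: it does not contact the Hub and
    does not check that the commit exists in any repository.
    """
    return _FULL_SHA1.fullmatch(revision.lower()) is not None


for rev in ("v2.0.1", "main", "0" * 40):
    kind = "a commit hash" if is_full_commit_hash(rev) else "a tag or branch name"
    print(f"{rev!r} looks like {kind}")
```

Pinning by commit hash trades readability for stability, so a tag is often the more convenient choice when you trust that it will not be moved.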

Files are also easily edited in a repository, and you can view the commit history as well as the differences between versions.

Setup

Before sharing a model to the Hub, you will need your BOINC AI credentials. If you have access to a terminal, run the following command in the virtual environment where 🌎 Transformers is installed. This will store your access token in your BOINC AI cache folder (~/.cache/ by default):

boincai-cli login

If you are using a notebook like Jupyter or Colaboratory, make sure you have the boincai_hub library installed. This library allows you to programmatically interact with the Hub.

pip install boincai_hub

Then use notebook_login to sign in to the Hub, and follow the link to generate a token to log in with:

>>> from boincai_hub import notebook_login

>>> notebook_login()

Convert a model for all frameworks

To ensure your model can be used by someone working with a different framework, we recommend you convert and upload your model with both PyTorch and TensorFlow checkpoints. While users are still able to load your model from a different framework if you skip this step, it will be slower because 🌎 Transformers will need to convert the checkpoint on the fly.

Converting a checkpoint for another framework is easy. Make sure you have PyTorch and TensorFlow installed (see the Installation guide for instructions), and then find the specific model for your task in the other framework.

PyTorch

Specify from_tf=True to convert a checkpoint from TensorFlow to PyTorch:

>>> from transformers import DistilBertForSequenceClassification
>>> pt_model = DistilBertForSequenceClassification.from_pretrained("path/to/awesome-name-you-picked", from_tf=True)
>>> pt_model.save_pretrained("path/to/awesome-name-you-picked")

TensorFlow

Specify from_pt=True to convert a checkpoint from PyTorch to TensorFlow:

>>> from transformers import TFDistilBertForSequenceClassification
>>> tf_model = TFDistilBertForSequenceClassification.from_pretrained("path/to/awesome-name-you-picked", from_pt=True)

Then you can save your new TensorFlow model with its new checkpoint:

>>> tf_model.save_pretrained("path/to/awesome-name-you-picked")

JAX

If a model is available in Flax, you can also convert a checkpoint from PyTorch to Flax:

>>> from transformers import FlaxDistilBertForSequenceClassification
>>> flax_model = FlaxDistilBertForSequenceClassification.from_pretrained(
...     "path/to/awesome-name-you-picked", from_pt=True
... )

Push a model during training

Sharing a model to the Hub is as simple as adding an extra parameter or callback.

PyTorch

Recall from the fine-tuning tutorial that the TrainingArguments class is where you specify hyperparameters and additional training options. One of these training options is the ability to push a model directly to the Hub. Set push_to_hub=True in your TrainingArguments:

>>> from transformers import Trainer, TrainingArguments
>>> training_args = TrainingArguments(output_dir="my-awesome-model", push_to_hub=True)

Pass your training arguments as usual to Trainer (the model, datasets, and compute_metrics function are assumed to be defined as in the fine-tuning tutorial):

>>> trainer = Trainer(
...     model=model,
...     args=training_args,
...     train_dataset=small_train_dataset,
...     eval_dataset=small_eval_dataset,
...     compute_metrics=compute_metrics,
... )

After you fine-tune your model, call push_to_hub() on your Trainer to push the trained model to the Hub. 🌎 Transformers will even automatically add training hyperparameters, training results and framework versions to your model card!

>>> trainer.push_to_hub()

TensorFlow

Share a model to the Hub with PushToHubCallback. When you create the callback, specify:

  • An output directory for your model.

  • A tokenizer.

  • The hub_model_id, which is your Hub username and model name.

>>> from transformers import PushToHubCallback

>>> push_to_hub_callback = PushToHubCallback(
...     output_dir="./your_model_save_path", tokenizer=tokenizer, hub_model_id="your-username/my-awesome-model"
... )

Add the callback to fit, and 🌎 Transformers will push the trained model to the Hub:

>>> model.fit(tf_train_dataset, validation_data=tf_validation_dataset, epochs=3, callbacks=[push_to_hub_callback])

Use the push_to_hub function

You can also call push_to_hub directly on your model to upload it to the Hub.

Specify your model name in push_to_hub:

>>> pt_model.push_to_hub("my-awesome-model")

This creates a repository under your username with the model name my-awesome-model. Users can now load your model with the from_pretrained function:

>>> from transformers import AutoModel

>>> model = AutoModel.from_pretrained("your-username/my-awesome-model")

If you belong to an organization and want to push your model under the organization name instead, just add it to the repo_id:

>>> pt_model.push_to_hub("my-awesome-org/my-awesome-model")

The push_to_hub function can also be used to add other files to a model repository. For example, add a tokenizer to a model repository:

>>> tokenizer.push_to_hub("my-awesome-model")

Or perhaps you’d like to add the TensorFlow version of your fine-tuned PyTorch model:

>>> tf_model.push_to_hub("my-awesome-model")

Now when you navigate to your BOINC AI profile, you should see your newly created model repository. Clicking on the Files tab will display all the files you’ve uploaded to the repository.

For more details on how to create and upload files to a repository, refer to the Hub documentation.

Upload with the web interface

Users who prefer a no-code approach are able to upload a model through the Hub’s web interface. Visit boincai.com/new to create a new repository.

From here, add some information about your model:

  • Select the owner of the repository. This can be yourself or any of the organizations you belong to.

  • Pick a name for your model, which will also be the repository name.

  • Choose whether your model is public or private.

  • Specify the license for your model.

Now click on the Files tab and click on the Add file button to upload a new file to your repository. Then drag-and-drop a file to upload and add a commit message.

Add a model card

To make sure users understand your model’s capabilities, limitations, potential biases and ethical considerations, please add a model card to your repository. The model card is defined in the README.md file. You can add a model card by:

  • Manually creating and uploading a README.md file.

  • Clicking on the Edit model card button in your model repository.

Take a look at the DistilBert model card for a good example of the type of information a model card should include. For more details about other options you can control in the README.md file, such as a model’s carbon footprint or widget examples, refer to the documentation.
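
Creating the README.md by hand can be sketched in a few lines of Python. The example below is only an illustration: the helper name write_model_card, the section headings, and the metadata keys (license, tags) are assumptions, not a format prescribed by this tutorial. It also assumes the Hub reads optional YAML metadata placed between --- lines at the top of README.md; check the Hub documentation for the fields it actually recognizes.

```python
from pathlib import Path
import tempfile


def write_model_card(repo_dir, model_name, license_id, tags, description):
    """Write a minimal README.md model card into `repo_dir` and return its path.

    Hypothetical helper: the metadata keys and section titles are assumptions.
    """
    # Metadata header (assumed YAML front matter), followed by the card body.
    lines = ["---", f"license: {license_id}", "tags:"]
    lines += [f"- {tag}" for tag in tags]
    lines += [
        "---",
        "",
        f"# {model_name}",
        "",
        description,
        "",
        "## Intended uses and limitations",
        "",
        "Describe capabilities, limitations, potential biases and ethical considerations here.",
        "",
    ]
    path = Path(repo_dir) / "README.md"
    path.write_text("\n".join(lines), encoding="utf-8")
    return path


# Write the card into a temporary directory and print it for inspection.
with tempfile.TemporaryDirectory() as tmp:
    card = write_model_card(
        tmp,
        "my-awesome-model",
        "apache-2.0",
        ["text-classification"],
        "A fine-tuned DistilBERT classifier.",
    )
    print(card.read_text(encoding="utf-8"))
```

The resulting file can then be uploaded like any other file, for example with the Add file button in the web interface.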
