Integrate a library with the Hub


Integrate your library with the Hub

The BOINC AI Hub aims to facilitate sharing machine learning models, checkpoints, and artifacts. This endeavor includes integrating the Hub into many of the amazing third-party libraries in the community. Some of the ones already integrated include spaCy, AllenNLP, and timm, among many others. Integration means users can download and upload files to the Hub directly from your library. We hope you will integrate your library and join us in democratizing artificial intelligence for everyone!

Integrating the Hub with your library provides many benefits, including:

  • Free model hosting for you and your users.

  • Built-in file versioning - even for huge files - made possible by Git-LFS.

  • All public models are powered by the Inference API.

  • In-browser widgets allow users to interact with your hosted models directly.

This tutorial will help you integrate the Hub into your library so your users can benefit from all the features offered by the Hub.

Before you begin, we recommend you create a Hugging Face account from which you can manage your repositories and files.

If you need help with the integration, feel free to open an issue, and we would be more than happy to help you!

Installation

  1. Install the huggingface_hub library with pip in your environment:

    python -m pip install huggingface_hub
  2. Once you have successfully installed the huggingface_hub library, log in to your Hugging Face account:

    huggingface-cli login

         _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
         _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
         _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
         _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
         _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|
    
         
    Username: 
    Password:
  3. Alternatively, if you prefer working from a Jupyter or Colaboratory notebook, log in with notebook_login:

    >>> from huggingface_hub import notebook_login
    >>> notebook_login()

    notebook_login will launch a widget in your notebook from which you can enter your Hugging Face credentials.
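
If your users need to authenticate non-interactively (for example in scripts or CI), recent versions of huggingface_hub also expose a login helper that accepts a token directly. A minimal sketch, assuming a User Access Token has already been created in the account settings:

>>> from huggingface_hub import login
>>> login(token="hf_xxx")  # placeholder token; read it from an environment variable or secret store in practice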

Download files from the Hub

Integration allows users to download your hosted files directly from the Hub using your library.

Use the hf_hub_download function to retrieve a URL and download files from your repository. Downloaded files are stored in your cache: ~/.cache/huggingface/hub. You don’t have to re-download the file the next time you use it, and for larger files, this can save a lot of time. Furthermore, if the repository is updated with a new version of the file, huggingface_hub will automatically download the latest version and store it in the cache for you. Users don’t have to worry about updating their files.

For example, download the config.json file from the lysandre/arxiv-nlp repository:

>>> from huggingface_hub import hf_hub_download
>>> hf_hub_download(repo_id="lysandre/arxiv-nlp", filename="config.json")

Download a specific version of the file by specifying the revision parameter. The revision parameter can be a branch name, tag, or commit hash.

The commit hash must be a full-length hash instead of the shorter 7-character commit hash:

>>> from huggingface_hub import hf_hub_download
>>> hf_hub_download(repo_id="lysandre/arxiv-nlp", filename="config.json", revision="877b84a8f93f2d619faa2a6e514a32beef88ab0a")

Use the cache_dir parameter to change where a file is stored:

>>> from huggingface_hub import hf_hub_download
>>> hf_hub_download(repo_id="lysandre/arxiv-nlp", filename="config.json", cache_dir="/home/lysandre/test")
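
If your library needs every file in a repository rather than individual files, huggingface_hub also provides snapshot_download, which fetches the whole repository into the cache and returns the local folder path. A minimal sketch:

>>> from huggingface_hub import snapshot_download
>>> local_dir = snapshot_download(repo_id="lysandre/arxiv-nlp")  # downloads every file and returns the cached folder path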

Code sample

We recommend adding a code snippet to explain how to use a model in your downstream library.

Add a code snippet by updating the Libraries Typescript file with instructions for your model. For example, the Asteroid integration includes a brief code snippet for how to load and use an Asteroid model:

const asteroid = (model: ModelData) =>
`from asteroid.models import BaseModel
  
model = BaseModel.from_pretrained("${model.id}")`;

Doing so will also add a tag to your model so users can quickly identify models from your library.
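
On the library side, that user-facing snippet usually corresponds to a small loader that wraps hf_hub_download. The sketch below is purely illustrative: the load_model function and the config.json / pytorch_model.bin file names are placeholders for whatever your library actually stores in its repositories.

import json

from huggingface_hub import hf_hub_download

def load_model(repo_id: str, revision=None):
    # Fetch the files this hypothetical library needs; both calls share the local cache.
    config_path = hf_hub_download(repo_id=repo_id, filename="config.json", revision=revision)
    weights_path = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin", revision=revision)
    with open(config_path) as f:
        config = json.load(f)
    # A real integration would construct and return your library's model class here.
    return config, weights_path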

Upload files to the Hub

You might also want to provide a method for creating model repositories and uploading files to the Hub directly from your library. The huggingface_hub library offers two ways to assist you with creating repositories and uploading files:

  • create_repo creates a repository on the Hub.

  • upload_file directly uploads files to a repository on the Hub.

create_repo

The create_repo method creates a repository on the Hub. Use the repo_id parameter to provide a name for your repository:

>>> from huggingface_hub import create_repo
>>> create_repo(repo_id="test-model")
'https://huggingface.co/lysandre/test-model'

When you check your Hugging Face account, you should now see a test-model repository under your namespace.
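
create_repo also accepts optional parameters, so you can, for example, create the repository under an organization namespace or keep it private. A short sketch (the organization name is a placeholder):

>>> from huggingface_hub import create_repo
>>> create_repo(repo_id="my-org/test-model", private=True)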

upload_file

The upload_file method uploads files to the Hub. This method requires the following:

  • A path to the file to upload.

  • The final path in the repository.

  • The repository you wish to push the files to.

For example:

>>> from huggingface_hub import upload_file
>>> upload_file(
...    path_or_fileobj="/home/lysandre/dummy-test/README.md", 
...    path_in_repo="README.md", 
...    repo_id="lysandre/test-model"
... )
'https://huggingface.co/lysandre/test-model/blob/main/README.md'

Once again, if you check your Hugging Face account, you should see the file inside your repository.

It is also important to add a model card so users understand how to use your model. See the Model Cards documentation for more details about how to create a model card.

If you need to upload more than one file, look at the utilities offered by the Repository class.
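
For pushing several files at once, recent versions of huggingface_hub also provide upload_folder, which uploads the contents of a local directory in a single commit. A minimal sketch, reusing the paths from the example above:

>>> from huggingface_hub import upload_folder
>>> upload_folder(
...    folder_path="/home/lysandre/dummy-test",
...    repo_id="lysandre/test-model"
... )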

Set up the Inference API

Our Inference API powers models uploaded to the Hub through your library.

All third-party libraries are Dockerized, so you can install the dependencies you’ll need for your library to work correctly. Add your library to the existing Docker images by navigating to the Docker images folder.

Create an Inference API Docker image

  1. Copy the common folder and rename it with the name of your library (e.g. docker_images/common to docker_images/your-awesome-library).

  2. There are four files you need to edit:

    • List the packages required for your library to work in requirements.txt.

    • Update app/main.py with the tasks supported by your model (see the Tasks page for a complete list of available tasks). Look out for the IMPLEMENT_THIS flag to add your supported task.

      ALLOWED_TASKS: Dict[str, Type[Pipeline]] = {
          "token-classification": TokenClassificationPipeline
      }

    • For each task your library supports, modify the app/pipelines/task_name.py files accordingly. We have also added an IMPLEMENT_THIS flag in the pipeline files to guide you. If there isn’t a pipeline that supports your task, feel free to add one. Open an issue, and we will be happy to help you. A minimal sketch of what such a pipeline module might look like appears after this list.

    • Add your model and task to the tests/test_api.py file. For example, if you have a text generation model:

      TESTABLE_MODELS: Dict[str, str] = {
          "text-generation": "my-gpt2-model"
      }
  3. Finally, run the following test to ensure everything works as expected:

    pytest -sv --rootdir docker_images/your-awesome-library/ docker_images/your-awesome-library/
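
As a rough illustration of the pipeline files mentioned in step 2, a text-generation pipeline module could look something like the sketch below. This is not the exact api-inference-community interface: take the base class, method signatures, and return format from the common template you copied, and replace the placeholder body with real inference code.

from typing import Any, Dict, List

class TextGenerationPipeline:
    def __init__(self, model_id: str):
        # In a real pipeline you would load the model from the Hub here,
        # for example with hf_hub_download or your library's own loader.
        self.model_id = model_id

    def __call__(self, inputs: str) -> List[Dict[str, Any]]:
        # Placeholder output; the expected return format for each task is
        # defined by the common template and its IMPLEMENT_THIS markers.
        return [{"generated_text": inputs + " ..."}]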

Register your library’s supported tasks on the Hub

To register the tasks supported by your library on the Hub, you’ll need to add a mapping from your library name to its supported tasks in this file. This will ensure the Inference API is registered for the tasks supported by your model. The file is automatically generated as part of a GitHub Action in the api-inference-community repository.

With these simple but powerful methods, you brought the full functionality of the Hub into your library. Users can download files stored on the Hub from your library with hf_hub_download, create repositories with create_repo, and upload files with upload_file. You also set up the Inference API with your library, allowing users to interact with your models on the Hub from inside a browser.

