Inference API docs


Last updated 1 year ago

Inference API

Please refer to the Inference API documentation for detailed information.

What technology do you use to power the inference API?

For 🤗 Transformers models, Pipelines power the API.

On top of Pipelines, and depending on the model type, there are several production optimizations, such as:

  • compiling models to optimized intermediary representations (e.g. ONNX),

  • maintaining a Least Recently Used (LRU) cache, ensuring that the most popular models are always loaded,

  • scaling the underlying compute infrastructure on the fly depending on the load constraints.
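The LRU-cache optimization above can be sketched in a few lines. This is an illustrative toy, not the actual serving code; the model identifiers and the string standing in for loaded weights are placeholders:

```python
from collections import OrderedDict

class ModelCache:
    """Toy Least Recently Used cache, illustrating how an inference
    service can keep only the most recently requested models in memory."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._models = OrderedDict()  # model_id -> loaded model object

    def get(self, model_id: str):
        """Return a loaded model, loading (and possibly evicting) as needed."""
        if model_id in self._models:
            self._models.move_to_end(model_id)  # mark as most recently used
        else:
            if len(self._models) >= self.capacity:
                self._models.popitem(last=False)  # evict the LRU entry
            self._models[model_id] = f"<loaded weights for {model_id}>"
        return self._models[model_id]

cache = ModelCache(capacity=2)
cache.get("bert-base-uncased")
cache.get("gpt2")
cache.get("bert-base-uncased")   # touch: bert becomes most recent
cache.get("t5-small")            # evicts gpt2, the least recently used
print(list(cache._models))       # ['bert-base-uncased', 't5-small']
```

Popular models stay warm because every request moves them back to the "recently used" end, so evictions only ever hit models nobody has asked for in a while.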

For models from other libraries, the API uses Starlette and runs in Docker containers. Each library defines the implementation of different pipelines.

How can I turn off the inference API for my model?

Specify inference: false in your model card's metadata.
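Concretely, this goes in the YAML block at the top of the model card (the README.md of the repository); the license field below is just an example of other metadata that may already be there:

```yaml
---
license: apache-2.0
inference: false
---
```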

Why don't I see an inference widget, or why can't I use the inference API?

For some tasks, there might not be support in the inference API, and, hence, there is no widget. For all libraries (except 🤗 Transformers), there is a mapping of libraries to supported tasks in the API. When a model repository has a task that is not supported by the repository's library, the repository has inference: false by default.
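Conceptually, such a mapping is just a lookup from library to its supported tasks. The entries below are illustrative examples, not the real table:

```python
# Illustrative library-to-task mapping (entries are examples, not the real table)
SUPPORTED_TASKS = {
    "sentence-transformers": {"feature-extraction", "sentence-similarity"},
    "spacy": {"token-classification"},
    "speechbrain": {"automatic-speech-recognition", "audio-classification"},
}

def inference_enabled(library: str, task: str) -> bool:
    """A repository gets a widget only if its task is supported for its library."""
    return task in SUPPORTED_TASKS.get(library, set())

print(inference_enabled("spacy", "token-classification"))  # True
print(inference_enabled("spacy", "text-generation"))       # False
```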

Can I send large volumes of requests? Can I get accelerated APIs?

If you are interested in accelerated inference, higher volumes of requests, or an SLA, please contact us at api-enterprise at huggingface.co.

How can I see my usage?

You can head to the Inference API dashboard. Learn more about it in the Inference API documentation.

Is there programmatic access to the Inference API?

Yes, the huggingface_hub library provides a documented client wrapper.
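As a sketch of what that access looks like at the HTTP level (the model id, token, and payload below are illustrative; the huggingface_hub client wraps the same endpoint for you):

```python
import json
from urllib.parse import quote

API_ROOT = "https://api-inference.huggingface.co/models"

def build_inference_request(model_id: str, token: str, payload: dict):
    """Build the URL, headers, and JSON body for a raw Inference API call.
    (Illustrative sketch of the request shape; nothing is sent here.)"""
    url = f"{API_ROOT}/{quote(model_id, safe='/')}"
    headers = {"Authorization": f"Bearer {token}"}
    body = json.dumps(payload)
    return url, headers, body

url, headers, body = build_inference_request(
    "distilbert-base-uncased-finetuned-sst-2-english",  # example model id
    "hf_xxx",                                           # your access token
    {"inputs": "I like you."},
)
# Actually sending it is then e.g.: requests.post(url, headers=headers, data=body)
```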
