
spaCy



Using spaCy at BOINC AI

spaCy is a popular library for advanced Natural Language Processing, used widely across industry. spaCy makes it easy to use and train pipelines for tasks like named entity recognition, text classification, part-of-speech tagging, and more, and lets you build powerful applications to process and analyze large volumes of text.

Exploring spaCy models in the Hub

The official models from spaCy 3.3 are in the spaCy Organization Page. Anyone in the community can also share their spaCy models, which you can find by filtering at the left of the models page.

All models on the Hub come with useful features:

  1. An automatically generated model card with label scheme, metrics, components, and more.

  2. An evaluation section at the top right where you can look at the metrics.

  3. Metadata tags that help with discoverability and contain information such as license and language.

  4. An interactive widget you can use to play with the model directly in the browser.

  5. An Inference API that allows you to make inference requests.
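As a non-authoritative sketch of how the Inference API in point 5 can be called over plain HTTP: the endpoint pattern and payload shape below are assumptions based on the standard Inference API, and MODEL_ID and the token are placeholders to replace with your own.

```python
# Sketch (not an official client): querying the Inference API over HTTP.
# Endpoint pattern and payload shape are assumptions; MODEL_ID and the
# token are placeholders.
import json
from urllib import request

MODEL_ID = "spacy/en_core_web_sm"
API_URL = f"https://api-inference.huggingface.co/models/{MODEL_ID}"

def query(text: str, token: str = "YOUR_TOKEN"):
    """POST the input text and return the decoded JSON response."""
    req = request.Request(
        API_URL,
        data=json.dumps({"inputs": text}).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

print(API_URL)
```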

Using existing models

All spaCy models from the Hub can be directly installed using pip install.


pip install https://huggingface.co/spacy/en_core_web_sm/resolve/main/en_core_web_sm-any-py3-none-any.whl

To find the link of interest, you can go to a repository with a spaCy model. When you open the repository, you can click Use in spaCy and you will be given a working snippet that you can use to install and load the model!
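The direct-install links follow a predictable pattern. As a sketch (assuming the URL layout of the en_core_web_sm example above), you can reconstruct the wheel URL from a repository id:

```python
# Sketch: reconstructing the direct-install wheel URL for a spaCy model
# repository, based on the URL layout of the en_core_web_sm example.
repo_id = "spacy/en_core_web_sm"    # Hub repository id
package = repo_id.split("/")[-1]    # package name, e.g. en_core_web_sm

wheel_url = (
    f"https://huggingface.co/{repo_id}/resolve/main/"
    f"{package}-any-py3-none-any.whl"
)
print(wheel_url)
```

In practice the Use in spaCy snippet gives you the same link without constructing it by hand.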

Once installed, you can load the model as any spaCy pipeline.


# Using spacy.load().
import spacy
nlp = spacy.load("en_core_web_sm")

# Importing as module.
import en_core_web_sm
nlp = en_core_web_sm.load()

Sharing your models

Using the spaCy CLI (recommended)

The spacy-huggingface-hub library extends spaCy's native CLI so people can easily push their packaged models to the Hub.

You can install spacy-huggingface-hub with pip:


pip install spacy-huggingface-hub

You can then check that the command has been registered successfully:


python -m spacy huggingface-hub --help

To push with the CLI, you can use the huggingface-hub push command as seen below.


python -m spacy huggingface-hub push [whl_path] [--org] [--msg] [--local-repo] [--verbose]
Argument | Type | Description
--- | --- | ---
whl_path | str / Path | The path to the .whl file packaged with spacy package.
--org, -o | str | Optional name of the organization to which the pipeline should be uploaded.
--msg, -m | str | Commit message to use for the update. Defaults to "Update spaCy pipeline".
--local-repo, -l | str / Path | Local path to the model repository (will be created if it doesn't exist). Defaults to hub in the current working directory.
--verbose, -V | bool | Output additional info for debugging, e.g. the full generated hub metadata.

You can upload any pipeline packaged with spacy package. Make sure to set --build wheel to output a binary .whl file. The uploader will read all metadata from the pipeline package, including the auto-generated pretty README.md and the model details available in the meta.json.

huggingface-cli login
python -m spacy package ./en_ner_fashion ./output --build wheel
cd ./output/en_ner_fashion-0.0.0/dist
python -m spacy huggingface-hub push en_ner_fashion-0.0.0-py3-none-any.whl

In just a minute, you can get your packaged model in the Hub, try it out directly in the browser, and share it with the rest of the community. All the required metadata will be uploaded for you and you even get a cool model card.

The command will output two things:

  • Where to find your repo in the Hub! For example, https://huggingface.co/spacy/en_core_web_sm

  • And how to install the pipeline directly from the Hub!

From a Python script

You can use the push function from Python. It returns a dictionary containing the "url" and "whl_url" of the published model and the wheel file, which you can later install with pip install.


from spacy_huggingface_hub import push

result = push("./en_ner_fashion-0.0.0-py3-none-any.whl")
print(result["url"])
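As a minimal sketch of what you might do with that dictionary, assuming the "url" and "whl_url" keys described above (the repository name and values below are illustrative stand-ins for what push() would actually return for the hypothetical en_ner_fashion pipeline):

```python
# Sketch: consuming the dictionary returned by push().  The values here
# are illustrative stand-ins, not a real repository.
result = {
    "url": "https://huggingface.co/spacy/en_ner_fashion",
    "whl_url": (
        "https://huggingface.co/spacy/en_ner_fashion"
        "/resolve/main/en_ner_fashion-0.0.0-py3-none-any.whl"
    ),
}

# The "whl_url" key is what you would hand to pip.
install_cmd = f"pip install {result['whl_url']}"
print(result["url"])   # where the model lives on the Hub
print(install_cmd)     # how to install it
```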

Additional resources

  • spacy-huggingface-hub library
  • Launch blog post
  • spaCy v3.1 Announcement
  • spaCy documentation