Pipelines
Pipelines provide a high-level, easy-to-use API for running machine learning models.
Example: Instantiate a pipeline using the pipeline function.
The Pipeline class is the class from which all pipelines inherit. Refer to this class for methods shared across different pipelines.

Create a new Pipeline.
options (Object): An object containing the following properties:
[options.task] (string): The task of the pipeline. Useful for specifying subtasks.
[options.model] (PreTrainedModel): The model to use.
[options.tokenizer] (PreTrainedTokenizer): The tokenizer to use (if any).
[options.processor] (Processor): The processor to use (if any).

Disposes the model.

Executes the task associated with the pipeline.
texts (any): The input texts to be processed.
...args (any): Additional arguments.

Text classification pipeline using any ModelForSequenceClassification.
Example: Sentiment analysis with Xenova/distilbert-base-uncased-finetuned-sst-2-english.
Example: Multilingual sentiment analysis with Xenova/bert-base-multilingual-uncased-sentiment (and return top 5 classes).
Example: Toxic comment classification with Xenova/toxic-bert (and return all classes).

Executes the text classification task.
texts (any): The input texts to be classified.
options (Object): An optional object containing the following properties:
[options.topk] (number, default 1): The number of top predictions to be returned.
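
To make the effect of topk concrete, here is a minimal sketch of top-k label selection. It assumes the model yields one raw logit per class and that a hypothetical id2label array maps class indices to label names; neither the function names nor this data layout come from the library.

```javascript
// Convert raw logits to probabilities with a numerically stable softmax.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Return the `topk` highest-scoring labels, best first.
// `id2label` is a hypothetical array mapping class index -> label name.
function topKLabels(logits, id2label, topk = 1) {
  return softmax(logits)
    .map((score, i) => ({ label: id2label[i], score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topk);
}

const top2 = topKLabels([2.0, -1.0, 0.5], ["POSITIVE", "NEGATIVE", "NEUTRAL"], 2);
console.log(top2.map((r) => r.label)); // [ 'POSITIVE', 'NEUTRAL' ]
```

With the default topk of 1, only the single best label is returned; passing the number of classes returns every class with its score.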

Named Entity Recognition pipeline using any ModelForTokenClassification.
Example: Perform named entity recognition with Xenova/bert-base-NER.
Example: Perform named entity recognition with Xenova/bert-base-NER (and return all labels).

Executes the token classification task.
texts (any): The input texts to be classified.
options (Object): An optional object of additional options.

Question Answering pipeline using any ModelForQuestionAnswering.
Example: Run question answering with Xenova/distilbert-base-uncased-distilled-squad.

Executes the question answering task.
question (string | Array<string>): The question(s) to be answered.
context (string | Array<string>): The context(s) where the answer(s) can be found.
options (Object): An optional object containing the following properties:
[options.topk] (number, default 1): The number of top answer predictions to be returned.

Masked language modeling prediction pipeline using any ModelWithLMHead.
Example: Perform masked language modeling (a.k.a. "fill-mask") with Xenova/bert-base-uncased.
Example: Perform masked language modeling (a.k.a. "fill-mask") with Xenova/bert-base-cased (and return top result).

Fill the masked token in the text(s) given as inputs.
texts (any): The masked input texts.
options (Object): An optional object containing the following properties:
[options.topk] (number, default 5): The number of top predictions to be returned.
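
Fill-mask results pair each candidate token with its score and with the input sequence in which the mask has been replaced. The following sketch shows that assembly step only. The candidates array stands in for model output, and the function name, the maskToken option, and the "[MASK]" default are illustrative assumptions rather than the library's API.

```javascript
// Build fill-mask style results from candidate tokens.
// `candidates` is hypothetical model output: token strings with scores.
function fillMask(text, candidates, { maskToken = "[MASK]", topk = 5 } = {}) {
  if (!text.includes(maskToken)) {
    throw new Error(`Mask token (${maskToken}) not found in text.`);
  }
  return [...candidates]
    .sort((a, b) => b.score - a.score) // best candidate first
    .slice(0, topk)
    .map(({ token_str, score }) => ({
      score,
      token_str,
      // Replace the first mask occurrence; a replacer function avoids "$" patterns.
      sequence: text.replace(maskToken, () => token_str),
    }));
}

const filled = fillMask(
  "The goal of life is [MASK].",
  [
    { token_str: "happiness", score: 0.2 },
    { token_str: "life", score: 0.5 },
    { token_str: "survival", score: 0.3 },
  ],
  { topk: 2 }
);
console.log(filled[0].sequence); // The goal of life is life.
```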

Text2TextGenerationPipeline class for generating text using a model that performs text-to-text generation tasks.
Example: Text-to-text generation with Xenova/LaMini-Flan-T5-783M.

Fill the masked token in the text(s) given as inputs.
Throws: Error when the mask token is not found in the input text.
texts (string | Array<string>): The text or array of texts to be processed.
[options] (Object, default {}): Options for the fill-mask pipeline.
[options.topk] (number, default 5): The number of top-k predictions to return.

A pipeline for summarization tasks, inheriting from Text2TextGenerationPipeline.
Example: Summarization with Xenova/distilbart-cnn-6-6.

Translates text from one language to another.
Example: Multilingual translation with Xenova/nllb-200-distilled-600M.
Example: Multilingual translation with Xenova/m2m100_418M.
Example: Multilingual translation with Xenova/mbart-large-50-many-to-many-mmt.

Example: Text generation with Xenova/distilgpt2 (default settings).
Example: Text generation with Xenova/distilgpt2 (custom settings).
Example: Run code generation with Xenova/codegen-350M-mono.

Generates text based on an input prompt.
texts (any): The input prompt or prompts to generate text from.
[generate_kwargs] (Object, default {}): Additional arguments for text generation.

NLI-based zero-shot classification pipeline using a ModelForSequenceClassification trained on NLI (natural language inference) tasks. It is equivalent to the text-classification pipelines, except that these models do not require a hardcoded number of potential classes; the classes can be chosen at runtime. This usually makes the pipeline slower, but much more flexible.
Example: Zero shot classification with Xenova/mobilebert-uncased-mnli.
Example: Zero shot classification with Xenova/nli-deberta-v3-xsmall (multi-label).

Create a new ZeroShotClassificationPipeline.
options (Object): An object containing the following properties:
[options.task] (string): The task of the pipeline. Useful for specifying subtasks.
[options.model] (PreTrainedModel): The model to use.
[options.tokenizer] (PreTrainedTokenizer): The tokenizer to use.

texts (Array.<any>)
candidate_labels (Array.<string>)
options (Object): Additional options:
[options.hypothesis_template] (string, default "This example is {}."): The template used to turn each candidate label into an NLI-style hypothesis. The candidate label will replace the {} placeholder.
[options.multi_label] (boolean, default false): Whether or not multiple candidate labels can be true. If false, the scores are normalized such that the sum of the label likelihoods for each sequence is 1. If true, the labels are considered independent and probabilities are normalized for each candidate by doing a softmax of the entailment score vs. the contradiction score.
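
The two normalization modes can be sketched as follows. This assumes each candidate label yields a hypothetical [contradiction, entailment] logit pair; real NLI models often emit a third, neutral class, and the function names are illustrative only.

```javascript
// Numerically stable softmax over an array of numbers.
function softmax(values) {
  const max = Math.max(...values);
  const exps = values.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Fill each candidate label into the {} placeholder of the hypothesis template.
function buildHypotheses(labels, template = "This example is {}.") {
  return labels.map((label) => template.replace("{}", () => label));
}

// `logits` holds one hypothetical [contradiction, entailment] pair per label.
function scoreLabels(logits, multiLabel = false) {
  if (multiLabel) {
    // Independent labels: softmax of entailment vs. contradiction, per label.
    return logits.map((pair) => softmax(pair)[1]);
  }
  // Mutually exclusive labels: softmax over the entailment logits of all labels.
  return softmax(logits.map(([, entailment]) => entailment));
}

console.log(buildHypotheses(["sports", "politics"]));
// [ 'This example is sports.', 'This example is politics.' ]
```

In single-label mode the scores sum to 1 across labels; in multi-label mode each score lies between 0 and 1 on its own, so several labels can score highly at once.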

Feature extraction pipeline using no model head. This pipeline extracts the hidden states from the base transformer, which can be used as features in downstream tasks.
Example: Run feature extraction with bert-base-uncased (without pooling/normalization).
Example: Run feature extraction with bert-base-uncased (with pooling/normalization).
Example: Calculating embeddings with sentence-transformers models.

Extract the features of the input(s).
texts (string | Array<string>): The input texts.
options (Object): Additional options:
[options.pooling] (string, default "none"): The pooling method to use. Can be one of: "none", "mean".
[options.normalize] (boolean, default false): Whether or not to normalize the embeddings in the last dimension.
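
As an illustration of what "mean" pooling followed by normalization computes, here is a minimal sketch on plain arrays. It assumes token embeddings arrive as a [tokens][hidden] nested array and, unlike a production implementation, it does not exclude padding tokens via an attention mask. The function names are hypothetical.

```javascript
// Average token embeddings ([tokens][hidden]) into one vector of size `hidden`.
function meanPool(tokenEmbeddings) {
  const hidden = tokenEmbeddings[0].length;
  const pooled = new Array(hidden).fill(0);
  for (const token of tokenEmbeddings) {
    for (let i = 0; i < hidden; i++) pooled[i] += token[i];
  }
  return pooled.map((v) => v / tokenEmbeddings.length);
}

// Scale a vector to unit L2 norm (an all-zero vector is returned unchanged).
function l2Normalize(vector) {
  const norm = Math.sqrt(vector.reduce((s, v) => s + v * v, 0));
  return norm === 0 ? vector.slice() : vector.map((v) => v / norm);
}

const tokens = [[1, 2], [5, 6]];
const embedding = l2Normalize(meanPool(tokens));
console.log(embedding); // [ 0.6, 0.8 ]
```

Normalized embeddings are convenient for similarity search, because the dot product of two unit vectors equals their cosine similarity.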

Audio classification pipeline using any AutoModelForAudioClassification. This pipeline predicts the class of a raw waveform or an audio file.
Example: Perform audio classification.

Create a new AudioClassificationPipeline.
options (Object): An object containing the following properties:
[options.task] (string): The task of the pipeline. Useful for specifying subtasks.
[options.model] (PreTrainedModel): The model to use.
[options.processor] (Processor): The processor to use.

Executes the audio classification task.
audio (any): The input audio files to be classified.
options (Object): An optional object containing the following properties:
[options.topk] (number, default 5): The number of top predictions to be returned.

Pipeline that extracts the spoken text contained in an audio input.
Example: Transcribe English.
Example: Transcribe English with timestamps.
Example: Transcribe English with word-level timestamps.
Example: Transcribe French.
Example: Translate French to English.
Example: Transcribe/translate audio longer than 30 seconds.

Create a new AutomaticSpeechRecognitionPipeline.
options (Object): An object containing the following properties:
[options.task] (string): The task of the pipeline. Useful for specifying subtasks.
[options.model] (PreTrainedModel): The model to use.
[options.tokenizer] (PreTrainedTokenizer): The tokenizer to use.
[options.processor] (Processor): The processor to use.

Asynchronously processes audio and generates text transcription using the model.
audio (Float32Array | Array<Float32Array>): The audio to be transcribed. Can be a single Float32Array or an array of Float32Arrays.
[kwargs] (Object, default {}): Optional arguments.
[kwargs.return_timestamps] (boolean | 'word'): Whether to return timestamps or not. Default is false.
[kwargs.chunk_length_s] (number): The length of audio chunks to process in seconds. Default is 0 (no chunking).
[kwargs.stride_length_s] (number): The length of overlap between consecutive audio chunks in seconds. If not provided, defaults to chunk_length_s / 6.
[kwargs.chunk_callback] (ChunkCallback): Callback function to be called with each chunk processed.
[kwargs.force_full_sequences] (boolean): Whether to force outputting full sequences or not. Default is false.
[kwargs.language] (string): The source language. Default is null, meaning it should be auto-detected. Use this to potentially improve performance if the source language is known.
[kwargs.task] (string): The task to perform. Default is null, meaning it should be auto-detected.
[kwargs.forced_decoder_ids] (Array.<Array<number>>): A list of pairs of integers which indicates a mapping from generation indices to token indices that will be forced before sampling. For example, [[1, 123]] means the second generated token will always be a token of index 123.
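
To show how chunk_length_s and stride_length_s interact, the sketch below computes the sample ranges of overlapping chunks. It assumes the overlap between consecutive chunks equals stride_length_s, as the parameter description reads; the library's own windowing may differ (for example, by applying the stride on both sides of a chunk). The function name is hypothetical.

```javascript
// Compute [start, end) sample ranges for overlapping audio chunks.
// Assumption: consecutive chunks overlap by exactly `strideLengthS` seconds.
function chunkWindows(numSamples, samplingRate, chunkLengthS, strideLengthS = chunkLengthS / 6) {
  if (chunkLengthS <= 0) return [[0, numSamples]]; // 0 means no chunking
  const window = Math.round(samplingRate * chunkLengthS);
  const overlap = Math.round(samplingRate * strideLengthS);
  const jump = window - overlap; // new samples covered by each step
  if (jump <= 0) throw new Error("stride_length_s must be smaller than chunk_length_s");
  const windows = [];
  for (let start = 0; ; start += jump) {
    const end = Math.min(start + window, numSamples);
    windows.push([start, end]);
    if (end === numSamples) break; // the last chunk reaches the end of the audio
  }
  return windows;
}

// 75 seconds of 16 kHz audio, 30-second chunks, default 5-second overlap.
console.log(chunkWindows(75 * 16000, 16000, 30));
// [ [ 0, 480000 ], [ 400000, 880000 ], [ 800000, 1200000 ] ]
```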

Image To Text pipeline using an AutoModelForVision2Seq. This pipeline predicts a caption for a given image.
Example: Generate a caption for an image with Xenova/vit-gpt2-image-captioning.

Create a new ImageToTextPipeline.
options (Object): An object containing the following properties:
[options.task] (string): The task of the pipeline. Useful for specifying subtasks.
[options.model] (PreTrainedModel): The model to use.
[options.tokenizer] (PreTrainedTokenizer): The tokenizer to use.
[options.processor] (Processor): The processor to use.

Assign labels to the image(s) passed as inputs.
images (Array.<any>): The images to be captioned.
[generate_kwargs] (Object, default {}): Optional generation arguments.

Image classification pipeline using any AutoModelForImageClassification. This pipeline predicts the class of an image.
Example: Classify an image.
Example: Classify an image and return top n classes.
Example: Classify an image and return all classes.

Create a new ImageClassificationPipeline.
options (Object): An object containing the following properties:
[options.task] (string): The task of the pipeline. Useful for specifying subtasks.
[options.model] (PreTrainedModel): The model to use.
[options.processor] (Processor): The processor to use.

Classify the given images.
images (any): The images to classify.
options (Object): The options to use for classification.
[options.topk] (number, default 1): The number of top results to return.

Image segmentation pipeline using any AutoModelForXXXSegmentation. This pipeline predicts masks of objects and their classes.
Example: Perform image segmentation with Xenova/detr-resnet-50-panoptic.

Create a new ImageSegmentationPipeline.
options (Object): An object containing the following properties:
[options.task] (string): The task of the pipeline. Useful for specifying subtasks.
[options.model] (PreTrainedModel): The model to use.
[options.processor] (Processor): The processor to use.

Segment the input images.
images (Array): The input images.
options (Object): The options to use for segmentation.
[options.threshold] (number, default 0.5): Probability threshold to filter out predicted masks.
[options.mask_threshold] (number, default 0.5): Threshold to use when turning the predicted masks into binary values.
[options.overlap_mask_area_threshold] (number, default 0.8): Mask overlap threshold to eliminate small, disconnected segments.
[options.subtask] (null | string): Segmentation task to be performed. One of panoptic, instance, or semantic, depending on model capabilities. If not set, the pipeline will attempt to resolve the subtask in that order.
[options.label_ids_to_fuse] (Array): List of label ids to fuse. If not set, do not fuse any labels.
[options.target_sizes] (Array): List of target sizes for the input images. If not set, use the original image sizes.
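
The roles of threshold and mask_threshold can be sketched as two separate steps: first drop low-confidence segments, then binarize each remaining soft mask. The predictions layout (a label, a score, and a flat array of per-pixel probabilities) and the function name are assumptions for illustration; overlap handling and label fusing are left out.

```javascript
// `predictions` is hypothetical model output: one score and one soft mask per segment.
function postProcessSegments(predictions, { threshold = 0.5, mask_threshold = 0.5 } = {}) {
  return predictions
    .filter((p) => p.score > threshold) // keep confident segments only
    .map((p) => ({
      label: p.label,
      score: p.score,
      // Turn per-pixel probabilities into binary mask values.
      mask: p.mask.map((v) => (v > mask_threshold ? 1 : 0)),
    }));
}

const segments = postProcessSegments([
  { label: "cat", score: 0.9, mask: [0.2, 0.7, 0.5] },
  { label: "dog", score: 0.3, mask: [0.9] },
]);
console.log(segments); // [ { label: 'cat', score: 0.9, mask: [ 0, 1, 0 ] } ]
```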

Zero shot image classification pipeline. This pipeline predicts the class of an image when you provide an image and a set of candidate_labels.
Example: Zero shot image classification with Xenova/clip-vit-base-patch32.

Create a new ZeroShotImageClassificationPipeline.
options (Object): An object containing the following properties:
[options.task] (string): The task of the pipeline. Useful for specifying subtasks.
[options.model] (PreTrainedModel): The model to use.
[options.tokenizer] (PreTrainedTokenizer): The tokenizer to use.
[options.processor] (Processor): The processor to use.

Classify the input images with candidate labels using a zero-shot approach.
images (Array): The input images.
candidate_labels (Array.<string>): The candidate labels.
options (Object): The options for the classification.
[options.hypothesis_template] (string): The hypothesis template to use for zero-shot classification. Default: "This is a photo of {}".

Object detection pipeline using any AutoModelForObjectDetection. This pipeline predicts bounding boxes of objects and their classes.
Example: Run object-detection with facebook/detr-resnet-50.

Create a new ObjectDetectionPipeline.
options (Object): An object containing the following properties:
[options.task] (string): The task of the pipeline. Useful for specifying subtasks.
[options.model] (PreTrainedModel): The model to use.
[options.processor] (Processor): The processor to use.

Detect objects (bounding boxes and classes) in the image(s) passed as inputs.
images (Array.<any>): The input images.
options (Object): The options for the object detection.
[options.threshold] (number, default 0.9): The threshold used to filter boxes by score.
[options.percentage] (boolean, default false): Whether to return the box coordinates as percentages (true) or in pixels (false).
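
A minimal sketch of how threshold and percentage might shape the output follows. It assumes the model produces boxes as fractions of the image size, which is a hypothetical layout chosen for illustration, and the function name is not part of the library.

```javascript
// `detections` is hypothetical model output with boxes as fractions of image size.
function filterDetections(detections, imageWidth, imageHeight, { threshold = 0.9, percentage = false } = {}) {
  return detections
    .filter((d) => d.score > threshold) // drop low-confidence boxes
    .map(({ label, score, box }) => ({
      label,
      score,
      box: percentage
        ? { ...box } // keep fractional coordinates
        : {
            // Scale fractional coordinates to whole pixels.
            xmin: Math.round(box.xmin * imageWidth),
            ymin: Math.round(box.ymin * imageHeight),
            xmax: Math.round(box.xmax * imageWidth),
            ymax: Math.round(box.ymax * imageHeight),
          },
    }));
}

const detections = [
  { label: "cat", score: 0.98, box: { xmin: 0.1, ymin: 0.2, xmax: 0.5, ymax: 0.8 } },
  { label: "dog", score: 0.4, box: { xmin: 0.0, ymin: 0.0, xmax: 0.3, ymax: 0.3 } },
];
console.log(filterDetections(detections, 200, 100)[0].box);
// { xmin: 20, ymin: 20, xmax: 100, ymax: 80 }
```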

Document Question Answering pipeline using any AutoModelForDocumentQuestionAnswering. The inputs/outputs are similar to the (extractive) question answering pipeline; however, the pipeline takes an image (and optional OCR'd words/boxes) as input instead of text context.
Example: Answer questions about a document with Xenova/donut-base-finetuned-docvqa.

Create a new DocumentQuestionAnsweringPipeline.
options (Object): An object containing the following properties:
[options.task] (string): The task of the pipeline. Useful for specifying subtasks.
[options.model] (PreTrainedModel): The model to use.
[options.tokenizer] (PreTrainedTokenizer): The tokenizer to use.
[options.processor] (Processor): The processor to use.

Answer the question given as input by using the document.
image (any): The image of the document to use.
question (string): A question to ask of the document.
[generate_kwargs] (Object, default {}): Optional generation arguments.

Text-to-audio generation pipeline using any AutoModelForTextToWaveform or AutoModelForTextToSpectrogram. This pipeline generates an audio file from an input text and optional other conditional inputs.
Example: Generate audio from text with Xenova/speecht5_tts.
You can then save the audio to a .wav file with the wavefile package.
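
As a dependency-free alternative to wavefile, the sketch below encodes samples into 16-bit PCM WAV bytes by writing the header by hand. It assumes the generated audio is a mono Float32Array with values in [-1, 1] accompanied by a numeric sampling rate; the function name is hypothetical, and writing the bytes to disk (for example with Node's fs module) is left out.

```javascript
// Encode mono float samples in [-1, 1] as the bytes of a 16-bit PCM WAV file.
function encodeWav(samples, samplingRate) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize); // 44-byte header + sample data
  const view = new DataView(buffer);
  const writeString = (offset, s) => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };

  writeString(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeString(8, "WAVE");
  writeString(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // number of channels (mono)
  view.setUint32(24, samplingRate, true);
  view.setUint32(28, samplingRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeString(36, "data");
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * bytesPerSample, Math.round(clamped * 32767), true);
  }
  return new Uint8Array(buffer);
}

const wavBytes = encodeWav(new Float32Array([0, 1, -1]), 16000);
console.log(wavBytes.length); // 50
```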

Create a new TextToAudioPipeline.
options (Object): An object containing the following properties:
[options.task] (string): The task of the pipeline. Useful for specifying subtasks.
[options.model] (PreTrainedModel): The model to use.
[options.tokenizer] (PreTrainedTokenizer): The tokenizer to use.
[options.processor] (Processor): The processor to use.
[options.vocoder] (PreTrainedModel): The vocoder to use.

Generates speech/audio from the inputs.
text_inputs (string | Array<string>): The text(s) to generate.
options (Object): Parameters passed to the model generation/forward method.
[options.vocoder] (PreTrainedModel): The vocoder to use (if the model uses one). If not provided, use the default HifiGan vocoder.
[options.speaker_embeddings] (Tensor | Float32Array | string | URL)

Utility factory method to build a Pipeline object.
Throws: Error if an unsupported pipeline is requested.
task (string): The task defining which pipeline will be returned. Currently accepted tasks are:
"audio-classification": will return an AudioClassificationPipeline.
"automatic-speech-recognition": will return an AutomaticSpeechRecognitionPipeline.
"document-question-answering": will return a DocumentQuestionAnsweringPipeline.
"feature-extraction": will return a FeatureExtractionPipeline.
"fill-mask": will return a FillMaskPipeline.
"image-classification": will return an ImageClassificationPipeline.
"image-segmentation": will return an ImageSegmentationPipeline.
"image-to-text": will return an ImageToTextPipeline.
"object-detection": will return an ObjectDetectionPipeline.
"question-answering": will return a QuestionAnsweringPipeline.
"summarization": will return a SummarizationPipeline.
"text2text-generation": will return a Text2TextGenerationPipeline.
"text-classification" (alias "sentiment-analysis" available): will return a TextClassificationPipeline.
"text-generation": will return a TextGenerationPipeline.
"token-classification" (alias "ner" available): will return a TokenClassificationPipeline.
"translation": will return a TranslationPipeline.
"translation_xx_to_yy": will return a TranslationPipeline.
"zero-shot-classification": will return a ZeroShotClassificationPipeline.
"zero-shot-image-classification": will return a ZeroShotImageClassificationPipeline.
[model] (string, default null): The name of the pre-trained model to use. If not specified, the default model for the task will be used.
[options] (*): Optional parameters for the pipeline.
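
The task-to-pipeline mapping above, including the two aliases and the translation_xx_to_yy pattern, can be pictured as a simple lookup. The sketch below is a stand-in that returns class names as strings; the registry, the alias table, and the function name are hypothetical and cover only a few of the listed tasks.

```javascript
// Hypothetical registry: a few of the task names listed above, mapped to class names.
const SUPPORTED_TASKS = new Map([
  ["text-classification", "TextClassificationPipeline"],
  ["token-classification", "TokenClassificationPipeline"],
  ["translation", "TranslationPipeline"],
  ["fill-mask", "FillMaskPipeline"],
]);

// Aliases named in the task list.
const TASK_ALIASES = new Map([
  ["sentiment-analysis", "text-classification"],
  ["ner", "token-classification"],
]);

function resolvePipelineClass(task) {
  const canonical = TASK_ALIASES.get(task) ?? task;
  // Tasks of the form "translation_xx_to_yy" share the translation pipeline.
  const key = /^translation_.+_to_.+$/.test(canonical) ? "translation" : canonical;
  const name = SUPPORTED_TASKS.get(key);
  if (name === undefined) throw new Error(`Unsupported pipeline: ${task}`);
  return name;
}

console.log(resolvePipelineClass("ner")); // TokenClassificationPipeline
console.log(resolvePipelineClass("translation_en_to_fr")); // TranslationPipeline
```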

QuestionAnsweringResult properties:
answer (string): The answer.
score (number): The score.

ChunkCallback parameters:
chunk (Chunk): The chunk to process.

Pipeline extends Callable.
Returns (Pipeline, disposing the model): Promise.<void> - A promise that resolves when the model has been disposed.
Returns (Pipeline, executing the task): Promise.<any> - A promise that resolves to an array containing the inputs and outputs of the task.
Returns (text classification): Promise.<(Array<Object>|Object)> - A promise that resolves to an array or object containing the predicted labels and scores.
Returns (token classification): Promise.<(Array<Object>|Object)> - A promise that resolves to an array or object containing the predicted labels and scores.
Returns (question answering): QuestionAnsweringReturnType - A promise that resolves to an array or object containing the predicted answers and scores.
Returns (fill-mask): Promise.<(Array<Object>|Object)> - A promise that resolves to an array or object containing the predicted tokens and scores.
Returns (text-to-text generation): Promise.<any> - An array of objects containing the score, predicted token, predicted token string, and the sequence with the predicted token filled in, or an array of such arrays (one for each input text). If only one input text is given, the output will be an array of objects.
Refer to each translation model's documentation for the full list of languages and their corresponding codes.
Language generation pipeline using any ModelWithLMHead or ModelForCausalLM. This pipeline predicts the words that will follow a specified text prompt. NOTE: The full list of generation parameters is documented separately.
Returns (text generation): Promise.<any> - The generated text or texts.
Returns (zero-shot classification): Promise.<(Object|Array<Object>)> - The prediction(s), as a map (or list of maps) from label to score.
Returns (feature extraction): The features computed by the model.
Returns (audio classification): Promise.<(Array<Object>|Object)> - A promise that resolves to an array or object containing the predicted labels and scores.
Returns (automatic speech recognition): Promise.<Object> - A Promise that resolves to an object containing the transcription text and optionally timestamps if return_timestamps is true.
Returns (image-to-text): Promise.<(Object|Array<Object>)> - A Promise that resolves to an object (or array of objects) containing the generated text(s).
Returns (image classification): Promise.<any> - The top classification results for the images.
Returns (image segmentation): Promise.<Array> - The annotated segments.
Returns (zero-shot image classification): Promise.<any> - An array of classifications for each input image or a single classification object if only one input image is provided.
Returns (document question answering): Promise.<(Object|Array<Object>)> - A Promise that resolves to an object (or array of objects) containing the generated text(s).
Returns (text-to-audio): Promise.<Object> - An object containing the generated audio and sampling rate.
Returns (pipeline factory method): Promise.<Pipeline> - A Pipeline object for the specified task.