Pipelines

pipelines

Pipelines provide a high-level, easy-to-use API for running machine learning models.

Example: Instantiate a pipeline using the pipeline function.

import { pipeline } from '@xenova/transformers';

let classifier = await pipeline('sentiment-analysis');
let output = await classifier('I love transformers!');
// [{'label': 'POSITIVE', 'score': 0.999817686}]

pipelines
- static
  - .Pipeline ⇐ Callable
    new Pipeline(options)
    .dispose() ⇒ Promise.<void>
    ._call(texts, ...args) ⇒ Promise.<any>
  - .TextClassificationPipeline
    ._call(texts, options) ⇒ Promise.<(Array<Object>|Object)>
  - .TokenClassificationPipeline
    ._call(texts, options) ⇒ Promise.<(Array<Object>|Object)>
  - .QuestionAnsweringPipeline
    ._call(question, context, options) ⇒ QuestionAnsweringReturnType
  - .FillMaskPipeline
    ._call(texts, options) ⇒ Promise.<(Array<Object>|Object)>
  - .Text2TextGenerationPipeline
    ._call(texts, [options]) ⇒ Promise.<any>
  - .SummarizationPipeline
  - .TranslationPipeline
  - .TextGenerationPipeline
    ._call(texts, [generate_kwargs]) ⇒ Promise.<any>
  - .ZeroShotClassificationPipeline
    new ZeroShotClassificationPipeline(options)
    ._call(texts, candidate_labels, options) ⇒ Promise.<(Object|Array<Object>)>
  - .FeatureExtractionPipeline
    ._call(texts, options) ⇒
  - .AudioClassificationPipeline
    new AudioClassificationPipeline(options)
    ._call(audio, options) ⇒ Promise.<(Array<Object>|Object)>
  - .AutomaticSpeechRecognitionPipeline
    new AutomaticSpeechRecognitionPipeline(options)
    ._call(audio, [kwargs]) ⇒ Promise.<Object>
  - .ImageToTextPipeline
    new ImageToTextPipeline(options)
    ._call(images, [generate_kwargs]) ⇒ Promise.<(Object|Array<Object>)>
  - .ImageClassificationPipeline
    new ImageClassificationPipeline(options)
    ._call(images, options) ⇒ Promise.<any>
  - .ImageSegmentationPipeline
    new ImageSegmentationPipeline(options)
    ._call(images, options) ⇒ Promise.<Array>
  - .ZeroShotImageClassificationPipeline
    new ZeroShotImageClassificationPipeline(options)
    ._call(images, candidate_labels, options) ⇒ Promise.<any>
  - .ObjectDetectionPipeline
    new ObjectDetectionPipeline(options)
    ._call(images, options)
  - .DocumentQuestionAnsweringPipeline
    new DocumentQuestionAnsweringPipeline(options)
    ._call(image, question, [generate_kwargs]) ⇒ Promise.<(Object|Array<Object>)>
  - .TextToAudioPipeline
    new TextToAudioPipeline(options)
    ._call(text_inputs, options) ⇒ Promise.<Object>
  - .pipeline(task, [model], [options]) ⇒ Promise.<Pipeline>
- inner
  - ~QuestionAnsweringResult : object
  - ~QuestionAnsweringReturnType : Promise.<(QuestionAnsweringResult|Array<QuestionAnsweringResult>)>
  - ~ChunkCallback : function

pipelines.Pipeline ⇐ Callable

The Pipeline class is the class from which all pipelines inherit. Refer to this class for methods shared across different pipelines.

Kind: static class of pipelines
Extends: Callable

.Pipeline ⇐ Callable
- new Pipeline(options)
- .dispose() ⇒ Promise.<void>
- ._call(texts, ...args) ⇒ Promise.<any>

new Pipeline(options)

Create a new Pipeline.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| options | Object | | An object containing the following properties: |
| [options.task] | string | | The task of the pipeline. Useful for specifying subtasks. |
| [options.model] | PreTrainedModel | | The model to use. |
| [options.tokenizer] | PreTrainedTokenizer | | The tokenizer to use (if any). |
| [options.processor] | Processor | | The processor to use (if any). |

pipeline.dispose() ⇒ Promise.<void>

Disposes the model.

Kind: instance method of Pipeline
Returns: Promise.<void> - A promise that resolves when the model has been disposed.

pipeline._call(texts, ...args) ⇒ Promise.<any>

Executes the task associated with the pipeline.

Kind: instance method of Pipeline
Returns: Promise.<any> - A promise that resolves to an array containing the inputs and outputs of the task.

| Param | Type | Description |
| --- | --- | --- |
| texts | any | The input texts to be processed. |
| ...args | any | Additional arguments. |
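
Because Pipeline extends Callable, a pipeline instance can be invoked like a function, as in `classifier('I love transformers!')`, and the call is forwarded to `_call`. The following is a minimal sketch of one way such a pattern can be built in plain JavaScript. The class names `CallableSketch` and `EchoPipeline` are hypothetical, and this is not presented as the library's actual implementation.

```javascript
// Hypothetical sketch: instances are functions that forward to `_call`.
class CallableSketch {
  constructor() {
    // The closure is the object that callers actually receive from `new`.
    const closure = (...args) => closure._call(...args);
    // Give the function the subclass prototype so that methods resolve on it.
    return Object.setPrototypeOf(closure, new.target.prototype);
  }

  _call(...args) {
    throw new Error('Must implement _call method in subclass');
  }
}

// Hypothetical subclass standing in for a real pipeline.
class EchoPipeline extends CallableSketch {
  async _call(texts, ...args) {
    return { texts, extra: args.length };
  }
}

const echo = new EchoPipeline();
// Calling the instance is the same as calling echo._call('hello', 1, 2).
const echoResult = echo('hello', 1, 2);
```

Since `_call` is asynchronous here, `echoResult` is a promise and would normally be awaited.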

pipelines.TextClassificationPipeline

Text classification pipeline using any ModelForSequenceClassification.

Example: Sentiment-analysis w/ Xenova/distilbert-base-uncased-finetuned-sst-2-english.

let classifier = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english');
let output = await classifier('I love transformers!');
// [{ label: 'POSITIVE', score: 0.999788761138916 }]

Example: Multilingual sentiment-analysis w/ Xenova/bert-base-multilingual-uncased-sentiment (and return top 5 classes).

let classifier = await pipeline('sentiment-analysis', 'Xenova/bert-base-multilingual-uncased-sentiment');
let output = await classifier('Le meilleur film de tous les temps.', { topk: 5 });
// [
//   { label: '5 stars', score: 0.9610759615898132 },
//   { label: '4 stars', score: 0.03323351591825485 },
//   { label: '3 stars', score: 0.0036155181005597115 },
//   { label: '1 star', score: 0.0011325967498123646 },
//   { label: '2 stars', score: 0.0009423971059732139 }
// ]

Example: Toxic comment classification w/ Xenova/toxic-bert (and return all classes).

let classifier = await pipeline('text-classification', 'Xenova/toxic-bert');
let output = await classifier('I hate you!', { topk: null });
// [
//   { label: 'toxic', score: 0.9593140482902527 },
//   { label: 'insult', score: 0.16187334060668945 },
//   { label: 'obscene', score: 0.03452680632472038 },
//   { label: 'identity_hate', score: 0.0223250575363636 },
//   { label: 'threat', score: 0.019197041168808937 },
//   { label: 'severe_toxic', score: 0.005651099607348442 }
// ]

Kind: static class of pipelines

textClassificationPipeline._call(texts, options) ⇒ Promise.<(Array<Object>|Object)>

Executes the text classification task.

Kind: instance method of TextClassificationPipeline
Returns: Promise.<(Array<Object>|Object)> - A promise that resolves to an array or object containing the predicted labels and scores.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| texts | any | | The input texts to be classified. |
| options | Object | | An optional object containing the following properties: |
| [options.topk] | number | 1 | The number of top predictions to be returned. |
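
To make the effect of `topk` concrete, here is a minimal sketch of the selection step in plain JavaScript. The helper `selectTopK` and the scores are hypothetical. The sketch assumes that `topk: null` keeps every label, as the "return all classes" example above suggests, and it does not claim to mirror the library's internals.

```javascript
// Hypothetical helper: keep the `topk` highest-scoring labels.
// `topk = null` keeps every label, sorted from highest to lowest score.
function selectTopK(scored, topk = 1) {
  const sorted = [...scored].sort((a, b) => b.score - a.score);
  return topk === null ? sorted : sorted.slice(0, topk);
}

// Made-up scores for illustration only.
const scored = [
  { label: 'NEGATIVE', score: 0.0002 },
  { label: 'POSITIVE', score: 0.9998 },
];

const best = selectTopK(scored);      // only the POSITIVE entry
const all = selectTopK(scored, null); // both entries, POSITIVE first
```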

pipelines.TokenClassificationPipeline

Named Entity Recognition pipeline using any ModelForTokenClassification.

Example: Perform named entity recognition with Xenova/bert-base-NER.

let classifier = await pipeline('token-classification', 'Xenova/bert-base-NER');
let output = await classifier('My name is Sarah and I live in London');
// [
//   { entity: 'B-PER', score: 0.9980202913284302, index: 4, word: 'Sarah' },
//   { entity: 'B-LOC', score: 0.9994474053382874, index: 9, word: 'London' }
// ]

Example: Perform named entity recognition with Xenova/bert-base-NER (and return all labels).

let classifier = await pipeline('token-classification', 'Xenova/bert-base-NER');
let output = await classifier('Sarah lives in the United States of America', { ignore_labels: [] });
// [
//   { entity: 'B-PER', score: 0.9966587424278259, index: 1, word: 'Sarah' },
//   { entity: 'O', score: 0.9987385869026184, index: 2, word: 'lives' },
//   { entity: 'O', score: 0.9990072846412659, index: 3, word: 'in' },
//   { entity: 'O', score: 0.9988298416137695, index: 4, word: 'the' },
//   { entity: 'B-LOC', score: 0.9995510578155518, index: 5, word: 'United' },
//   { entity: 'I-LOC', score: 0.9990395307540894, index: 6, word: 'States' },
//   { entity: 'I-LOC', score: 0.9986724853515625, index: 7, word: 'of' },
//   { entity: 'I-LOC', score: 0.9975294470787048, index: 8, word: 'America' }
// ]

Kind: static class of pipelines

tokenClassificationPipeline._call(texts, options) ⇒ Promise.<(Array<Object>|Object)>

Executes the token classification task.

Kind: instance method of TokenClassificationPipeline
Returns: Promise.<(Array<Object>|Object)> - A promise that resolves to an array or object containing the predicted labels and scores.

| Param | Type | Description |
| --- | --- | --- |
| texts | any | The input texts to be classified. |
| options | Object | An optional object containing the following properties: |
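
The two examples above suggest that tokens labelled 'O' are dropped unless `ignore_labels` is set to an empty array. A minimal sketch of such a filtering step follows. The helper `filterEntities` is hypothetical, and the default of `['O']` is an assumption inferred from the examples rather than something this reference states.

```javascript
// Hypothetical helper: drop entities whose label is listed in `ignore_labels`.
// The default ['O'] is an assumption inferred from the examples above.
function filterEntities(entities, { ignore_labels = ['O'] } = {}) {
  return entities.filter((e) => !ignore_labels.includes(e.entity));
}

// Made-up predictions for illustration only.
const entities = [
  { entity: 'B-PER', index: 1, word: 'Sarah' },
  { entity: 'O', index: 2, word: 'lives' },
  { entity: 'O', index: 3, word: 'in' },
  { entity: 'B-LOC', index: 4, word: 'London' },
];

const named = filterEntities(entities);                            // B-PER and B-LOC only
const everything = filterEntities(entities, { ignore_labels: [] }); // all four tokens
```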

pipelines.QuestionAnsweringPipeline

Question Answering pipeline using any ModelForQuestionAnswering.

Example: Run question answering with Xenova/distilbert-base-uncased-distilled-squad.

let question = 'Who was Jim Henson?';
let context = 'Jim Henson was a nice puppet.';

let answerer = await pipeline('question-answering', 'Xenova/distilbert-base-uncased-distilled-squad');
let output = await answerer(question, context);
// {
//   "answer": "a nice puppet",
//   "score": 0.5768911502526741
// }

Kind: static class of pipelines

questionAnsweringPipeline._call(question, context, options) ⇒ QuestionAnsweringReturnType

Executes the question answering task.

Kind: instance method of QuestionAnsweringPipeline
Returns: QuestionAnsweringReturnType - A promise that resolves to an array or object containing the predicted answers and scores.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| question | string \| Array<string> | | The question(s) to be answered. |
| context | string \| Array<string> | | The context(s) where the answer(s) can be found. |
| options | Object | | An optional object containing the following properties: |
| [options.topk] | number | 1 | The number of top answer predictions to be returned. |

pipelines.FillMaskPipeline

Masked language modeling prediction pipeline using any ModelWithLMHead.

Example: Perform masked language modelling (a.k.a. “fill-mask”) with Xenova/bert-base-cased.

let unmasker = await pipeline('fill-mask', 'Xenova/bert-base-cased');
let output = await unmasker('The goal of life is [MASK].');
// [
//   { token_str: 'survival', score: 0.06137419492006302, token: 8115, sequence: 'The goal of life is survival.' },
//   { token_str: 'love', score: 0.03902450203895569, token: 1567, sequence: 'The goal of life is love.' },
//   { token_str: 'happiness', score: 0.03253183513879776, token: 9266, sequence: 'The goal of life is happiness.' },
//   { token_str: 'freedom', score: 0.018736306577920914, token: 4438, sequence: 'The goal of life is freedom.' },
//   { token_str: 'life', score: 0.01859794743359089, token: 1297, sequence: 'The goal of life is life.' }
// ]

Example: Perform masked language modelling (a.k.a. “fill-mask”) with Xenova/bert-base-cased (and return top result).

let unmasker = await pipeline('fill-mask', 'Xenova/bert-base-cased');
let output = await unmasker('The Milky Way is a [MASK] galaxy.', { topk: 1 });
// [{ token_str: 'spiral', score: 0.6299987435340881, token: 14061, sequence: 'The Milky Way is a spiral galaxy.' }]

Kind: static class of pipelines

fillMaskPipeline._call(texts, options) ⇒ Promise.<(Array<Object>|Object)>

Fill the masked token in the text(s) given as inputs.

Kind: instance method of FillMaskPipeline
Returns: Promise.<(Array<Object>|Object)> - A promise that resolves to an array or object containing the predicted tokens and scores.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| texts | any | | The masked input texts. |
| options | Object | | An optional object containing the following properties: |
| [options.topk] | number | 5 | The number of top predictions to be returned. |
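
As a rough illustration of how the `sequence` field in the output relates to the input text, the sketch below substitutes the highest-scoring candidate tokens into the mask position. The helper `fillMask`, its `mask_token` option, and the candidate scores are all hypothetical. A real pipeline obtains candidates from the model and the mask token from the tokenizer.

```javascript
// Hypothetical helper: build filled-in sequences from scored candidate tokens.
function fillMask(text, candidates, { topk = 5, mask_token = '[MASK]' } = {}) {
  if (!text.includes(mask_token)) {
    throw new Error(`Mask token (${mask_token}) not found in text.`);
  }
  return [...candidates]
    .sort((a, b) => b.score - a.score) // highest score first
    .slice(0, topk)                    // keep the top-k candidates
    .map(({ token_str, score }) => ({
      token_str,
      score,
      sequence: text.replace(mask_token, token_str),
    }));
}

// Made-up candidates for illustration only.
const candidates = [
  { token_str: 'massive', score: 0.1 },
  { token_str: 'spiral', score: 0.63 },
];

const filled = fillMask('The Milky Way is a [MASK] galaxy.', candidates, { topk: 1 });
```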

pipelines.Text2TextGenerationPipeline

Text2TextGenerationPipeline class for generating text using a model that performs text-to-text generation tasks.

Example: Text-to-text generation w/ Xenova/LaMini-Flan-T5-783M.

let generator = await pipeline('text2text-generation', 'Xenova/LaMini-Flan-T5-783M');
let output = await generator('how can I become more healthy?', {
  max_new_tokens: 100,
});
// [ 'To become more healthy, you can: 1. Eat a balanced diet with plenty of fruits, vegetables, whole grains, lean proteins, and healthy fats. 2. Stay hydrated by drinking plenty of water. 3. Get enough sleep and manage stress levels. 4. Avoid smoking and excessive alcohol consumption. 5. Regularly exercise and maintain a healthy weight. 6. Practice good hygiene and sanitation. 7. Seek medical attention if you experience any health issues.' ]

Kind: static class of pipelines

text2TextGenerationPipeline._call(texts, [options]) ⇒ Promise.<any>

Generates output text(s) from the text(s) given as inputs.

Kind: instance method of Text2TextGenerationPipeline
Returns: Promise.<any> - A promise that resolves to an array containing the generated text(s).

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| texts | string \| Array<string> | | The text or array of texts to be processed. |
| [options] | Object | {} | Options for the fill-mask pipeline. |
| [options.topk] | number | 5 | The number of top-k predictions to return. |

pipelines.SummarizationPipeline

A pipeline for summarization tasks, inheriting from Text2TextGenerationPipeline.

Example: Summarization w/ Xenova/distilbart-cnn-6-6.

let text = 'The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, ' +
  'and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. ' +
  'During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest ' +
  'man-made structure in the world, a title it held for 41 years until the Chrysler Building in New ' +
  'York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to ' +
  'the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the ' +
  'Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second ' +
  'tallest free-standing structure in France after the Millau Viaduct.';

let generator = await pipeline('summarization', 'Xenova/distilbart-cnn-6-6');
let output = await generator(text, {
  max_new_tokens: 100,
});
// [{ summary_text: ' The Eiffel Tower is about the same height as an 81-storey building and the tallest structure in Paris. It is the second tallest free-standing structure in France after the Millau Viaduct.' }]

Kind: static class of pipelines

pipelines.TranslationPipeline

Translates text from one language to another.

Example: Multilingual translation w/ Xenova/nllb-200-distilled-600M.

See here for the full list of languages and their corresponding codes.

let translator = await pipeline('translation', 'Xenova/nllb-200-distilled-600M');
let output = await translator('जीवन एक चॉकलेट बॉक्स की तरह है।', {
  src_lang: 'hin_Deva', // Hindi
  tgt_lang: 'fra_Latn', // French
});
// [{ translation_text: 'La vie est comme une boîte à chocolat.' }]

Example: Multilingual translation w/ Xenova/m2m100_418M.

See here for the full list of languages and their corresponding codes.

let translator = await pipeline('translation', 'Xenova/m2m100_418M');
let output = await translator('生活就像一盒巧克力。', {
  src_lang: 'zh', // Chinese
  tgt_lang: 'en', // English
});
// [{ translation_text: 'Life is like a box of chocolate.' }]

Example: Multilingual translation w/ Xenova/mbart-large-50-many-to-many-mmt.

See here for the full list of languages and their corresponding codes.

let translator = await pipeline('translation', 'Xenova/mbart-large-50-many-to-many-mmt');
let output = await translator('संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है', {
  src_lang: 'hi_IN', // Hindi
  tgt_lang: 'fr_XX', // French
});
// [{ translation_text: "Le chef des Nations affirme qu 'il n 'y a military solution in Syria." }]

Kind: static class of pipelines

pipelines.TextGenerationPipeline

Language generation pipeline using any ModelWithLMHead or ModelForCausalLM. This pipeline predicts the words that will follow a specified text prompt. NOTE: For the full list of generation parameters, see GenerationConfig.

Example: Text generation with Xenova/distilgpt2 (default settings).

let text = 'I enjoy walking with my cute dog,';
let classifier = await pipeline('text-generation', 'Xenova/distilgpt2');
let output = await classifier(text);
// [{ generated_text: "I enjoy walking with my cute dog, and I love to play with the other dogs." }]

Example: Text generation with Xenova/distilgpt2 (custom settings).

let text = 'Once upon a time, there was';
let classifier = await pipeline('text-generation', 'Xenova/distilgpt2');
let output = await classifier(text, {
  temperature: 2,
  max_new_tokens: 10,
  repetition_penalty: 1.5,
  no_repeat_ngram_size: 2,
  num_beams: 2,
  num_return_sequences: 2,
});
// [{
//   "generated_text": "Once upon a time, there was an abundance of information about the history and activities that"
// }, {
//   "generated_text": "Once upon a time, there was an abundance of information about the most important and influential"
// }]

Example: Run code generation with Xenova/codegen-350M-mono.

let text = 'def fib(n):';
let classifier = await pipeline('text-generation', 'Xenova/codegen-350M-mono');
let output = await classifier(text, {
  max_new_tokens: 44,
});
// [{
//   generated_text: 'def fib(n):\n' +
//     '    if n == 0:\n' +
//     '        return 0\n' +
//     '    elif n == 1:\n' +
//     '        return 1\n' +
//     '    else:\n' +
//     '        return fib(n-1) + fib(n-2)\n'
// }]

Kind: static class of pipelines

textGenerationPipeline._call(texts, [generate_kwargs]) ⇒ Promise.<any>

Generates text based on an input prompt.

Kind: instance method of TextGenerationPipeline
Returns: Promise.<any> - The generated text or texts.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| texts | any | | The input prompt or prompts to generate text from. |
| [generate_kwargs] | Object | {} | Additional arguments for text generation. |

pipelines.ZeroShotClassificationPipeline

NLI-based zero-shot classification pipeline using a ModelForSequenceClassification trained on NLI (natural language inference) tasks. It is equivalent to the text-classification pipelines, except that these models do not require a hardcoded number of potential classes: the candidate labels can be chosen at runtime. This usually makes it slower, but much more flexible.

Example: Zero shot classification with Xenova/mobilebert-uncased-mnli.

let text = 'Last week I upgraded my iOS version and ever since then my phone has been overheating whenever I use your app.';
let labels = [ 'mobile', 'billing', 'website', 'account access' ];
let classifier = await pipeline('zero-shot-classification', 'Xenova/mobilebert-uncased-mnli');
let output = await classifier(text, labels);
// {
//   sequence: 'Last week I upgraded my iOS version and ever since then my phone has been overheating whenever I use your app.',
//   labels: [ 'mobile', 'website', 'billing', 'account access' ],
//   scores: [ 0.5562091040482018, 0.1843621307860853, 0.13942646639336376, 0.12000229877234923 ]
// }

Example: Zero shot classification with Xenova/nli-deberta-v3-xsmall (multi-label).

let text = 'I have a problem with my iphone that needs to be resolved asap!';
let labels = [ 'urgent', 'not urgent', 'phone', 'tablet', 'computer' ];
let classifier = await pipeline('zero-shot-classification', 'Xenova/nli-deberta-v3-xsmall');
let output = await classifier(text, labels, { multi_label: true });
// {
//   sequence: 'I have a problem with my iphone that needs to be resolved asap!',
//   labels: [ 'urgent', 'phone', 'computer', 'tablet', 'not urgent' ],
//   scores: [ 0.9958870956360275, 0.9923963400697035, 0.002333537946160235, 0.0015134138567598765, 0.0010699384208377163 ]
// }

Kind: static class of pipelines

.ZeroShotClassificationPipeline
- new ZeroShotClassificationPipeline(options)
- ._call(texts, candidate_labels, options) ⇒ Promise.<(Object|Array<Object>)>

new ZeroShotClassificationPipeline(options)

Create a new ZeroShotClassificationPipeline.

| Param | Type | Description |
| --- | --- | --- |
| options | Object | An object containing the following properties: |
| [options.task] | string | The task of the pipeline. Useful for specifying subtasks. |
| [options.model] | PreTrainedModel | The model to use. |
| [options.tokenizer] | PreTrainedTokenizer | The tokenizer to use. |

zeroShotClassificationPipeline._call(texts, candidate_labels, options) ⇒ Promise.<(Object|Array<Object>)>

Kind: instance method of ZeroShotClassificationPipeline
Returns: Promise.<(Object|Array<Object>)> - The prediction(s), as a map (or list of maps) from label to score.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| texts | Array.<any> | | |
| candidate_labels | Array.<string> | | |
| options | Object | | Additional options: |
| [options.hypothesis_template] | string | "This example is {}." | The template used to turn each candidate label into an NLI-style hypothesis. The candidate label will replace the {} placeholder. |
| [options.multi_label] | boolean | false | Whether or not multiple candidate labels can be true. If false, the scores are normalized such that the sum of the label likelihoods for each sequence is 1. If true, the labels are considered independent and probabilities are normalized for each candidate by doing a softmax of the entailment score vs. the contradiction score. |
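
The two options can be illustrated with a short sketch in plain JavaScript: building one hypothesis per candidate label from the template, then normalizing scores in the two ways described for `multi_label`. The helpers `buildHypotheses` and `scoreLabels`, and the logit values, are hypothetical. This is a sketch of the described normalization, not the library's code.

```javascript
// Numerically stable softmax over an array of numbers.
function softmax(values) {
  const max = Math.max(...values);
  const exps = values.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Hypothetical helper: replace the {} placeholder with each candidate label.
function buildHypotheses(labels, template = 'This example is {}.') {
  return labels.map((label) => template.replace('{}', label));
}

// Hypothetical helper: `logits` holds one { entailment, contradiction } pair per label.
function scoreLabels(logits, multi_label = false) {
  if (multi_label) {
    // Each label is independent: softmax of entailment vs. contradiction.
    return logits.map(({ entailment, contradiction }) => softmax([contradiction, entailment])[1]);
  }
  // Labels compete: softmax of entailment logits across labels, so scores sum to 1.
  return softmax(logits.map((l) => l.entailment));
}

const hypotheses = buildHypotheses(['urgent', 'phone']);
// Made-up logits for illustration only.
const logits = [
  { entailment: 2, contradiction: -2 },
  { entailment: 2, contradiction: -2 },
];
const single = scoreLabels(logits);      // scores sum to 1 across labels
const multi = scoreLabels(logits, true); // each score is judged on its own
```

With these made-up logits, both labels are equally likely in single-label mode, while in multi-label mode both can score high at the same time.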

pipelines.FeatureExtractionPipeline

Feature extraction pipeline using no model head. This pipeline extracts the hidden states from the base transformer, which can be used as features in downstream tasks.

Example: Run feature extraction with bert-base-uncased (without pooling/normalization).

let extractor = await pipeline('feature-extraction', 'Xenova/bert-base-uncased', { revision: 'default' });
let output = await extractor('This is a simple test.');
// Tensor {
//   type: 'float32',
//   data: Float32Array [0.05939924716949463, 0.021655935794115067, ...],
//   dims: [1, 8, 768]
// }

Example: Run feature extraction with bert-base-uncased (with pooling/normalization).

let extractor = await pipeline('feature-extraction', 'Xenova/bert-base-uncased', { revision: 'default' });
let output = await extractor('This is a simple test.', { pooling: 'mean', normalize: true });
// Tensor {
//   type: 'float32',
//   data: Float32Array [0.03373778983950615, -0.010106077417731285, ...],
//   dims: [1, 768]
// }

Example: Calculating embeddings with sentence-transformers models.

let extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
let output = await extractor('This is a simple test.', { pooling: 'mean', normalize: true });
// Tensor {
//   type: 'float32',
//   data: Float32Array [0.09094982594251633, -0.014774246141314507, ...],
//   dims: [1, 384]
// }

Kind: static class of pipelines

featureExtractionPipeline._call(texts, options) ⇒

Extract the features of the input(s).

Kind: instance method of FeatureExtractionPipeline
Returns: The features computed by the model.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| texts | string \| Array<string> | | The input texts |
| options | Object | | Additional options: |
| [options.pooling] | string | "none" | The pooling method to use. Can be one of: "none", "mean". |
| [options.normalize] | boolean | false | Whether or not to normalize the embeddings in the last dimension. |
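
The examples above show the output shape going from [1, 8, 768] to [1, 768] once `pooling: 'mean'` is set. The sketch below shows what mean pooling followed by normalization does to a tiny made-up input. The helpers `meanPool` and `l2Normalize` are hypothetical, the normalization is assumed to be an L2 norm, and the sketch ignores details such as attention masks that a real implementation may take into account.

```javascript
// Hypothetical helper: average the token embeddings of one text.
// `tokens` is an array of equally sized number arrays (one per token).
function meanPool(tokens) {
  const dim = tokens[0].length;
  const pooled = new Array(dim).fill(0);
  for (const token of tokens) {
    for (let i = 0; i < dim; ++i) {
      pooled[i] += token[i] / tokens.length;
    }
  }
  return pooled;
}

// Hypothetical helper: scale a vector to unit length (assumed L2 norm).
function l2Normalize(vector) {
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  return vector.map((v) => v / norm);
}

// Made-up input: 2 tokens with a hidden size of 2.
const tokens = [[1, 2], [5, 6]];
const pooled = meanPool(tokens);  // one vector of size 2
const unit = l2Normalize(pooled); // same direction, length 1
```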

pipelines.AudioClassificationPipeline

Audio classification pipeline using any AutoModelForAudioClassification. This pipeline predicts the class of a raw waveform or an audio file.

Example: Perform audio classification.

let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav';
let classifier = await pipeline('audio-classification', 'Xenova/wav2vec2-large-xlsr-53-gender-recognition-librispeech');
let output = await classifier(url);
// [
//   { label: 'male', score: 0.9981542229652405 },
//   { label: 'female', score: 0.001845747814513743 }
// ]

Kind: static class of pipelines

.AudioClassificationPipeline
- new AudioClassificationPipeline(options)
- ._call(audio, options) ⇒ Promise.<(Array<Object>|Object)>

new AudioClassificationPipeline(options)

Create a new AudioClassificationPipeline.

| Param | Type | Description |
| --- | --- | --- |
| options | Object | An object containing the following properties: |
| [options.task] | string | The task of the pipeline. Useful for specifying subtasks. |
| [options.model] | PreTrainedModel | The model to use. |
| [options.processor] | Processor | The processor to use. |

audioClassificationPipeline._call(audio, options) ⇒ Promise.<(Array<Object>|Object)>

Executes the audio classification task.

Kind: instance method of AudioClassificationPipeline
Returns: Promise.<(Array<Object>|Object)> - A promise that resolves to an array or object containing the predicted labels and scores.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| audio | any | | The input audio files to be classified. |
| options | Object | | An optional object containing the following properties: |
| [options.topk] | number | 5 | The number of top predictions to be returned. |

pipelines.AutomaticSpeechRecognitionPipeline

Pipeline that aims at extracting spoken text contained within some audio.

Example: Transcribe English.

let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav';
let transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en');
let output = await transcriber(url);
// { text: " And so my fellow Americans ask not what your country can do for you, ask what you can do for your country." }

Example: Transcribe English w/ timestamps.

let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav';
let transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en');
let output = await transcriber(url, { return_timestamps: true });
// {
//   text: " And so my fellow Americans ask not what your country can do for you, ask what you can do for your country.",
//   chunks: [
//     { timestamp: [0, 8],  text: " And so my fellow Americans ask not what your country can do for you" },
//     { timestamp: [8, 11], text: " ask what you can do for your country." }
//   ]
// }

Example: Transcribe English w/ word-level timestamps.

let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav';
let transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en', {
    revision: 'output_attentions',
});
let output = await transcriber(url, { return_timestamps: 'word' });
// {
//   "text": " And so my fellow Americans ask not what your country can do for you ask what you can do for your country.",
//   "chunks": [
//     { "text": " And", "timestamp": [0, 0.78] },
//     { "text": " so", "timestamp": [0.78, 1.06] },
//     { "text": " my", "timestamp": [1.06, 1.46] },
//     ...
//     { "text": " for", "timestamp": [9.72, 9.92] },
//     { "text": " your", "timestamp": [9.92, 10.22] },
//     { "text": " country.", "timestamp": [10.22, 13.5] }
//   ]
// }

Example: Transcribe French.

let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/french-audio.mp3';
let transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-small');
let output = await transcriber(url, { language: 'french', task: 'transcribe' });
// { text: " J'adore, j'aime, je n'aime pas, je déteste." }

Example: Translate French to English.

let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/french-audio.mp3';
let transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-small');
let output = await transcriber(url, { language: 'french', task: 'translate' });
// { text: " I love, I like, I don't like, I hate." }

Example: Transcribe/translate audio longer than 30 seconds.

let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/ted_60.wav';
let transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en');
let output = await transcriber(url, { chunk_length_s: 30, stride_length_s: 5 });
// { text: " So in college, I was a government major, which means [...] So I'd start off light and I'd bump it up" }

Kind: static class of pipelines

.AutomaticSpeechRecognitionPipeline
- new AutomaticSpeechRecognitionPipeline(options)
- ._call(audio, [kwargs]) ⇒ Promise.<Object>

new AutomaticSpeechRecognitionPipeline(options)

Create a new AutomaticSpeechRecognitionPipeline.

| Param | Type | Description |
| --- | --- | --- |
| options | Object | An object containing the following properties: |
| [options.task] | string | The task of the pipeline. Useful for specifying subtasks. |
| [options.model] | PreTrainedModel | The model to use. |
| [options.tokenizer] | PreTrainedTokenizer | The tokenizer to use. |
| [options.processor] | Processor | The processor to use. |

automaticSpeechRecognitionPipeline._call(audio, [kwargs]) ⇒ Promise.<Object>

Asynchronously processes audio and generates text transcription using the model.

Kind: instance method of AutomaticSpeechRecognitionPipeline
Returns: Promise.<Object> - A Promise that resolves to an object containing the transcription text and optionally timestamps if return_timestamps is true.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| audio | Float32Array \| Array<Float32Array> | | The audio to be transcribed. Can be a single Float32Array or an array of Float32Arrays. |
| [kwargs] | Object | {} | Optional arguments. |
| [kwargs.return_timestamps] | boolean \| 'word' | | Whether to return timestamps or not. Default is false. |
| [kwargs.chunk_length_s] | number | | The length of audio chunks to process in seconds. Default is 0 (no chunking). |
| [kwargs.stride_length_s] | number | | The length of overlap between consecutive audio chunks in seconds. If not provided, defaults to chunk_length_s / 6. |
| [kwargs.chunk_callback] | ChunkCallback | | Callback function to be called with each chunk processed. |
| [kwargs.force_full_sequences] | boolean | | Whether to force outputting full sequences or not. Default is false. |
| [kwargs.language] | string | | The source language. Default is null, meaning it should be auto-detected. Use this to potentially improve performance if the source language is known. |
| [kwargs.task] | string | | The task to perform. Default is null, meaning it should be auto-detected. |
| [kwargs.forced_decoder_ids] | Array.<Array<number>> | | A list of pairs of integers which indicates a mapping from generation indices to token indices that will be forced before sampling. For example, [[1, 123]] means the second generated token will always be a token of index 123. |
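
To show how `chunk_length_s` and `stride_length_s` interact, the sketch below computes the time windows that long audio would be split into. The helper `chunkWindows` is hypothetical. It reads the description literally, so consecutive chunks overlap by exactly `stride_length_s` seconds, and it uses the documented default of `chunk_length_s / 6`. The library's actual windowing may differ.

```javascript
// Hypothetical helper: split `duration_s` seconds of audio into [start, end] windows.
// Assumption: consecutive windows overlap by exactly `stride_length_s` seconds.
function chunkWindows(duration_s, chunk_length_s, stride_length_s = chunk_length_s / 6) {
  if (chunk_length_s <= 0) {
    return [[0, duration_s]]; // 0 means no chunking
  }
  const step = chunk_length_s - stride_length_s;
  if (step <= 0) {
    throw new Error('chunk_length_s must be larger than stride_length_s');
  }
  const windows = [];
  for (let start = 0; start < duration_s; start += step) {
    const end = Math.min(start + chunk_length_s, duration_s);
    windows.push([start, end]);
    if (end === duration_s) break; // the last window reached the end of the audio
  }
  return windows;
}

// 60 seconds of audio, 30-second chunks, 5 seconds of overlap.
const windows = chunkWindows(60, 30, 5);
const unchunked = chunkWindows(10, 0);
```

If the stride were instead applied on both sides of each chunk, the step between windows would be `chunk_length_s - 2 * stride_length_s`. That variant is not shown here.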

pipelines.ImageToTextPipeline

Image To Text pipeline using an AutoModelForVision2Seq. This pipeline predicts a caption for a given image.

Example: Generate a caption for an image w/ Xenova/vit-gpt2-image-captioning.

let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg';
let captioner = await pipeline('image-to-text', 'Xenova/vit-gpt2-image-captioning');
let output = await captioner(url);
// [{ generated_text: 'a cat laying on a couch with another cat' }]

Kind: static class of pipelines

.ImageToTextPipeline
- new ImageToTextPipeline(options)
- ._call(images, [generate_kwargs]) ⇒ Promise.<(Object|Array<Object>)>

new ImageToTextPipeline(options)

Create a new ImageToTextPipeline.

| Param | Type | Description |
| --- | --- | --- |
| options | Object | An object containing the following properties: |
| [options.task] | string | The task of the pipeline. Useful for specifying subtasks. |
| [options.model] | PreTrainedModel | The model to use. |
| [options.tokenizer] | PreTrainedTokenizer | The tokenizer to use. |
| [options.processor] | Processor | The processor to use. |

imageToTextPipeline._call(images, [generate_kwargs]) ⇒ Promise.<(Object|Array<Object>)>

Generates text (such as a caption) for the image(s) passed as inputs.

Kind: instance method of ImageToTextPipeline
Returns: Promise.<(Object|Array<Object>)> - A Promise that resolves to an object (or array of objects) containing the generated text(s).

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| images | Array.<any> | | The images to be captioned. |
| [generate_kwargs] | Object | {} | Optional generation arguments. |
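
Because the call may resolve to a single object or to an array of objects, calling code often normalizes the result before reading `generated_text`. The following is a minimal sketch, assuming the output shape shown in the captioning example; `asArray` and `captionsOf` are hypothetical helper names, not part of the library.

```javascript
// Hypothetical helpers (not part of the library): normalize a pipeline result
// that may be either one object or an array of objects.
function asArray(result) {
  return Array.isArray(result) ? result : [result];
}

// Collect the `generated_text` field from every entry.
function captionsOf(result) {
  return asArray(result).map((entry) => entry.generated_text);
}

// Sample value copied from the documented example output.
const sampleCaptions = [{ generated_text: 'a cat laying on a couch with another cat' }];
console.log(captionsOf(sampleCaptions));
// [ 'a cat laying on a couch with another cat' ]
```

The same normalization works for any pipeline documented as returning `Object|Array<Object>`.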

pipelines.ImageClassificationPipeline

Image classification pipeline using any AutoModelForImageClassification. This pipeline predicts the class of an image.

Example: Classify an image.

let classifier = await pipeline('image-classification', 'Xenova/vit-base-patch16-224');
let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
let output = await classifier(url);
// [
//   {label: 'tiger, Panthera tigris', score: 0.632695734500885},
// ]

Example: Classify an image and return top n classes.

let classifier = await pipeline('image-classification', 'Xenova/vit-base-patch16-224');
let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
let output = await classifier(url, { topk: 3 });
// [
//   { label: 'tiger, Panthera tigris', score: 0.632695734500885 },
//   { label: 'tiger cat', score: 0.3634825646877289 },
//   { label: 'lion, king of beasts, Panthera leo', score: 0.00045060308184474707 },
// ]

Example: Classify an image and return all classes.

let classifier = await pipeline('image-classification', 'Xenova/vit-base-patch16-224');
let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
let output = await classifier(url, { topk: 0 });
// [
//   {label: 'tiger, Panthera tigris', score: 0.632695734500885},
//   {label: 'tiger cat', score: 0.3634825646877289},
//   {label: 'lion, king of beasts, Panthera leo', score: 0.00045060308184474707},
//   {label: 'jaguar, panther, Panthera onca, Felis onca', score: 0.00035465499968267977},
//   ...
// ]

Kind: static class of pipelines

.ImageClassificationPipeline
- new ImageClassificationPipeline(options)
- ._call(images, options) ⇒ Promise.<any>

new ImageClassificationPipeline(options)

Create a new ImageClassificationPipeline.

| Param | Type | Description |
| --- | --- | --- |
| options | Object | An object containing the following properties: |
| [options.task] | string | The task of the pipeline. Useful for specifying subtasks. |
| [options.model] | PreTrainedModel | The model to use. |
| [options.processor] | Processor | The processor to use. |

imageClassificationPipeline._call(images, options) ⇒ Promise.<any>

Classify the given images.

Kind: instance method of ImageClassificationPipeline

Returns: Promise.<any> - The top classification results for the images.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| images | any | | The images to classify. |
| options | Object | | The options to use for classification. |
| [options.topk] | number | 1 | The number of top results to return. |
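
The `topk` option can be pictured as a sort-and-slice over per-class scores. The sketch below is an illustration under stated assumptions, not the library's implementation: `softmax` and `topKLabels` are hypothetical names, the logits are made-up numbers, and treating `topk: 0` as "return every class" simply follows the documented example.

```javascript
// Illustrative only: convert raw logits to probabilities.
function softmax(logits) {
  const max = Math.max(...logits); // subtract the max for numerical stability
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((x) => x / sum);
}

// Rank labels by score and keep the best `topk` (all of them when topk is 0).
function topKLabels(logits, labels, topk = 1) {
  const ranked = softmax(logits)
    .map((score, i) => ({ label: labels[i], score }))
    .sort((a, b) => b.score - a.score);
  return topk > 0 ? ranked.slice(0, topk) : ranked;
}

// Made-up logits for three classes.
const topTwo = topKLabels([2.0, 0.5, -1.0], ['tiger', 'tiger cat', 'lion'], 2);
console.log(topTwo.map((r) => r.label)); // [ 'tiger', 'tiger cat' ]
```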

pipelines.ImageSegmentationPipeline

Image segmentation pipeline using any AutoModelForXXXSegmentation. This pipeline predicts masks of objects and their classes.

Example: Perform image segmentation with Xenova/detr-resnet-50-panoptic.

let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg';
let segmenter = await pipeline('image-segmentation', 'Xenova/detr-resnet-50-panoptic');
let output = await segmenter(url);
// [
//   { label: 'remote', score: 0.9984649419784546, mask: RawImage { ... } },
//   { label: 'cat', score: 0.9994316101074219, mask: RawImage { ... } }
// ]

Kind: static class of pipelines

.ImageSegmentationPipeline
- new ImageSegmentationPipeline(options)
- ._call(images, options) ⇒ Promise.<Array>

new ImageSegmentationPipeline(options)

Create a new ImageSegmentationPipeline.

| Param | Type | Description |
| --- | --- | --- |
| options | Object | An object containing the following properties: |
| [options.task] | string | The task of the pipeline. Useful for specifying subtasks. |
| [options.model] | PreTrainedModel | The model to use. |
| [options.processor] | Processor | The processor to use. |

imageSegmentationPipeline._call(images, options) ⇒ Promise.<Array>

Segment the input images.

Kind: instance method of ImageSegmentationPipeline

Returns: Promise.<Array> - The annotated segments.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| images | Array | | The input images. |
| options | Object | | The options to use for segmentation. |
| [options.threshold] | number | 0.5 | Probability threshold to filter out predicted masks. |
| [options.mask_threshold] | number | 0.5 | Threshold to use when turning the predicted masks into binary values. |
| [options.overlap_mask_area_threshold] | number | 0.8 | Mask overlap threshold to eliminate small, disconnected segments. |
| [options.subtask] | null \| string | | Segmentation task to be performed. One of panoptic, instance, or semantic, depending on model capabilities. If not set, the pipeline tries to resolve the subtask in that order. |
| [options.label_ids_to_fuse] | Array | | List of label ids to fuse. If not set, no labels are fused. |
| [options.target_sizes] | Array | | List of target sizes for the input images. If not set, the original image sizes are used. |
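
To make the roles of `threshold` and `mask_threshold` concrete, here is a minimal sketch of the two filtering steps they describe. It is an illustration, not the library's post-processing code: `filterSegments` and `binarizeMask` are hypothetical names, the sample values are made up, and the strict `>` comparison is an assumption.

```javascript
// Hypothetical helpers illustrating the two thresholds; not the library's code.

// Keep only segments whose score exceeds `threshold`.
function filterSegments(segments, threshold = 0.5) {
  return segments.filter((segment) => segment.score > threshold);
}

// Turn per-pixel mask probabilities into 0/1 values using `maskThreshold`.
function binarizeMask(probabilities, maskThreshold = 0.5) {
  return Uint8Array.from(probabilities, (p) => (p > maskThreshold ? 1 : 0));
}

// Made-up sample data.
const keptSegments = filterSegments([
  { label: 'cat', score: 0.99 },
  { label: 'remote', score: 0.31 },
]);
console.log(keptSegments.map((s) => s.label));               // [ 'cat' ]
console.log(Array.from(binarizeMask([0.1, 0.6, 0.5, 0.9]))); // [ 0, 1, 0, 1 ]
```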

pipelines.ZeroShotImageClassificationPipeline

Zero shot image classification pipeline. This pipeline predicts the class of an image when you provide an image and a set of candidate_labels.

Example: Zero shot image classification w/ Xenova/clip-vit-base-patch32.

let classifier = await pipeline('zero-shot-image-classification', 'Xenova/clip-vit-base-patch32');
let url = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
let output = await classifier(url, ['tiger', 'horse', 'dog']);
// [
//   { score: 0.9993917942047119, label: 'tiger' },
//   { score: 0.0003519294841680676, label: 'horse' },
//   { score: 0.0002562698791734874, label: 'dog' }
// ]

Kind: static class of pipelines

.ZeroShotImageClassificationPipeline
- new ZeroShotImageClassificationPipeline(options)
- ._call(images, candidate_labels, options) ⇒ Promise.<any>

new ZeroShotImageClassificationPipeline(options)

Create a new ZeroShotImageClassificationPipeline.

| Param | Type | Description |
| --- | --- | --- |
| options | Object | An object containing the following properties: |
| [options.task] | string | The task of the pipeline. Useful for specifying subtasks. |
| [options.model] | PreTrainedModel | The model to use. |
| [options.tokenizer] | PreTrainedTokenizer | The tokenizer to use. |
| [options.processor] | Processor | The processor to use. |

zeroShotImageClassificationPipeline._call(images, candidate_labels, options) ⇒ Promise.<any>

Classify the input images with candidate labels using a zero-shot approach.

Kind: instance method of ZeroShotImageClassificationPipeline

Returns: Promise.<any> - An array of classifications for each input image or a single classification object if only one input image is provided.

| Param | Type | Description |
| --- | --- | --- |
| images | Array | The input images. |
| candidate_labels | Array.<string> | The candidate labels. |
| options | Object | The options for the classification. |
| [options.hypothesis_template] | string | The hypothesis template to use for zero-shot classification. Default: "This is a photo of {}". |
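
The `{}` placeholder in `hypothesis_template` stands for a candidate label. A minimal sketch of that expansion, assuming a single placeholder per template; `buildHypotheses` is a hypothetical helper name and not the library's internal function:

```javascript
// Hypothetical helper: fill the `{}` placeholder once per candidate label.
function buildHypotheses(candidateLabels, hypothesisTemplate = 'This is a photo of {}') {
  // A replacer function keeps characters such as `$` in a label from being
  // interpreted as replacement patterns.
  return candidateLabels.map((label) => hypothesisTemplate.replace('{}', () => label));
}

console.log(buildHypotheses(['tiger', 'horse', 'dog']));
// [ 'This is a photo of tiger', 'This is a photo of horse', 'This is a photo of dog' ]
```

Each resulting sentence is what a CLIP-style model would compare against the image, which is why a template that matches the image domain (for example "A sketch of {}") may score differently from the default.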

pipelines.ObjectDetectionPipeline

Object detection pipeline using any AutoModelForObjectDetection. This pipeline predicts bounding boxes of objects and their classes.

Example: Run object-detection with Xenova/detr-resnet-50.

let img = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg';

let detector = await pipeline('object-detection', 'Xenova/detr-resnet-50');
let output = await detector(img, { threshold: 0.9 });
// [{
//   "score": 0.9976370930671692,
//   "label": "remote",
//   "box": { "xmin": 31, "ymin": 68, "xmax": 190, "ymax": 118 }
// },
// ...
// {
//   "score": 0.9984092116355896,
//   "label": "cat",
//   "box": { "xmin": 331, "ymin": 19, "xmax": 649, "ymax": 371 }
// }]

Kind: static class of pipelines

.ObjectDetectionPipeline
- new ObjectDetectionPipeline(options)
- ._call(images, options)

new ObjectDetectionPipeline(options)

Create a new ObjectDetectionPipeline.

| Param | Type | Description |
| --- | --- | --- |
| options | Object | An object containing the following properties: |
| [options.task] | string | The task of the pipeline. Useful for specifying subtasks. |
| [options.model] | PreTrainedModel | The model to use. |
| [options.processor] | Processor | The processor to use. |

objectDetectionPipeline._call(images, options)

Detect objects (bounding boxes & classes) in the image(s) passed as inputs.

Kind: instance method of ObjectDetectionPipeline

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| images | Array.<any> | | The input images. |
| options | Object | | The options for the object detection. |
| [options.threshold] | number | 0.9 | The threshold used to filter boxes by score. |
| [options.percentage] | boolean | false | Whether to return the box coordinates as percentages (true) or in pixels (false). |
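
The relationship between pixel and percentage coordinates can be sketched as a division by the image dimensions. This is an illustration only: `toPercentageBox` is a hypothetical helper, the 640x480 image size is made up, and expressing percentages as fractions between 0 and 1 is an assumption about the output convention.

```javascript
// Hypothetical helper: express a pixel-space box relative to the image size.
// Assumes "percentage" means a fraction in [0, 1] of the width or height.
function toPercentageBox(box, imageWidth, imageHeight) {
  return {
    xmin: box.xmin / imageWidth,
    ymin: box.ymin / imageHeight,
    xmax: box.xmax / imageWidth,
    ymax: box.ymax / imageHeight,
  };
}

// Made-up box inside a made-up 640x480 image.
const pixelBox = { xmin: 32, ymin: 48, xmax: 320, ymax: 240 };
console.log(toPercentageBox(pixelBox, 640, 480));
// { xmin: 0.05, ymin: 0.1, xmax: 0.5, ymax: 0.5 }
```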

pipelines.DocumentQuestionAnsweringPipeline

Document Question Answering pipeline using any AutoModelForDocumentQuestionAnswering. The inputs/outputs are similar to the (extractive) question answering pipeline; however, the pipeline takes an image (and optional OCR’d words/boxes) as input instead of text context.

Example: Answer questions about a document with Xenova/donut-base-finetuned-docvqa.

let image = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/invoice.png';
let question = 'What is the invoice number?';

let qa_pipeline = await pipeline('document-question-answering', 'Xenova/donut-base-finetuned-docvqa');
let output = await qa_pipeline(image, question);
// [{ answer: 'us-001' }]

Kind: static class of pipelines

.DocumentQuestionAnsweringPipeline
- new DocumentQuestionAnsweringPipeline(options)
- ._call(image, question, [generate_kwargs]) ⇒ Promise.<(Object|Array<Object>)>

new DocumentQuestionAnsweringPipeline(options)

Create a new DocumentQuestionAnsweringPipeline.

| Param | Type | Description |
| --- | --- | --- |
| options | Object | An object containing the following properties: |
| [options.task] | string | The task of the pipeline. Useful for specifying subtasks. |
| [options.model] | PreTrainedModel | The model to use. |
| [options.tokenizer] | PreTrainedTokenizer | The tokenizer to use. |
| [options.processor] | Processor | The processor to use. |

documentQuestionAnsweringPipeline._call(image, question, [generate_kwargs]) ⇒ Promise.<(Object|Array<Object>)>

Answer the question given as input by using the document.

Kind: instance method of DocumentQuestionAnsweringPipeline

Returns: Promise.<(Object|Array<Object>)> - A Promise that resolves to an object (or array of objects) containing the generated text(s).

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| image | any | | The image of the document to use. |
| question | string | | A question to ask of the document. |
| [generate_kwargs] | Object | {} | Optional generation arguments. |

pipelines.TextToAudioPipeline

Text-to-audio generation pipeline using any AutoModelForTextToWaveform or AutoModelForTextToSpectrogram. This pipeline generates an audio file from an input text and optional other conditional inputs.

Example: Generate audio from text with Xenova/speecht5_tts.

let speaker_embeddings = 'https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/speaker_embeddings.bin';
let synthesizer = await pipeline('text-to-speech', 'Xenova/speecht5_tts', { quantized: false });
let out = await synthesizer('Hello, my dog is cute', { speaker_embeddings });
// {
//   audio: Float32Array(26112) [-0.00005657337896991521, 0.00020583874720614403, ...],
//   sampling_rate: 16000
// }

You can then save the audio to a .wav file with the wavefile package:

import wavefile from 'wavefile';
import fs from 'fs';

let wav = new wavefile.WaveFile();
wav.fromScratch(1, out.sampling_rate, '32f', out.audio);
fs.writeFileSync('out.wav', wav.toBuffer());
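
If 32-bit float samples are not wanted, the audio can be converted to 16-bit PCM first, and the clip length follows from the sample count and `sampling_rate`. A minimal sketch using only typed arrays; `audioDurationSeconds` and `floatTo16BitPCM` are hypothetical helper names, and the sketch assumes the samples are nominally in the range [-1, 1].

```javascript
// Hypothetical helper: clip length in seconds from sample count and rate.
function audioDurationSeconds(audio, samplingRate) {
  return audio.length / samplingRate;
}

// Hypothetical helper: convert float samples to signed 16-bit PCM.
function floatTo16BitPCM(audio) {
  const pcm = new Int16Array(audio.length);
  for (let i = 0; i < audio.length; i++) {
    const s = Math.max(-1, Math.min(1, audio[i])); // clamp out-of-range samples
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;      // scale to the int16 range
  }
  return pcm;
}

// Same length and rate as the documented example output.
console.log(audioDurationSeconds(new Float32Array(26112), 16000)); // 1.632
console.log(Array.from(floatTo16BitPCM(Float32Array.of(0, 1, -1, 0.5))));
// [ 0, 32767, -32768, 16383 ]
```

With the example output above (26112 samples at 16000 Hz), the generated clip is 1.632 seconds long.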

Kind: static class of pipelines

.TextToAudioPipeline
- new TextToAudioPipeline(options)
- ._call(text_inputs, options) ⇒ Promise.<Object>

new TextToAudioPipeline(options)

Create a new TextToAudioPipeline.

| Param | Type | Description |
| --- | --- | --- |
| options | Object | An object containing the following properties: |
| [options.task] | string | The task of the pipeline. Useful for specifying subtasks. |
| [options.model] | PreTrainedModel | The model to use. |
| [options.tokenizer] | PreTrainedTokenizer | The tokenizer to use. |
| [options.processor] | Processor | The processor to use. |
| [options.vocoder] | PreTrainedModel | The vocoder to use. |

textToAudioPipeline._call(text_inputs, options) ⇒ Promise.<Object>

Generates speech/audio from the inputs.

Kind: instance method of TextToAudioPipeline

Returns: Promise.<Object> - An object containing the generated audio and sampling rate.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| text_inputs | string \| Array<string> | | The text(s) to synthesize into audio. |
| options | Object | | Parameters passed to the model generation/forward method. |
| [options.vocoder] | PreTrainedModel | | The vocoder to use (if the model uses one). If not provided, the default HifiGan vocoder is used. |
| [options.speaker_embeddings] | Tensor \| Float32Array \| string \| URL | | |

pipelines.pipeline(task, [model], [options]) ⇒ Promise.<Pipeline>

Utility factory method to build a Pipeline object.

Kind: static method of pipelines

Returns: Promise.<Pipeline> - A Pipeline object for the specified task.

Throws:
- Error If an unsupported pipeline is requested.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| task | string | | The task defining which pipeline will be returned. The currently accepted tasks are listed after this table. |
| [model] | string | null | The name of the pre-trained model to use. If not specified, the default model for the task will be used. |
| [options] | * | | Optional parameters for the pipeline. |

Currently accepted tasks are:

- "audio-classification": will return an AudioClassificationPipeline.
- "automatic-speech-recognition": will return an AutomaticSpeechRecognitionPipeline.
- "document-question-answering": will return a DocumentQuestionAnsweringPipeline.
- "feature-extraction": will return a FeatureExtractionPipeline.
- "fill-mask": will return a FillMaskPipeline.
- "image-classification": will return an ImageClassificationPipeline.
- "image-segmentation": will return an ImageSegmentationPipeline.
- "image-to-text": will return an ImageToTextPipeline.
- "object-detection": will return an ObjectDetectionPipeline.
- "question-answering": will return a QuestionAnsweringPipeline.
- "summarization": will return a SummarizationPipeline.
- "text2text-generation": will return a Text2TextGenerationPipeline.
- "text-classification" (alias "sentiment-analysis" available): will return a TextClassificationPipeline.
- "text-generation": will return a TextGenerationPipeline.
- "token-classification" (alias "ner" available): will return a TokenClassificationPipeline.
- "translation": will return a TranslationPipeline.
- "translation_xx_to_yy": will return a TranslationPipeline.
- "zero-shot-classification": will return a ZeroShotClassificationPipeline.
- "zero-shot-image-classification": will return a ZeroShotImageClassificationPipeline.
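
The alias and error behaviour described for `task` can be sketched as a small lookup. This is an illustration built only from the task names documented here, not the library's implementation; `TASK_ALIASES`, `SUPPORTED_TASKS`, and `resolveTask` are hypothetical names, and the error message text is made up.

```javascript
// Illustrative sketch only; the names below are hypothetical, not library exports.
const TASK_ALIASES = new Map([
  ['sentiment-analysis', 'text-classification'],
  ['ner', 'token-classification'],
]);

const SUPPORTED_TASKS = new Set([
  'audio-classification', 'automatic-speech-recognition',
  'document-question-answering', 'feature-extraction', 'fill-mask',
  'image-classification', 'image-segmentation', 'image-to-text',
  'object-detection', 'question-answering', 'summarization',
  'text2text-generation', 'text-classification', 'text-generation',
  'token-classification', 'translation', 'zero-shot-classification',
  'zero-shot-image-classification',
]);

// Map a requested task string to its canonical task name, or throw.
function resolveTask(task) {
  let name = TASK_ALIASES.get(task) ?? task;
  // "translation_xx_to_yy" style names map to the translation task.
  if (/^translation_.+_to_.+$/.test(name)) name = 'translation';
  if (!SUPPORTED_TASKS.has(name)) {
    throw new Error(`Unsupported pipeline: ${task}`);
  }
  return name;
}

console.log(resolveTask('sentiment-analysis'));   // text-classification
console.log(resolveTask('translation_en_to_fr')); // translation
```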

pipelines~QuestionAnsweringResult : object

Kind: inner typedef of pipelines

Properties

| Name | Type | Description |
| --- | --- | --- |
| answer | string | The answer. |
| score | number | The score. |

pipelines~QuestionAnsweringReturnType : Promise.<(QuestionAnsweringResult|Array<QuestionAnsweringResult>)>

Kind: inner typedef of pipelines

pipelines~ChunkCallback : function

Kind: inner typedef of pipelines

| Param | Type | Description |
| --- | --- | --- |
| chunk | Chunk | The chunk to process. |

Last updated 1 year ago