Processors
processors
Processors are used to prepare non-textual inputs (e.g., image or audio) for a model.
Example: Using a WhisperProcessor to prepare an audio input for a model.
import { AutoProcessor, read_audio } from '@xenova/transformers';
let processor = await AutoProcessor.from_pretrained('openai/whisper-tiny.en');
let audio = await read_audio('https://boincai.com/datasets/Narsil/asr_dummy/resolve/main/mlk.flac', 16000);
let { input_features } = await processor(audio);
// Tensor {
// data: Float32Array(240000) [0.4752984642982483, 0.5597258806228638, 0.56434166431427, ...],
// dims: [1, 80, 3000],
// type: 'float32',
// size: 240000,
// }
static
.ImageFeatureExtractor ⇐
FeatureExtractor
.thumbnail(image, size, [resample])
⇒Promise.<RawImage>
.preprocess(image)
⇒Promise.<PreprocessedImage>
._call(images, ...args)
⇒Promise.<ImageFeatureExtractorResult>
.DetrFeatureExtractor ⇐
ImageFeatureExtractor
._call(urls)
⇒Promise.<DetrFeatureExtractorResult>
.post_process_object_detection()
:post_process_object_detection
.post_process_panoptic_segmentation(outputs, [threshold], [mask_threshold], [overlap_mask_area_threshold], [label_ids_to_fuse], [target_sizes])
⇒Array.<{segmentation: Tensor, segments_info: Array<{id: number, label_id: number, score: number}>}>
.Processor ⇐
Callable
._call(input, ...args)
⇒Promise.<any>
.WhisperProcessor ⇐
Processor
._call(audio)
⇒Promise.<any>
.from_pretrained(pretrained_model_name_or_path, options)
⇒Promise.<Processor>
inner
~center_to_corners_format(arr)
⇒Array.<number>
~post_process_object_detection(outputs)
⇒Array.<Object>
~box
:Array.<number>
~HeightWidth
:*
~ImageFeatureExtractorResult
:object
~PreprocessedImage
:object
~DetrFeatureExtractorResult
:object
~SamImageProcessorResult
:object
processors.FeatureExtractor ⇐ Callable
Base class for feature extractors.
Kind: static class of processors
Extends: Callable
new FeatureExtractor(config)
Constructs a new FeatureExtractor instance.
config
Object
The configuration for the feature extractor.
processors.ImageFeatureExtractor ⇐ FeatureExtractor
Feature extractor for image models.
Kind: static class of processors
Extends: FeatureExtractor
.ImageFeatureExtractor ⇐
FeatureExtractor
.thumbnail(image, size, [resample])
⇒Promise.<RawImage>
.preprocess(image)
⇒Promise.<PreprocessedImage>
._call(images, ...args)
⇒Promise.<ImageFeatureExtractorResult>
new ImageFeatureExtractor(config)
Constructs a new ImageFeatureExtractor instance.
config
Object
The configuration for the feature extractor.
config.image_mean
Array.<number>
The mean values for image normalization.
config.image_std
Array.<number>
The standard deviation values for image normalization.
config.do_rescale
boolean
Whether to rescale the image pixel values to the [0,1] range.
config.rescale_factor
number
The factor to use for rescaling the image pixel values.
config.do_normalize
boolean
Whether to normalize the image pixel values.
config.do_resize
boolean
Whether to resize the image.
config.resample
number
What method to use for resampling.
config.size
number
The size to resize the image to.
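As a rough illustration of how do_rescale, rescale_factor, do_normalize, image_mean, and image_std interact, the sketch below applies them to a flat array of interleaved RGB values. The helper name rescaleAndNormalize, the interleaved channel layout, and the order of operations (rescale first, then normalize) are assumptions made for this example; it is not the library's implementation.

```javascript
// Hypothetical helper for illustration only; not part of the library.
// Assumes `pixels` holds interleaved channel values (e.g. R, G, B, R, G, B, ...).
function rescaleAndNormalize(pixels, numChannels, config) {
  const out = new Float32Array(pixels.length);
  for (let i = 0; i < pixels.length; ++i) {
    let value = pixels[i];
    if (config.do_rescale) {
      // e.g. rescale_factor = 1 / 255 maps 8-bit values into the [0, 1] range
      value *= config.rescale_factor;
    }
    if (config.do_normalize) {
      // Subtract the per-channel mean and divide by the per-channel std
      const channel = i % numChannels;
      value = (value - config.image_mean[channel]) / config.image_std[channel];
    }
    out[i] = value;
  }
  return out;
}

// One RGB pixel (255, 0, 255) with a mean and standard deviation of 0.5 per channel.
const normalized = rescaleAndNormalize([255, 0, 255], 3, {
  do_rescale: true,
  rescale_factor: 1 / 255,
  do_normalize: true,
  image_mean: [0.5, 0.5, 0.5],
  image_std: [0.5, 0.5, 0.5],
});
console.log(Array.from(normalized)); // values close to [1, -1, 1]
```

With these settings, 8-bit pixel values end up roughly in the [-1, 1] range, which is a common input range for vision models.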
imageFeatureExtractor.thumbnail(image, size, [resample]) ⇒ Promise.<RawImage>
Resize the image to make a thumbnail. The image is resized so that no dimension is larger than any corresponding dimension of the specified size.
Kind: instance method of ImageFeatureExtractor
Returns: Promise.<RawImage>
- The resized image.
image
RawImage
The image to be resized.
size
Object
The size {"height": h, "width": w} to resize the image to.
[resample]
string | 0 | 1 | 2 | 3 | 4 | 5
2
The resampling filter to use.
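The sizing rule described for thumbnail can be sketched with plain numbers. The helper name thumbnailSize is hypothetical, and the sketch assumes that the aspect ratio is preserved and that images already fitting inside the target size are left unchanged; the library's exact rounding behavior may differ.

```javascript
// Hypothetical helper for illustration only; not part of the library.
// Assumes the aspect ratio is preserved and that images are never upscaled.
function thumbnailSize(input, target) {
  // The image already fits inside the target box: keep its size.
  if (input.height <= target.height && input.width <= target.width) {
    return { height: input.height, width: input.width };
  }
  // Use the most restrictive scale so that neither dimension exceeds the target.
  const scale = Math.min(target.height / input.height, target.width / input.width);
  return {
    height: Math.max(1, Math.round(input.height * scale)),
    width: Math.max(1, Math.round(input.width * scale)),
  };
}

console.log(thumbnailSize({ height: 533, width: 800 }, { height: 224, width: 224 }));
// { height: 149, width: 224 }
console.log(thumbnailSize({ height: 100, width: 80 }, { height: 224, width: 224 }));
// { height: 100, width: 80 }
```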
imageFeatureExtractor.preprocess(image) ⇒ Promise.<PreprocessedImage>
Preprocesses the given image.
Kind: instance method of ImageFeatureExtractor
Returns: Promise.<PreprocessedImage>
- The preprocessed image.
image
RawImage
The image to preprocess.
imageFeatureExtractor._call(images, ...args) ⇒ Promise.<ImageFeatureExtractorResult>
Calls the feature extraction process on an array of image URLs, preprocesses each image, and concatenates the resulting features into a single Tensor.
Kind: instance method of ImageFeatureExtractor
Returns: Promise.<ImageFeatureExtractorResult>
- An object containing the concatenated pixel values (and other metadata) of the preprocessed images.
images
Array.<any>
The URL(s) of the image(s) to extract features from.
...args
any
Additional arguments.
processors.DetrFeatureExtractor ⇐ ImageFeatureExtractor
Feature extractor for DETR models.
Kind: static class of processors
Extends: ImageFeatureExtractor
.DetrFeatureExtractor ⇐
ImageFeatureExtractor
._call(urls)
⇒Promise.<DetrFeatureExtractorResult>
.post_process_object_detection()
:post_process_object_detection
.post_process_panoptic_segmentation(outputs, [threshold], [mask_threshold], [overlap_mask_area_threshold], [label_ids_to_fuse], [target_sizes])
⇒Array.<{segmentation: Tensor, segments_info: Array<{id: number, label_id: number, score: number}>}>
detrFeatureExtractor._call(urls) ⇒ Promise.<DetrFeatureExtractorResult>
Calls the feature extraction process on an array of image URLs, preprocesses each image, and concatenates the resulting features into a single Tensor.
Kind: instance method of DetrFeatureExtractor
Returns: Promise.<DetrFeatureExtractorResult>
- An object containing the concatenated pixel values of the preprocessed images.
urls
Array.<any>
The URL(s) of the image(s) to extract features from.
detrFeatureExtractor.post_process_object_detection() : post_process_object_detection
Kind: instance method of DetrFeatureExtractor
detrFeatureExtractor.remove_low_and_no_objects(class_logits, mask_logits, object_mask_threshold, num_labels) ⇒ *
Binarizes the given masks using object_mask_threshold and returns the associated masks, scores, and labels.
Kind: instance method of DetrFeatureExtractor
Returns: *
- The binarized masks, the scores, and the labels.
class_logits
Tensor
The class logits.
mask_logits
Tensor
The mask logits.
object_mask_threshold
number
A number between 0 and 1 used to binarize the masks.
num_labels
number
The number of labels.
detrFeatureExtractor.check_segment_validity(mask_labels, mask_probs, k, mask_threshold, overlap_mask_area_threshold) ⇒ *
Checks whether the segment is valid or not.
Kind: instance method of DetrFeatureExtractor
Returns: *
- Whether the segment is valid or not, and the indices of the valid labels.
mask_labels
Int32Array
Labels for each pixel in the mask.
mask_probs
Array.<Tensor>
Probabilities for each pixel in the masks.
k
number
The class id of the segment.
mask_threshold
number
0.5
The mask threshold.
overlap_mask_area_threshold
number
0.8
The overlap mask area threshold.
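One way to read the parameters above is sketched below on plain arrays rather than Tensors. The helper name checkSegmentValidity and the exact rule are assumptions based on the parameter descriptions: a segment for class k is treated as valid when some pixels are labeled k, some pixels have a probability for k of at least mask_threshold, and the ratio between those two areas exceeds overlap_mask_area_threshold. The library's implementation may differ in detail.

```javascript
// Hypothetical helper for illustration only; not part of the library.
// maskLabels: winning class id per pixel; maskProbs[k]: per-pixel probabilities for class k.
function checkSegmentValidity(maskLabels, maskProbs, k, maskThreshold = 0.5, overlapMaskAreaThreshold = 0.8) {
  // Indices of the pixels whose winning label is k
  const maskK = [];
  for (let i = 0; i < maskLabels.length; ++i) {
    if (maskLabels[i] === k) maskK.push(i);
  }
  // Area of the thresholded (binarized) mask for class k
  let originalArea = 0;
  for (const prob of maskProbs[k]) {
    if (prob >= maskThreshold) ++originalArea;
  }
  let maskExists = maskK.length > 0 && originalArea > 0;
  if (maskExists) {
    // Discard segments that lost too much of their area to other classes
    maskExists = maskK.length / originalArea > overlapMaskAreaThreshold;
  }
  return [maskExists, maskK];
}

const maskLabels = new Int32Array([0, 0, 1, 1]);
const maskProbs = [
  [0.9, 0.8, 0.2, 0.1], // class 0
  [0.1, 0.6, 0.8, 0.9], // class 1
];
console.log(checkSegmentValidity(maskLabels, maskProbs, 0)); // [ true, [ 0, 1 ] ]
console.log(checkSegmentValidity(maskLabels, maskProbs, 1)); // [ false, [ 2, 3 ] ]
```

In this example, class 1 is rejected because only 2 of the 3 pixels above the mask threshold were actually assigned to it, and 2/3 is below the 0.8 overlap threshold.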
detrFeatureExtractor.compute_segments(mask_probs, pred_scores, pred_labels, mask_threshold, overlap_mask_area_threshold, label_ids_to_fuse, target_size) ⇒ *
Computes the segments.
Kind: instance method of DetrFeatureExtractor
Returns: *
- The computed segments.
mask_probs
Array.<Tensor>
The mask probabilities.
pred_scores
Array.<number>
The predicted scores.
pred_labels
Array.<number>
The predicted labels.
mask_threshold
number
The mask threshold.
overlap_mask_area_threshold
number
The overlap mask area threshold.
label_ids_to_fuse
Set.<number>
The label ids to fuse.
target_size
Array.<number>
The target size of the image.
detrFeatureExtractor.post_process_panoptic_segmentation(outputs, [threshold], [mask_threshold], [overlap_mask_area_threshold], [label_ids_to_fuse], [target_sizes]) ⇒ Array.<{segmentation: Tensor, segments_info: Array<{id: number, label_id: number, score: number}>}>
Post-process the model output to generate the final panoptic segmentation.
Kind: instance method of DetrFeatureExtractor
outputs
*
The model output to post-process.
[threshold]
number
0.5
The probability score threshold to keep predicted instance masks.
[mask_threshold]
number
0.5
Threshold to use when turning the predicted masks into binary values.
[overlap_mask_area_threshold]
number
0.8
The overlap mask area threshold to merge or discard small disconnected parts within each binary instance mask.
[label_ids_to_fuse]
Set.<number>
The labels in this set will have all of their instances fused together.
[target_sizes]
Array.<Array<number>>
The target sizes to resize the masks to.
processors.Processor ⇐ Callable
Represents a Processor that extracts features from an input.
Kind: static class of processors
Extends: Callable
.Processor ⇐
Callable
._call(input, ...args)
⇒Promise.<any>
new Processor(feature_extractor)
Creates a new Processor with the given feature extractor.
feature_extractor
FeatureExtractor
The function used to extract features from the input.
processor._call(input, ...args) ⇒ Promise.<any>
Calls the feature_extractor function with the given input.
Kind: instance method of Processor
Returns: Promise.<any>
- A Promise that resolves with the extracted features.
input
any
The input to extract features from.
...args
any
Additional arguments.
processors.WhisperProcessor ⇐ Processor
Represents a WhisperProcessor that extracts features from an audio input.
Kind: static class of processors
Extends: Processor
whisperProcessor._call(audio) ⇒ Promise.<any>
Calls the feature_extractor function with the given audio input.
Kind: instance method of WhisperProcessor
Returns: Promise.<any>
- A Promise that resolves with the extracted features.
audio
any
The audio input to extract features from.
processors.AutoProcessor
Helper class used to instantiate pretrained processors with the from_pretrained function. The chosen processor class is determined by the type specified in the processor config.
Example: Load a processor using from_pretrained.
let processor = await AutoProcessor.from_pretrained('openai/whisper-tiny.en');
Example: Run an image through a processor.
import { AutoProcessor, RawImage } from '@xenova/transformers';
let processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
let image = await RawImage.read('https://boincai.com/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
let image_inputs = await processor(image);
// {
// "pixel_values": {
// "dims": [ 1, 3, 224, 224 ],
// "type": "float32",
// "data": Float32Array [ -1.558687686920166, -1.558687686920166, -1.5440893173217773, ... ],
// "size": 150528
// },
// "original_sizes": [
// [ 533, 800 ]
// ],
// "reshaped_input_sizes": [
// [ 224, 224 ]
// ]
// }
Kind: static class of processors
AutoProcessor.from_pretrained(pretrained_model_name_or_path, options) ⇒ Promise.<Processor>
Instantiate one of the processor classes of the library from a pretrained model.
The processor class to instantiate is selected based on the feature_extractor_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible).
Kind: static method of AutoProcessor
Returns: Promise.<Processor>
- A new instance of the Processor class.
pretrained_model_name_or_path
string
The name or path of the pretrained model. Can be either:
- A string, the model id of a pretrained processor hosted inside a model repo on boincai.com. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing processor files, e.g., ./my_model_directory/.
options
*
Additional options for loading the processor.
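The selection step described above, choosing a class from the feature_extractor_type property of the config, can be pictured as a lookup in a registry. The sketch below is only an illustration of that idea: the class names, the registry, and the selectFeatureExtractor helper are all hypothetical, and throwing on an unknown type is a design choice of this example rather than documented library behavior.

```javascript
// Hypothetical classes and registry for illustration only; the library's
// actual class names and mapping may differ.
class SketchFeatureExtractor {
  constructor(config) {
    this.config = config;
  }
}
class SketchImageFeatureExtractor extends SketchFeatureExtractor {}
class SketchAudioFeatureExtractor extends SketchFeatureExtractor {}

// Maps the value of `feature_extractor_type` to the class to instantiate.
const FEATURE_EXTRACTOR_REGISTRY = new Map([
  ['SketchImageFeatureExtractor', SketchImageFeatureExtractor],
  ['SketchAudioFeatureExtractor', SketchAudioFeatureExtractor],
]);

function selectFeatureExtractor(config) {
  const cls = FEATURE_EXTRACTOR_REGISTRY.get(config.feature_extractor_type);
  if (cls === undefined) {
    throw new Error(`Unknown feature_extractor_type: ${config.feature_extractor_type}`);
  }
  return new cls(config);
}

const extractor = selectFeatureExtractor({ feature_extractor_type: 'SketchAudioFeatureExtractor' });
console.log(extractor.constructor.name); // SketchAudioFeatureExtractor
```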
processors~center_to_corners_format(arr) ⇒ Array.<number>
Converts bounding boxes from center format to corners format.
Kind: inner method of processors
Returns: Array.<number>
- The coordinates for the top-left and bottom-right corners of the box (top_left_x, top_left_y, bottom_right_x, bottom_right_y).
arr
Array.<number>
The coordinates of the center of the box and its width and height (center_x, center_y, width, height).
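The conversion from center format to corners format is simple arithmetic: half the width and half the height are subtracted from, and added to, the center. The sketch below re-implements it under the hypothetical name centerToCorners; it follows the input and output orderings documented above but is not the library's source.

```javascript
// Hypothetical re-implementation for illustration only; not the library's source.
// Input:  [center_x, center_y, width, height]
// Output: [top_left_x, top_left_y, bottom_right_x, bottom_right_y]
function centerToCorners([centerX, centerY, width, height]) {
  return [
    centerX - width / 2,
    centerY - height / 2,
    centerX + width / 2,
    centerY + height / 2,
  ];
}

const corners = centerToCorners([50, 40, 20, 10]);
console.log(corners); // [ 40, 35, 60, 45 ]
```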
processors~post_process_object_detection(outputs) ⇒ Array.<Object>
Post-processes the outputs of the model (for object detection).
Kind: inner method of processors
Returns: Array.<Object>
- An array of objects containing the post-processed outputs.
outputs
Object
The outputs of the model that must be post-processed
outputs.logits
Tensor
The logits
outputs.pred_boxes
Tensor
The predicted boxes.
post_process_object_detection~box : Array.<number>
Kind: inner property of post_process_object_detection
processors~HeightWidth : *
Named tuple indicating that sizes are ordered as (height x width), even though the graphics industry standard is (width x height).
Kind: inner typedef of processors
processors~ImageFeatureExtractorResult : object
Kind: inner typedef of processors
Properties
pixel_values
Tensor
The pixel values of the batched preprocessed images.
original_sizes
Array.<HeightWidth>
Array of two-dimensional tuples like [[480, 640]].
reshaped_input_sizes
Array.<HeightWidth>
Array of two-dimensional tuples like [[1000, 1330]].
processors~PreprocessedImage : object
Kind: inner typedef of processors
Properties
original_size
HeightWidth
The original size of the image.
reshaped_input_size
HeightWidth
The reshaped input size of the image.
pixel_values
Tensor
The pixel values of the preprocessed image.
processors~DetrFeatureExtractorResult : object
Kind: inner typedef of processors
Properties
pixel_mask
Tensor
processors~SamImageProcessorResult : object
Kind: inner typedef of processors
Properties
pixel_values
Tensor
original_sizes
Array.<HeightWidth>
reshaped_input_sizes
Array.<HeightWidth>
input_points
Tensor