Processors
Processors are used to prepare non-textual inputs (e.g., image or audio) for a model.
Example: Using a WhisperProcessor to prepare an audio input for a model.
Base class for feature extractors.
Constructs a new FeatureExtractor instance.
config (Object): The configuration for the feature extractor.
Feature extractor for image models.
Constructs a new ImageFeatureExtractor instance.
config (Object): The configuration for the feature extractor.
config.image_mean (Array.<number>): The mean values for image normalization.
config.image_std (Array.<number>): The standard deviation values for image normalization.
config.do_rescale (boolean): Whether to rescale the image pixel values to the [0,1] range.
config.rescale_factor (number): The factor to use for rescaling the image pixel values.
config.do_normalize (boolean): Whether to normalize the image pixel values.
config.do_resize (boolean): Whether to resize the image.
config.resample (number): What method to use for resampling.
config.size (number): The size to resize the image to.
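To make the interaction between do_rescale, rescale_factor, do_normalize, image_mean and image_std concrete, here is a minimal sketch in plain JavaScript. It assumes a flat, channel-last pixel buffer and a default rescale factor of 1/255; the helper name rescaleAndNormalize is hypothetical and is not part of the library.

```javascript
// Hypothetical helper (not a library function): rescale, then normalize,
// a flat channel-last pixel buffer using the documented config option names.
function rescaleAndNormalize(pixels, config) {
  const { image_mean, image_std } = config;
  const channels = image_mean.length;
  // Assumption: fall back to 1/255 when no rescale_factor is given.
  const rescaleFactor = config.rescale_factor ?? 1 / 255;
  const out = new Float32Array(pixels.length);

  for (let i = 0; i < pixels.length; ++i) {
    let value = pixels[i];
    if (config.do_rescale) {
      value *= rescaleFactor; // map 0..255 into roughly 0..1
    }
    if (config.do_normalize) {
      const c = i % channels; // channel index for channel-last layout
      value = (value - image_mean[c]) / image_std[c];
    }
    out[i] = value;
  }
  return out;
}

// Two RGB pixels: pure cyan-ish and pure blue-ish extremes.
const normalized = rescaleAndNormalize(Uint8Array.from([0, 255, 255, 0, 0, 255]), {
  do_rescale: true,
  rescale_factor: 1 / 255,
  do_normalize: true,
  image_mean: [0.5, 0.5, 0.5],
  image_std: [0.5, 0.5, 0.5],
});
console.log(normalized.length); // 6
```

With a mean and standard deviation of 0.5 per channel, inputs of 0 and 255 land close to -1 and 1 respectively.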
Resize the image to make a thumbnail. The image is resized so that no dimension is larger than any corresponding dimension of the specified size.
image (RawImage): The image to be resized.
size (Object): The size {"height": h, "width": w} to resize the image to.
[resample] (string | 0 | 1 | 2 | 3 | 4 | 5, default 2): The resampling filter to use.
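The sizing rule described above (no dimension larger than the corresponding dimension of the requested size) can be sketched as follows. This is an illustrative fit-inside computation that also preserves the aspect ratio, which is an assumption; the helper name thumbnailSize is hypothetical and the library's own implementation may differ.

```javascript
// Hypothetical helper (not a library function): compute output dimensions
// that fit inside `size` while keeping the aspect ratio of `input`.
function thumbnailSize(input, size) {
  // The image already fits, so no resize is needed.
  if (input.width <= size.width && input.height <= size.height) {
    return { width: input.width, height: input.height };
  }

  // Compare the width and height ratios by cross-multiplying,
  // which avoids floating-point rounding for integer sizes.
  if (input.width * size.height >= input.height * size.width) {
    // Width is the limiting side.
    const height = Math.max(1, Math.floor((input.height * size.width) / input.width));
    return { width: size.width, height };
  }

  // Height is the limiting side.
  const width = Math.max(1, Math.floor((input.width * size.height) / input.height));
  return { width, height: size.height };
}

const landscape = thumbnailSize({ width: 640, height: 480 }, { width: 224, height: 224 });
console.log(landscape); // { width: 224, height: 168 }
```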
Preprocesses the given image.
image (RawImage): The image to preprocess.
Calls the feature extraction process on an array of image URLs, preprocesses each image, and concatenates the resulting features into a single Tensor.
images (Array.<any>): The URL(s) of the image(s) to extract features from.
...args (any): Additional arguments.
Detr Feature Extractor.
Calls the feature extraction process on an array of image URLs, preprocesses each image, and concatenates the resulting features into a single Tensor.
urls (Array.<any>): The URL(s) of the image(s) to extract features from.
Binarizes the given masks using object_mask_threshold and returns the associated masks, scores, and labels.
class_logits (Tensor): The class logits.
mask_logits (Tensor): The mask logits.
object_mask_threshold (number): A number between 0 and 1 used to binarize the masks.
num_labels (number): The number of labels.
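A rough sketch of this filtering step, using plain arrays instead of Tensor objects. It assumes that scores come from a softmax over the class logits and that index num_labels is a "no object" class that is always dropped; both are assumptions for illustration, and the names removeLowAndNoObjects and softmax are hypothetical.

```javascript
// Numerically stable softmax over a plain array of logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((x) => x / sum);
}

// Hypothetical stand-in: classLogits is an array of per-query logit arrays,
// masks is a parallel array holding one mask per query.
function removeLowAndNoObjects(classLogits, masks, objectMaskThreshold, numLabels) {
  const keptMasks = [];
  const scores = [];
  const labels = [];

  for (let j = 0; j < classLogits.length; ++j) {
    const probs = softmax(classLogits[j]);
    const label = probs.indexOf(Math.max(...probs));

    // Assumption: index numLabels is the "no object" class, so skip it.
    if (label === numLabels) continue;

    // Keep only queries whose best score clears the threshold.
    if (probs[label] > objectMaskThreshold) {
      keptMasks.push(masks[j]);
      scores.push(probs[label]);
      labels.push(label);
    }
  }
  return [keptMasks, scores, labels];
}

const [keptMasks, keptScores, keptLabels] = removeLowAndNoObjects(
  [[5, 0, 0], [0, 0, 5], [0.1, 0.2, 0]],
  ["mask0", "mask1", "mask2"],
  0.5,
  2,
);
console.log(keptMasks, keptLabels); // [ 'mask0' ] [ 0 ]
```

In this example the second query is dropped because it predicts the "no object" class, and the third because its best score is below the threshold.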
Checks whether the segment is valid or not.
mask_labels (Int32Array): Labels for each pixel in the mask.
mask_probs (Array.<Tensor>): Probabilities for each pixel in the masks.
k (number): The class id of the segment.
mask_threshold (number, default 0.5): The mask threshold.
overlap_mask_area_threshold (number, default 0.8): The overlap mask area threshold.
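One way to read the two thresholds is sketched below with plain arrays in place of Tensor objects. It assumes a segment is valid when the pixels finally assigned to k cover more than overlap_mask_area_threshold of the pixels whose probability for k reaches mask_threshold; this interpretation and the name checkSegmentValidity are assumptions for illustration.

```javascript
// Hypothetical stand-in: maskLabels holds the winning label per pixel,
// maskProbs[k][i] is the probability that pixel i belongs to segment k.
function checkSegmentValidity(
  maskLabels,
  maskProbs,
  k,
  maskThreshold = 0.5,
  overlapMaskAreaThreshold = 0.8,
) {
  const indices = []; // pixels finally assigned to segment k
  let originalArea = 0; // pixels where segment k alone passes the threshold

  for (let i = 0; i < maskLabels.length; ++i) {
    if (maskLabels[i] === k) indices.push(i);
    if (maskProbs[k][i] >= maskThreshold) ++originalArea;
  }

  let valid = indices.length > 0 && originalArea > 0;
  if (valid) {
    // Reject segments that lost too much of their area to other segments.
    valid = indices.length / originalArea > overlapMaskAreaThreshold;
  }
  return [valid, indices];
}

const pixelLabels = Int32Array.from([0, 0, 0, 1, 1, 1]);
const pixelProbs = [
  [0.9, 0.8, 0.7, 0.6, 0.2, 0.1], // segment 0
  [0.1, 0.2, 0.3, 0.7, 0.8, 0.9], // segment 1
];
const [isValid, segmentIndices] = checkSegmentValidity(pixelLabels, pixelProbs, 0);
console.log(isValid, segmentIndices); // false [ 0, 1, 2 ]
```

Segment 0 keeps 3 of the 4 pixels it claimed on its own, a ratio of 0.75, so it fails the default 0.8 overlap threshold.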
Computes the segments.
mask_probs (Array.<Tensor>): The mask probabilities.
pred_scores (Array.<number>): The predicted scores.
pred_labels (Array.<number>): The predicted labels.
mask_threshold (number): The mask threshold.
overlap_mask_area_threshold (number): The overlap mask area threshold.
label_ids_to_fuse (Set.<number>): The label ids to fuse.
target_size (Array.<number>): The target size of the image.
Post-process the model output to generate the final panoptic segmentation.
outputs (*): The model output to post-process.
[threshold] (number, default 0.5): The probability score threshold to keep predicted instance masks.
[mask_threshold] (number, default 0.5): Threshold to use when turning the predicted masks into binary values.
[overlap_mask_area_threshold] (number, default 0.8): The overlap mask area threshold to merge or discard small disconnected parts within each binary instance mask.
[label_ids_to_fuse] (Set.<number>): The labels in this set will have all their instances fused together.
[target_sizes] (Array.<Array<number>>): The target sizes to resize the masks to.
Represents a Processor that extracts features from an input.
Creates a new Processor with the given feature extractor.
feature_extractor (FeatureExtractor): The function used to extract features from the input.
Calls the feature_extractor function with the given input.
input (any): The input to extract features from.
...args (any): Additional arguments.
Represents a WhisperProcessor that extracts features from an audio input.
Calls the feature_extractor function with the given audio input.
audio (any): The audio input to extract features from.
Helper class which is used to instantiate pretrained processors with the from_pretrained function. The chosen processor class is determined by the type specified in the processor config.
Example: Load a processor using from_pretrained.
Example: Run an image through a processor.
Instantiate one of the processor classes of the library from a pretrained model.
The processor class to instantiate is selected based on the feature_extractor_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible).
pretrained_model_name_or_path (string): The name or path of the pretrained model. Can be either:
- A string, the model id of a pretrained processor hosted inside a model repo on boincai.com. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing processor files, e.g., ./my_model_directory/.
options (*): Additional options for loading the processor.
Converts bounding boxes from center format to corners format.
arr (Array.<number>): The coordinates of the center of the box together with its width and height (center_x, center_y, width, height).
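The conversion is simple arithmetic: half the width and half the height are subtracted from and added to the center. A minimal sketch follows; the camel-cased name centerToCornersFormat is used here for illustration only.

```javascript
// Convert [center_x, center_y, width, height] into
// [top_left_x, top_left_y, bottom_right_x, bottom_right_y].
function centerToCornersFormat([centerX, centerY, width, height]) {
  return [
    centerX - width / 2, // top-left x
    centerY - height / 2, // top-left y
    centerX + width / 2, // bottom-right x
    centerY + height / 2, // bottom-right y
  ];
}

const corners = centerToCornersFormat([50, 50, 20, 10]);
console.log(corners); // [ 40, 45, 60, 55 ]
```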
Post-processes the outputs of the model (for object detection).
outputs (Object): The outputs of the model that must be post-processed.
outputs.logits (Tensor): The logits.
outputs.pred_boxes (Tensor): The predicted boxes.
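As an illustration of what such post-processing typically involves, the sketch below works on plain arrays rather than Tensor objects. Several details are assumptions and are not stated above: scores come from a softmax, the last class is treated as "no object", and the threshold and targetSize parameters are hypothetical additions. The function name postProcessDetections is likewise hypothetical.

```javascript
// Numerically stable softmax over a plain array of logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((x) => x / sum);
}

// Hypothetical stand-in for the post-processing step.
// logits: one array of class logits per predicted box.
// predBoxes: one [center_x, center_y, width, height] per box, relative to image size.
// targetSize: optional [height, width] used to scale boxes to pixels.
function postProcessDetections(logits, predBoxes, threshold = 0.5, targetSize = null) {
  const results = [];
  for (let i = 0; i < logits.length; ++i) {
    const probs = softmax(logits[i]);
    const label = probs.indexOf(Math.max(...probs));

    // Assumption: the last class means "no object".
    if (label === probs.length - 1) continue;

    const score = probs[label];
    if (score <= threshold) continue;

    // Center format to corner format.
    const [cx, cy, w, h] = predBoxes[i];
    let box = [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2];

    if (targetSize) {
      const [imageHeight, imageWidth] = targetSize;
      // Even indices are x coordinates, odd indices are y coordinates.
      box = box.map((v, j) => v * (j % 2 === 0 ? imageWidth : imageHeight));
    }
    results.push({ score, label, box });
  }
  return results;
}

const detections = postProcessDetections(
  [[4, 0, 0], [0, 0, 4]],
  [[0.5, 0.5, 0.2, 0.4], [0.1, 0.1, 0.1, 0.1]],
  0.5,
  [100, 200],
);
console.log(detections.length); // 1
```

Only the first box survives here: the second one predicts the "no object" class and is skipped.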
Named tuple to indicate that the order we are using is (height x width), even though the graphics industry standard is (width x height).
pixel_values (Tensor): The pixel values of the batched preprocessed images.
original_sizes (Array.<HeightWidth>): Array of two-dimensional tuples like [[480, 640]].
reshaped_input_sizes (Array.<HeightWidth>): Array of two-dimensional tuples like [[1000, 1330]].

original_size (HeightWidth): The original size of the image.
reshaped_input_size (HeightWidth): The reshaped input size of the image.
pixel_values (Tensor): The pixel values of the preprocessed image.

pixel_mask (Tensor)

pixel_values (Tensor)
original_sizes (Array.<HeightWidth>)
reshaped_input_sizes (Array.<HeightWidth>)
input_points (Tensor)
Kind: static class of
Extends: Callable
Kind: static class of
Extends: FeatureExtractor
Kind: instance method of
Returns: Promise.<RawImage> - The resized image.
Kind: instance method of
Returns: Promise.<PreprocessedImage> - The preprocessed image.
Kind: instance method of
Returns: Promise.<ImageFeatureExtractorResult> - An object containing the concatenated pixel values (and other metadata) of the preprocessed images.
Kind: static class of
Extends: ImageFeatureExtractor
Kind: instance method of
Returns: Promise.<DetrFeatureExtractorResult> - An object containing the concatenated pixel values of the preprocessed images.
Kind: instance method of
Kind: instance method of
Returns: * - The binarized masks, the scores, and the labels.
Kind: instance method of
Returns: * - Whether the segment is valid or not, and the indices of the valid labels.
Kind: instance method of
Returns: * - The computed segments.
Kind: instance method of
Kind: static class of
Extends: Callable
Kind: instance method of
Returns: Promise.<any> - A Promise that resolves with the extracted features.
Kind: static class of
Extends: Processor
Kind: instance method of
Returns: Promise.<any> - A Promise that resolves with the extracted features.
Kind: static class of
Kind: static method of
Returns: Promise.<Processor> - A new instance of the Processor class.
Kind: inner method of
Returns: Array.<number> - The coordinates for the top-left and bottom-right corners of the box (top_left_x, top_left_y, bottom_right_x, bottom_right_y).
Kind: inner method of
Returns: Array.<Object> - An array of objects containing the post-processed outputs.
Kind: inner property of
Kind: inner typedef of
Kind: inner typedef of Properties
Kind: inner typedef of Properties
Kind: inner typedef of Properties
Kind: inner typedef of Properties