Semantic segmentation datasets are used to train a model to classify every pixel in an image. These datasets enable a wide variety of applications, such as background removal from images, stylizing images, or scene understanding for autonomous driving. This guide will show you how to apply transformations to an image segmentation dataset.
Before you start, make sure you have up-to-date versions of albumentations and cv2 installed:
pip install -U albumentations opencv-python
Albumentations is a Python library for performing data augmentation for computer vision. It supports various computer vision tasks such as image classification, object detection, segmentation, and keypoint estimation.
This guide uses the `scene_parse_150` dataset for segmenting and parsing an image into different image regions associated with semantic categories, such as sky, road, person, and bed.
Load the train split of the dataset and take a look at an example:
>>> from datasets import load_dataset
>>> dataset = load_dataset("scene_parse_150", split="train")
>>> index = 10
>>> dataset[index]
{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=683x512 at 0x7FB37B0EC810>,
'annotation': <PIL.PngImagePlugin.PngImageFile image mode=L size=683x512 at 0x7FB37B0EC9D0>,
'scene_category': 927}
The dataset has three fields:
image: a PIL image object.
annotation: segmentation mask of the image.
scene_category: the label or scene category of the image (like "kitchen" or "office").
Next, check out an image with:
>>> dataset[index]["image"]
Similarly, you can check out the respective segmentation mask:
>>> dataset[index]["annotation"]
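The mask is easier to reason about as an array of integer class ids than as a grayscale picture. The sketch below counts how many pixels belong to each label. It uses a small synthetic array in place of `np.array(dataset[index]["annotation"])` so that it runs without downloading the dataset; the class ids in it are made up, and `label_pixel_counts` is a hypothetical helper name, not part of any library.

```python
import numpy as np

# Synthetic 4x6 mask standing in for np.array(dataset[index]["annotation"]).
# Each value is an integer class id; the ids used here are made up.
mask = np.array(
    [
        [0, 0, 3, 3, 3, 3],
        [0, 0, 3, 3, 3, 3],
        [5, 5, 5, 5, 13, 13],
        [5, 5, 5, 5, 13, 13],
    ],
    dtype=np.uint8,
)


def label_pixel_counts(mask: np.ndarray) -> dict[int, int]:
    """Return a {class_id: pixel_count} mapping for a 2D label mask."""
    labels, counts = np.unique(mask, return_counts=True)
    return {int(label): int(count) for label, count in zip(labels, counts)}


counts = label_pixel_counts(mask)
print(counts)
```

Running the same helper on a real annotation should show which class ids are present in that image and how much of the image each one covers.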
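To overlay the mask on the image, you need a color palette that maps each label to a color; the code below expects it from a `create_ade20k_label_colormap()` function. After defining the color palette, you are ready to visualize some overlays.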
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> def visualize_seg_mask(image: np.ndarray, mask: np.ndarray):
...     # start from an all-black color image with the same height and width as the mask
...     color_seg = np.zeros((mask.shape[0], mask.shape[1], 3), dtype=np.uint8)
...     palette = np.array(create_ade20k_label_colormap())
...     # paint every pixel of each label with that label's palette color
...     for label, color in enumerate(palette):
...         color_seg[mask == label, :] = color
...     color_seg = color_seg[..., ::-1]  # reverse the channel order (convert to BGR)
...     img = np.array(image) * 0.5 + color_seg * 0.5  # blend the image and the segmentation map 50/50
...     img = img.astype(np.uint8)
...     plt.figure(figsize=(15, 10))
...     plt.imshow(img)
...     plt.axis("off")
...     plt.show()
>>> visualize_seg_mask(
...     np.array(dataset[index]["image"]),
...     np.array(dataset[index]["annotation"])
... )
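The `visualize_seg_mask` call above depends on `create_ade20k_label_colormap()`, which is not defined on this page. If you only want to try the overlay logic, one option is a stand-in palette. The sketch below is an assumption-laden illustration: `make_placeholder_palette` and `blend_overlay` are hypothetical helpers, the colors are pseudo-random and are not the official ADE20K colors, and the default of 151 entries assumes label ids run from 0 to 150. It also looks up colors with array indexing (`palette[mask]`) instead of looping over labels, and it skips the channel reversal.

```python
import numpy as np


def make_placeholder_palette(num_labels: int = 151, seed: int = 0) -> np.ndarray:
    """Hypothetical stand-in palette: one pseudo-random RGB color per label id.

    These are NOT the official ADE20K colors. num_labels=151 is an assumption
    (label ids 0..150).
    """
    rng = np.random.default_rng(seed)
    palette = rng.integers(0, 256, size=(num_labels, 3), dtype=np.uint8)
    palette[0] = 0  # design choice for this sketch: label 0 is drawn in black
    return palette


def blend_overlay(
    image: np.ndarray, mask: np.ndarray, palette: np.ndarray, alpha: float = 0.5
) -> np.ndarray:
    """Blend an (H, W, 3) uint8 image with the colors of an (H, W) label mask."""
    color_seg = palette[mask]  # per-pixel color lookup, shape (H, W, 3)
    blended = image * (1.0 - alpha) + color_seg * alpha
    return blended.astype(np.uint8)


# Tiny synthetic inputs standing in for the dataset image and annotation.
image = np.full((2, 3, 3), 200, dtype=np.uint8)
mask = np.array([[0, 1, 1], [2, 2, 0]], dtype=np.uint8)

palette = make_placeholder_palette()
overlay = blend_overlay(image, mask, palette)
print(overlay.shape, overlay.dtype)
```

With real data, you would pass `np.array(dataset[index]["image"])` and `np.array(dataset[index]["annotation"])` instead of the synthetic arrays and display the result with `plt.imshow(overlay)`.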
Now apply some augmentations with albumentations. You'll first resize the image and adjust its brightness.