
Load image data


Last updated 1 year ago

Image datasets store their images in a column of type Image, whose values are returned as PIL objects.

To work with image datasets, you need to have the vision dependency installed. Check out the installation guide to learn how to install it.

When you load an image dataset and call the image column, the Image feature automatically decodes the stored image data into a PIL image:

>>> from datasets import load_dataset, Image

>>> dataset = load_dataset("beans", split="train")
>>> dataset[0]["image"]

Index into an image dataset using the row index first and then the image column (dataset[0]["image"]) so that only the requested image is decoded. Indexing the column first decodes and resamples every image object in the dataset, which is slow for a large dataset.
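The cost difference between the two indexing orders can be pictured with a toy model. The sketch below is not the library's implementation: ToyImageDataset is a hypothetical stand-in that only counts how often a decode step would run, and the real behaviour may differ between library versions.

```python
class ToyImageDataset:
    """Hypothetical stand-in for a dataset whose image column is decoded on access."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.decode_calls = 0  # how many images have been "decoded" so far

    def _decode(self, path):
        # Placeholder for the expensive step of opening and decoding an image.
        self.decode_calls += 1
        return f"<decoded {path}>"

    def __getitem__(self, key):
        if isinstance(key, int):
            # Row access: only the requested row is decoded.
            return {"image": self._decode(self.paths[key])}
        if key == "image":
            # Column access: every row is decoded before any indexing happens.
            return [self._decode(p) for p in self.paths]
        raise KeyError(key)


paths = [f"img_{i}.jpg" for i in range(1000)]

row_first = ToyImageDataset(paths)
row_first[0]["image"]
row_first_calls = row_first.decode_calls

column_first = ToyImageDataset(paths)
column_first["image"][0]
column_first_calls = column_first.decode_calls
```

In this model the row-first access triggers a single decode, while the column-first access triggers one decode per row before the first element is even selected.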

For a guide on how to load any type of dataset, take a look at the general loading guide.

Local files

You can load a dataset from image file paths. Use the cast_column() function to accept a column of image file paths, and decode it into a PIL image with the Image feature:

>>> from datasets import Dataset, Image

>>> dataset = Dataset.from_dict({"image": ["path/to/image_1", "path/to/image_2", ..., "path/to/image_n"]}).cast_column("image", Image())
>>> dataset[0]["image"]
<PIL.PngImagePlugin.PngImageFile image mode=RGBA size=1200x215 at 0x15E6D7160>
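When the images sit in a directory rather than in a hand-written list, the path column can be built programmatically. The following is a minimal sketch and not part of the original guide: collect_image_paths and IMAGE_EXTENSIONS are hypothetical names chosen for illustration, and only the standard library is used to gather the paths.

```python
from pathlib import Path

# Hypothetical set of extensions; adjust to the formats you actually store.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def collect_image_paths(root, extensions=IMAGE_EXTENSIONS):
    """Return sorted string paths of image files found anywhere under root."""
    root = Path(root)
    return sorted(
        str(p)
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )


# Usage with the pattern shown above (requires the datasets library):
#   from datasets import Dataset, Image
#   paths = collect_image_paths("path/to/images")
#   dataset = Dataset.from_dict({"image": paths}).cast_column("image", Image())
```

Sorting the paths keeps the row order stable between runs, which makes row indices reproducible.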

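If you only want to load the underlying path to the image dataset without decoding the image object, set decode=False in the Image feature: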

>>> dataset = load_dataset("beans", split="train").cast_column("image", Image(decode=False))
>>> dataset[0]["image"]
{'bytes': None,
 'path': '/root/.cache/huggingface/datasets/downloads/extracted/b0a21163f78769a2cf11f58dfc767fb458fc7cea5c05dccc0144a2c0f0bc1292/train/bean_rust/bean_rust_train.29.jpg'}
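With decode=False, each entry is a dictionary with 'bytes' and 'path' keys, as in the output above, so opening the image is left to you. Here is a minimal sketch of one way to handle both cases. image_source is a hypothetical helper, not a library function, and it assumes that inline bytes should be preferred when both fields are present.

```python
import io


def image_source(record):
    """Return something an image library can open: a file-like object or a path."""
    if record.get("bytes") is not None:
        # Image stored inline: wrap the raw bytes in an in-memory file.
        return io.BytesIO(record["bytes"])
    if record.get("path"):
        # Image stored on disk: the path can be opened directly.
        return record["path"]
    raise ValueError("record has neither 'bytes' nor 'path'")


# Usage (requires Pillow):
#   import PIL.Image
#   img = PIL.Image.open(image_source(dataset[0]["image"]))
```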

ImageFolder

You can also load a dataset with the ImageFolder dataset builder, which does not require writing a custom dataloader. This makes ImageFolder ideal for quickly creating and loading image datasets with several thousand images for different vision tasks. Your image dataset structure should look like this:

folder/train/dog/golden_retriever.png
folder/train/dog/german_shepherd.png
folder/train/dog/chihuahua.png

folder/train/cat/maine_coon.png
folder/train/cat/bengal.png
folder/train/cat/birman.png
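To make the directory convention concrete, the sketch below parses paths laid out as above into a split and a label. It is an illustration only, not ImageFolder's actual implementation. split_and_label is a hypothetical helper, and the alphabetical ordering of label ids is an assumption.

```python
from pathlib import PurePosixPath


def split_and_label(path):
    """Read the split and label from a path shaped like <root>/<split>/<label>/<file>."""
    parts = PurePosixPath(path).parts
    if len(parts) < 3:
        raise ValueError(f"expected <split>/<label>/<file>, got {path!r}")
    return parts[-3], parts[-2]


paths = [
    "folder/train/dog/golden_retriever.png",
    "folder/train/cat/maine_coon.png",
]
pairs = [split_and_label(p) for p in paths]

# Assumption: label ids follow the alphabetical order of the directory names.
label_names = sorted({label for _, label in pairs})
label_ids = {name: i for i, name in enumerate(label_names)}
```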

Load your dataset by specifying imagefolder and the directory of your dataset in data_dir:

>>> from datasets import load_dataset

>>> dataset = load_dataset("imagefolder", data_dir="/path/to/folder")
>>> dataset["train"][0]
{"image": <PIL.PngImagePlugin.PngImageFile image mode=RGBA size=1200x215 at 0x15E6D7160>, "label": 0}

>>> dataset["train"][-1]
{"image": <PIL.PngImagePlugin.PngImageFile image mode=RGBA size=1200x215 at 0x15E8DAD30>, "label": 1}

Load remote datasets from their URLs with the data_files parameter:

>>> dataset = load_dataset("imagefolder", data_files="https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_3367a.zip", split="train")

Some datasets have a metadata file (metadata.csv/metadata.jsonl) associated with them, containing other information about the data like bounding boxes, text captions, and labels. The metadata is automatically loaded when you call load_dataset() and specify imagefolder.

To ignore the information in the metadata file, set drop_labels=False in load_dataset(), and allow ImageFolder to automatically infer the label name from the directory name:

>>> from datasets import load_dataset

>>> dataset = load_dataset("imagefolder", data_dir="/path/to/folder", drop_labels=False)

The decode=False option shown under Local files also applies here if you only want the underlying path to an image rather than a decoded image object.

For more information about creating your own ImageFolder dataset, take a look at the Create an image dataset guide.
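As an illustration of the metadata file described above, the following sketch writes a small metadata.jsonl into an image folder. The file_name key linking each row to an image, the text caption column, and the captions themselves are assumptions made for this example. Check the Create an image dataset guide for the exact layout ImageFolder expects.

```python
import json
from pathlib import Path


def write_metadata(folder, rows):
    """Write one JSON object per line to <folder>/metadata.jsonl and return its path."""
    target = Path(folder) / "metadata.jsonl"
    with target.open("w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row) + "\n")
    return target


# Hypothetical rows: one per image, keyed by the image's file name.
example_rows = [
    {"file_name": "golden_retriever.png", "text": "a golden retriever"},
    {"file_name": "maine_coon.png", "text": "a maine coon"},
]
```

Calling write_metadata("/path/to/folder/train", example_rows) would place the file alongside the images of that split, under the assumptions stated above.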
