Process image data
This guide shows specific methods for processing image datasets. Learn how to:
Use map() with an image dataset.
Apply data augmentations to a dataset with set_transform().
For a guide on how to process any type of dataset, take a look at the general process guide.
The map() function can apply transforms over an entire dataset.
For example, create a basic function that resizes each image:
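A minimal sketch of such a function is shown below. It assumes the images are PIL.Image objects stored in a column named "image"; the column name and the 100x100 target size are illustrative assumptions, not values fixed by the library.

```python
def transforms(examples):
    # `examples` is a batch: a dict mapping column names to lists of values.
    # Assumption: images are PIL.Image objects in a column named "image".
    # The (100, 100) target size is an arbitrary illustrative choice.
    examples["pixel_values"] = [
        image.convert("RGB").resize((100, 100)) for image in examples["image"]
    ]
    return examples
```

The function receives and returns a batch, so it is meant to be used with batched processing.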
Now use the map() function to resize the entire dataset, and set batched=True to speed up the process by accepting batches of examples. The transform returns pixel_values as a cacheable PIL.Image object:
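The call might look like the following sketch. It repeats a resize function so the block is self-contained, and wraps the call in a hypothetical helper, resize_dataset, so it does not depend on any particular dataset being loaded. The "image" column name is an assumption.

```python
def transforms(examples):
    # Assumption: images are PIL.Image objects in a column named "image".
    examples["pixel_values"] = [
        image.convert("RGB").resize((100, 100)) for image in examples["image"]
    ]
    return examples


def resize_dataset(dataset):
    # batched=True passes batches (dicts of lists) to `transforms`.
    # remove_columns drops the original "image" column from the result,
    # leaving the resized images in "pixel_values".
    return dataset.map(transforms, remove_columns=["image"], batched=True)
```

For a loaded Dataset object, dataset = resize_dataset(dataset) should then return the resized version.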
For example, if you'd like to change the color properties of an image randomly:
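One way to build such a random color transform is sketched below, assuming torchvision is installed. The helper name build_jitter and the jitter strengths are illustrative assumptions; the import sits inside the function so torchvision is only required when the transform is actually built.

```python
def build_jitter():
    # Imported lazily so this module loads even without torchvision.
    from torchvision.transforms import ColorJitter, Compose, ToTensor

    # Randomly perturb brightness, contrast, saturation and hue, then
    # convert the PIL image to a tensor. The strengths are illustrative.
    return Compose(
        [
            ColorJitter(brightness=0.25, contrast=0.25, saturation=0.25, hue=0.1),
            ToTensor(),
        ]
    )
```

Calling jitter = build_jitter() and then jitter(image) on a PIL image should return a tensor with randomly shifted colors.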
Create a function to apply the ColorJitter transform:
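A sketch of such a function follows, written as a hypothetical factory, make_transforms, that takes the composed jitter transform as an argument. The "image" and "pixel_values" column names are assumptions.

```python
def make_transforms(jitter):
    # `jitter` is any callable that maps one image to its augmented version,
    # for example a composed ColorJitter transform.
    def transforms(examples):
        # Assumption: images are PIL.Image objects in a column named "image".
        examples["pixel_values"] = [
            jitter(image.convert("RGB")) for image in examples["image"]
        ]
        return examples

    return transforms
```

Passing the jitter in as an argument keeps the batch function independent of any one augmentation library.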
The cache file saves time because you don't have to execute the same transform twice. The map() function is best for operations you only run once per training, like resizing an image, rather than for operations executed for each epoch, like data augmentations.
map() takes up some memory, but you can reduce its memory requirements with the following parameters:
batch_size determines the number of examples that are processed in one call to the transform function.
writer_batch_size determines the number of processed examples that are kept in memory before they are stored away.
Both parameter values default to 1000, which can be expensive if you are storing images. Lower these values to use less memory when you use map().
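As a rough illustration of lowering both values, the sketch below wraps the map() call in a hypothetical helper, map_with_small_batches. The value 100 is an arbitrary example, not a recommendation; a suitable setting depends on image size and available memory.

```python
def map_with_small_batches(dataset, transforms, batch_size=100, writer_batch_size=100):
    # batch_size: number of examples passed to `transforms` per call.
    # writer_batch_size: number of processed examples held in memory
    # before they are written out.
    return dataset.map(
        transforms,
        batched=True,
        batch_size=batch_size,
        writer_batch_size=writer_batch_size,
    )
```

Smaller values trade some speed for a lower peak memory footprint.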
🤗 Datasets applies data augmentations from any library or package to your dataset. Transforms can be applied on-the-fly on batches of data with set_transform(), which consumes less disk space.
The following example uses torchvision, but feel free to use other data augmentation libraries.
Apply the transform with the set_transform() function:
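A minimal sketch of that call is shown below, wrapped in a hypothetical helper, apply_on_the_fly, so it does not assume a particular dataset. Here transforms stands for a batch function that applies the augmentation.

```python
def apply_on_the_fly(dataset, transforms):
    # set_transform() modifies the dataset in place: `transforms` is run on
    # a batch each time examples are accessed, so nothing is written to the
    # cache and random augmentations differ from one access to the next.
    dataset.set_transform(transforms)
    return dataset
```

After this call, indexing the dataset (for example, dataset[0]) should return examples with the augmentation applied at access time.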