Process audio data
This guide shows specific methods for processing audio datasets. Learn how to:
Resample the sampling rate.
Use map() with audio datasets.
For a guide on how to process any type of dataset, take a look at the general process guide.
The cast_column() function is used to cast a column to another feature to be decoded. When you use this function with the Audio feature, you can resample the sampling rate:
Audio files are decoded and resampled on the fly, so the next time you access an example, the audio file is resampled to 16 kHz:
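To make the effect of resampling concrete, here is a rough NumPy sketch of what happens to the decoded array. It uses plain linear interpolation as a stand-in, which is not the resampler the library uses; the helper name resample_linear is hypothetical. The point is only that one second of audio at 8 kHz has 8,000 samples, and the same second at 16 kHz has 16,000.

```python
import numpy as np


def resample_linear(waveform, orig_sr, target_sr):
    """Resample a 1-D waveform by linear interpolation (illustration only)."""
    duration = len(waveform) / orig_sr
    n_target = int(round(duration * target_sr))
    orig_times = np.arange(len(waveform)) / orig_sr
    target_times = np.arange(n_target) / target_sr
    return np.interp(target_times, orig_times, waveform)


orig_sr = 8_000
t = np.arange(orig_sr) / orig_sr          # one second of sample times
waveform = np.sin(2 * np.pi * 440 * t)    # 440 Hz tone sampled at 8 kHz

resampled = resample_linear(waveform, orig_sr, 16_000)
print(len(waveform), len(resampled))
```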
For pretrained speech recognition models, load a feature extractor and tokenizer and combine them in a processor:
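One way this combination might look, assuming a Wav2Vec2-style CTC model from the Transformers library. The toy vocabulary and the feature extractor settings below are assumptions chosen for illustration; in practice the vocabulary is built from your transcripts and the feature extractor is usually loaded from the pretrained checkpoint.

```python
import json
import os
import tempfile

from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2Processor,
)

# Toy character vocabulary (hypothetical); a real one comes from your transcripts.
vocab = {"[PAD]": 0, "[UNK]": 1, "|": 2, "a": 3, "b": 4, "c": 5}

with tempfile.TemporaryDirectory() as tmp_dir:
    vocab_path = os.path.join(tmp_dir, "vocab.json")
    with open(vocab_path, "w", encoding="utf-8") as f:
        json.dump(vocab, f)

    # The tokenizer turns transcripts into label ids.
    tokenizer = Wav2Vec2CTCTokenizer(
        vocab_path,
        unk_token="[UNK]",
        pad_token="[PAD]",
        word_delimiter_token="|",
    )

# The feature extractor turns raw waveforms into model inputs.
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1,
    sampling_rate=16_000,
    padding_value=0.0,
    do_normalize=True,
    return_attention_mask=False,
)

# The processor wraps both so a single object handles audio and text.
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)
```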
For fine-tuned speech recognition models, you only need to load a processor:
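A sketch of the fine-tuned case, assuming the Transformers library and the facebook/wav2vec2-base-960h checkpoint as an example of a fine-tuned model (the checkpoint choice and the helper name are assumptions, not something this guide prescribes). The call is wrapped in a function because it downloads files from the Hub the first time it runs.

```python
from transformers import AutoProcessor


def load_finetuned_processor(checkpoint="facebook/wav2vec2-base-960h"):
    """Load the processor (feature extractor + tokenizer) saved with a fine-tuned checkpoint.

    Requires network access on first use; later calls read from the local cache.
    """
    return AutoProcessor.from_pretrained(checkpoint)
```

Calling load_finetuned_processor() returns a single object that prepares both the audio inputs and the text labels.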
The map() function helps preprocess your entire dataset at once. Depending on the type of model you're working with, you'll need to load either a feature extractor or a processor.
When you use map() with your preprocessing function, include the audio column to ensure you're actually resampling the audio data:
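A minimal sketch of such a preprocessing function follows. To stay self-contained it uses only a Wav2Vec2 feature extractor (an assumption; a full processor would also tokenize the transcripts) and a plain dictionary standing in for one decoded dataset row. The key point is that the function reads example["audio"], because accessing that column is what triggers decoding and resampling.

```python
import numpy as np
from transformers import Wav2Vec2FeatureExtractor

# Assumed settings for illustration; normally loaded from a checkpoint.
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1,
    sampling_rate=16_000,
    padding_value=0.0,
    do_normalize=True,
    return_attention_mask=False,
)


def prepare_dataset(example):
    # Reading the audio column triggers decoding and resampling.
    audio = example["audio"]
    example["input_values"] = feature_extractor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_values[0]
    example["input_length"] = len(example["input_values"])
    return example


# Stand-in for one decoded row: one second of silence at 16 kHz.
example = {
    "audio": {"array": np.zeros(16_000, dtype=np.float32), "sampling_rate": 16_000}
}
processed = prepare_dataset(example)
print(processed["input_length"])
```

With a real dataset whose audio column has been cast to 16 kHz, the same function can be passed to map(), for example dataset.map(prepare_dataset, remove_columns=dataset.column_names).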