Load audio data
You can load an audio dataset using the Audio feature, which automatically decodes and resamples the audio files when you access the examples. Audio decoding is handled by a Python package that uses a C library under the hood.
To work with audio datasets, you need to have the audio dependencies installed. Check out the installation guide to learn how to install them.
You can load your own dataset using the paths to your audio files. Use the cast_column() function to take a column of audio file paths and cast it to the Audio feature:
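A minimal sketch of that cast, assuming the `datasets` library is installed with its audio dependencies. The helper name `load_from_paths` and the file paths are hypothetical; the library import is kept inside the function so that nothing is loaded until you call it.

```python
def load_from_paths(paths):
    """Build a dataset from audio file paths and cast the column to Audio."""
    # Imported lazily: `datasets` is only needed once the helper is called.
    from datasets import Audio, Dataset

    dataset = Dataset.from_dict({"audio": list(paths)})
    # cast_column replaces the plain string column with the Audio feature;
    # the files are decoded when an example is accessed.
    return dataset.cast_column("audio", Audio())


# Hypothetical paths; replace them with real audio files.
example_paths = ["path/to/audio_1.wav", "path/to/audio_2.wav"]
```

Calling `load_from_paths(example_paths)` and then indexing `[0]["audio"]` on the result should trigger decoding of the first file, provided the paths point at real audio.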
You can also load a dataset with an AudioFolder dataset builder. It does not require writing a custom dataloader, making it useful for quickly creating and loading audio datasets with several thousand audio files.
To link your audio files with metadata information, make sure your dataset has a metadata.csv file. Your dataset structure might look like:
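As an illustration, the sketch below builds one possible layout in a temporary directory using only the standard library. The directory and file names (`folder`, `train`, `first_audio_file.wav`, and so on) are hypothetical, and the audio files are one-second silent placeholders.

```python
import tempfile
import wave
from pathlib import Path


def write_silent_wav(path, seconds=1, sample_rate=16000):
    """Write a mono 16-bit WAV file containing silence."""
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * sample_rate * seconds)


# Hypothetical layout: folder/train/ holds metadata.csv next to the audio files.
root = Path(tempfile.mkdtemp()) / "folder"
train_dir = root / "train"
train_dir.mkdir(parents=True)

for name in ("first_audio_file.wav", "second_audio_file.wav"):
    write_silent_wav(train_dir / name)
(train_dir / "metadata.csv").write_text("file_name,transcription\n")

# List every file relative to the parent of `folder`.
layout = sorted(
    p.relative_to(root.parent).as_posix() for p in root.rglob("*") if p.is_file()
)
print("\n".join(layout))
# folder/train/first_audio_file.wav
# folder/train/metadata.csv
# folder/train/second_audio_file.wav
```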
Your metadata.csv file must have a file_name column which links audio files with their metadata. An example metadata.csv file might look like:
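One way to produce such a file is with the standard `csv` module. The rows below are invented for illustration, and the sketch assumes each `file_name` value is a path relative to the directory that holds metadata.csv.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical rows: one entry per audio file, keyed by file_name.
rows = [
    {"file_name": "first_audio_file.wav", "transcription": "hello world"},
    {"file_name": "second_audio_file.wav", "transcription": "good morning"},
]

metadata_path = Path(tempfile.mkdtemp()) / "metadata.csv"
with metadata_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["file_name", "transcription"])
    writer.writeheader()  # first line: file_name,transcription
    writer.writerows(rows)

metadata_lines = metadata_path.read_text(encoding="utf-8").splitlines()
print("\n".join(metadata_lines))
# file_name,transcription
# first_audio_file.wav,hello world
# second_audio_file.wav,good morning
```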
AudioFolder will load audio data and create a transcription column containing texts from metadata.csv:
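A minimal sketch of such a load, assuming `datasets` is installed with its audio dependencies. The wrapper name `load_audio_folder` is hypothetical; `"audiofolder"` and `data_dir` are passed to `load_dataset`.

```python
def load_audio_folder(data_dir):
    """Load a local AudioFolder dataset rooted at `data_dir`."""
    # Imported lazily: `datasets` is only needed once the helper is called.
    from datasets import load_dataset

    return load_dataset("audiofolder", data_dir=data_dir)
```

With a layout that contains a `train` directory, `load_audio_folder("/path/to/folder")["train"][0]["transcription"]` would be expected to return the text of the first metadata row.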
You can load remote datasets from their URLs with the data_files parameter:
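The sketch below shows the shape of such a call. The URL is a placeholder, not a real archive, and the helper name `load_remote_audio_folder` is hypothetical.

```python
# Hypothetical URL; replace it with the address of a real audio archive.
ARCHIVE_URL = "https://example.com/path/to/audio_archive.tar.gz"


def load_remote_audio_folder(url=ARCHIVE_URL):
    """Load an AudioFolder dataset from a remote archive via data_files."""
    # Imported lazily: `datasets` is only needed once the helper is called.
    from datasets import load_dataset

    return load_dataset("audiofolder", data_files=url)
```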
Metadata can also be specified as JSON Lines, in which case use metadata.jsonl as the name of the metadata file. This format is helpful when one of the columns is complex, e.g. a list of floats, because it avoids parsing errors and keeps complex values from being read as strings.
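To illustrate why JSON Lines suits complex columns, the sketch below writes a metadata.jsonl file with the standard `json` module and reads it back. The `pitch_contour` column and all values are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical records; pitch_contour is a list of floats per audio file.
records = [
    {"file_name": "first_audio_file.wav", "pitch_contour": [110.0, 112.5, 108.0]},
    {"file_name": "second_audio_file.wav", "pitch_contour": [220.0, 218.25]},
]

metadata_path = Path(tempfile.mkdtemp()) / "metadata.jsonl"
with metadata_path.open("w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")  # one JSON object per line

# Reading the file back keeps the lists as lists of floats, not strings.
with metadata_path.open(encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
print(loaded[0]["pitch_contour"])  # [110.0, 112.5, 108.0]
```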
If you don't have a metadata file, AudioFolder automatically infers the label name from the directory name. If you want to drop automatically created labels, set drop_labels=True. In this case, your dataset will only contain an audio column:
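The sketch below first mimics the label inference on a couple of hypothetical paths (this is an illustration, not the library's implementation), then shows a hypothetical wrapper, `load_without_labels`, that passes `drop_labels=True`.

```python
from pathlib import PurePosixPath

# Hypothetical files laid out as folder/<split>/<label>/<file>.
audio_paths = [
    "folder/train/dog/first_audio_file.wav",
    "folder/train/cat/second_audio_file.wav",
]

# Illustration only: the label corresponds to the parent directory name.
inferred_labels = [PurePosixPath(p).parent.name for p in audio_paths]
print(inferred_labels)  # ['dog', 'cat']


def load_without_labels(data_dir):
    """Load an AudioFolder dataset and skip the inferred label column."""
    # Imported lazily: `datasets` is only needed once the helper is called.
    from datasets import load_dataset

    return load_dataset("audiofolder", data_dir=data_dir, drop_labels=True)
```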
To ignore the information in the metadata file, set drop_metadata=True in load_dataset():
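A minimal sketch of that option, again assuming `datasets` is installed with its audio dependencies; the wrapper name `load_without_metadata` is hypothetical.

```python
def load_without_metadata(data_dir):
    """Load an AudioFolder dataset while ignoring its metadata file."""
    # Imported lazily: `datasets` is only needed once the helper is called.
    from datasets import load_dataset

    return load_dataset("audiofolder", data_dir=data_dir, drop_metadata=True)
```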
For more information about creating your own AudioFolder dataset, take a look at the guide on creating an audio dataset.
For a guide on how to load any type of dataset, take a look at the general loading guide.