List splits and configurations
Datasets typically have splits and may also have configurations. A split is a subset of the dataset, like train and test, that is used during a particular stage of training or evaluating a model. A configuration is a sub-dataset contained within a larger dataset. Configurations are especially common in multilingual speech datasets, where there may be a different configuration for each language. To learn more about splits and configurations, see the Datasets library documentation.
This guide shows you how to use Datasets Server’s /splits endpoint to retrieve a dataset’s splits and configurations programmatically. You can also try the endpoint out in an API client of your choice.
The /splits endpoint accepts the dataset name as its query parameter.
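As a rough illustration of such a request, the sketch below builds the /splits URL and fetches the JSON response using only the Python standard library. The base URL, the query parameter name `dataset`, and the placeholder dataset name `my_dataset` are assumptions made for this example, not values taken from this guide.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the public Datasets Server /splits endpoint.
API_URL = "https://datasets-server.huggingface.co/splits"


def build_splits_url(dataset: str) -> str:
    """Return the /splits URL with the dataset name as a query parameter."""
    # Assumption: the query parameter is named "dataset".
    query = urllib.parse.urlencode({"dataset": dataset})
    return f"{API_URL}?{query}"


def fetch_splits(dataset: str, timeout: float = 10.0) -> dict:
    """Send a GET request to /splits and return the decoded JSON body."""
    with urllib.request.urlopen(build_splits_url(dataset), timeout=timeout) as resp:
        return json.load(resp)


# "my_dataset" is a hypothetical placeholder; substitute a real dataset name.
print(build_splits_url("my_dataset"))
```

Calling `fetch_splits("my_dataset")` would perform the actual network request; it is left out of the snippet so the example runs offline.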
The endpoint response is a JSON object containing a list of the dataset’s splits and configurations. For example, the dataset has six splits and two configurations.
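To make the shape of such a response concrete, here is a minimal sketch that groups split names by configuration. It assumes the response holds a top-level `splits` list whose entries carry `dataset`, `config`, and `split` keys. The payload is made up for illustration (hypothetical dataset and configuration names, sized to match six splits across two configurations) and is not real server output.

```python
from collections import defaultdict

# Hypothetical payload; the key names are an assumption about the response shape.
sample_response = {
    "splits": [
        {"dataset": "my_dataset", "config": "config_a", "split": "train"},
        {"dataset": "my_dataset", "config": "config_a", "split": "validation"},
        {"dataset": "my_dataset", "config": "config_a", "split": "test"},
        {"dataset": "my_dataset", "config": "config_b", "split": "train"},
        {"dataset": "my_dataset", "config": "config_b", "split": "validation"},
        {"dataset": "my_dataset", "config": "config_b", "split": "test"},
    ]
}


def group_splits_by_config(response: dict) -> dict:
    """Map each configuration name to the list of its split names."""
    grouped = defaultdict(list)
    for entry in response.get("splits", []):
        grouped[entry["config"]].append(entry["split"])
    return dict(grouped)


for config, splits in group_splits_by_config(sample_response).items():
    print(config, splits)
```

Grouping by configuration is just one convenient view; the flat list can equally be filtered for a single split name across all configurations.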