Adding New Datasets
Any BOINC AI user can create a dataset! You can start by creating your dataset repository and choosing one of the following methods to upload your dataset:
In many cases you can simply add raw data files to your dataset repo in any supported format (JSON, CSV, Parquet, text, images, audio files, …). For some large datasets, however, you may want to create a loading script. This script defines the different configurations and splits of your dataset, as well as how to download and process the data.
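As a minimal sketch of the raw-data route, the Python snippet below writes a handful of records as CSV and as JSON Lines, two of the formats listed above, using only the standard library. The function name write_raw_files, the file names train.csv and train.jsonl, and the sample records are illustrative assumptions, not names the Hub requires.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_raw_files(records, out_dir):
    """Write the same records as CSV and as JSON Lines; return both paths.

    `records` is a non-empty list of dicts that all share the same keys.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    csv_path = out_dir / "train.csv"      # hypothetical file name
    jsonl_path = out_dir / "train.jsonl"  # hypothetical file name

    # CSV: one header row, then one row per record.
    fieldnames = list(records[0].keys())
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)

    # JSON Lines: one JSON object per line.
    with jsonl_path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

    return csv_path, jsonl_path


if __name__ == "__main__":
    # Made-up sample data, only to exercise the function.
    sample = [
        {"text": "A short example sentence.", "label": 0},
        {"text": "Another example, with a comma.", "label": 1},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_raw_files(sample, tmp):
            print(path.name, path.stat().st_size, "bytes")
```

Files produced this way could then be uploaded to the dataset repository like any other raw data file. Whether a loading script is still needed depends on the dataset, as described above.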
Datasets outside a namespace
Datasets outside a namespace are maintained by the BOINC AI team. Unlike the naming convention used for community datasets (username/dataset_name or org/dataset_name), datasets outside a namespace can be referenced directly by their name (e.g. glue). If you find that an improvement is needed, use their "Community" tab to open a discussion or submit a PR on the Hub to propose edits.
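The naming convention above can be illustrated with a small sketch that splits a dataset identifier into an optional namespace and a name. The helper parse_dataset_id and the DatasetId type are hypothetical names introduced here for illustration. They are not part of any Hub library, and the validation is deliberately simple.

```python
from typing import NamedTuple, Optional


class DatasetId(NamedTuple):
    """A dataset identifier split into its parts (hypothetical helper type)."""
    namespace: Optional[str]  # user or org name, or None if outside a namespace
    name: str


def parse_dataset_id(dataset_id: str) -> DatasetId:
    """Split 'namespace/name' or a bare 'name' into a DatasetId.

    Raises ValueError for empty parts or more than one slash.
    """
    parts = dataset_id.split("/")
    if len(parts) == 1 and parts[0]:
        # Bare name: a dataset outside a namespace, e.g. "glue".
        return DatasetId(None, parts[0])
    if len(parts) == 2 and all(parts):
        # Community dataset: "username/dataset_name" or "org/dataset_name".
        return DatasetId(parts[0], parts[1])
    raise ValueError(f"not a valid dataset identifier: {dataset_id!r}")


if __name__ == "__main__":
    for example in ("glue", "username/dataset_name", "org/dataset_name"):
        print(example, "->", parse_dataset_id(example))
```

Here a namespace of None marks a dataset that is referenced by name alone, while a non-empty namespace marks a community dataset owned by a user or an organization.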