Load text data
This guide shows you how to load text datasets. To learn how to load any type of dataset, take a look at the general loading guide.
Text files are one of the most common file types for storing a dataset. By default, 🌍 Datasets samples a text file line by line to build the dataset.
To sample a text file by paragraph or even an entire document, use the sample_by parameter:
You can also use glob patterns to load specific files:
To load remote text files via HTTP, pass the URLs instead: