BAFileSystem

Interact with the Hub through the Filesystem API

In addition to the BaApi, the boincai_hub library provides BaFileSystem, a Pythonic, fsspec-compatible file interface to the BOINC AI Hub. BaFileSystem builds on top of BaApi and offers typical filesystem-style operations such as cp, mv, ls, du, glob, get_file, and put_file.

Usage

>>> from boincai_hub import BaFileSystem
>>> fs = BaFileSystem()

>>> # List all files in a directory
>>> fs.ls("datasets/my-username/my-dataset-repo/data", detail=False)
['datasets/my-username/my-dataset-repo/data/train.csv', 'datasets/my-username/my-dataset-repo/data/test.csv']

>>> # List all ".csv" files in a repo
>>> fs.glob("datasets/my-username/my-dataset-repo/**.csv")
['datasets/my-username/my-dataset-repo/data/train.csv', 'datasets/my-username/my-dataset-repo/data/test.csv']

>>> # Read a remote file 
>>> with fs.open("datasets/my-username/my-dataset-repo/data/train.csv", "r") as f:
...     train_data = f.readlines()

>>> # Read the content of a remote file as a string
>>> train_data = fs.read_text("datasets/my-username/my-dataset-repo/data/train.csv", revision="dev")

>>> # Write a remote file
>>> with fs.open("datasets/my-username/my-dataset-repo/data/validation.csv", "w") as f:
...     f.write("text,label\n")
...     f.write("Fantastic movie!,good\n")

The optional revision argument can be passed to run an operation against a specific revision, given as a branch name, a tag name, or a commit hash.

Unlike Python’s built-in open, fsspec’s open defaults to binary mode, "rb". This means you must explicitly set mode as "r" for reading and "w" for writing in text mode. Appending to a file (modes "a" and "ab") is not supported yet.
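Because append modes are not available, one possible workaround is to read the current content and rewrite the whole file. The following is a minimal sketch, not part of boincai_hub: `append_text` is a hypothetical helper, and it assumes the filesystem raises FileNotFoundError for a missing path, as fsspec implementations generally do. `DemoFS` is an in-memory stand-in included only so the sketch can run without network access; in practice you would pass a BaFileSystem instance instead.

```python
import io


def append_text(fs, path, text):
    """Emulate append: read the current content, then rewrite the whole file."""
    try:
        with fs.open(path, "r") as f:
            existing = f.read()
    except FileNotFoundError:
        existing = ""  # nothing to preserve; the write below creates the file
    with fs.open(path, "w") as f:
        f.write(existing + text)


class DemoFS:
    """Hypothetical in-memory stand-in so the sketch runs offline."""

    def __init__(self):
        self.files = {}

    def open(self, path, mode="rb"):
        if mode == "r":
            if path not in self.files:
                raise FileNotFoundError(path)
            return io.StringIO(self.files[path])
        if mode == "w":
            return _DemoWriter(self.files, path)
        raise ValueError(f"unsupported mode in this demo: {mode!r}")


class _DemoWriter(io.StringIO):
    """Stores the written text in the backing dict when the file is closed."""

    def __init__(self, files, path):
        super().__init__()
        self._files, self._path = files, path

    def close(self):
        if not self.closed:
            self._files[self._path] = self.getvalue()
        super().close()


demo = DemoFS()
append_text(demo, "data/validation.csv", "text,label\n")
append_text(demo, "data/validation.csv", "Fantastic movie!,good\n")
print(demo.files["data/validation.csv"])
```

Note that this approach transfers the full file on every call and is not safe against concurrent writers, so it is only reasonable for small files.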

Integrations

BaFileSystem can be used with any library that integrates fsspec, provided the URL follows the scheme:

The repo_type_prefix is datasets/ for datasets and spaces/ for Spaces; models do not need a prefix in the URL.
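The prefix rule can be illustrated with a small helper that builds the path portion of such a location, in the same form as the paths passed to fs.ls in the Usage section. This is only a sketch: `hub_path`, `REPO_TYPE_PREFIXES`, and the repo_type values "dataset", "space", and "model" are hypothetical names chosen here, not part of boincai_hub, and the helper does not add a URL protocol.

```python
# Prefixes taken from the rule above; models use no prefix.
REPO_TYPE_PREFIXES = {"dataset": "datasets/", "space": "spaces/", "model": ""}


def hub_path(repo_id, path_in_repo="", repo_type="model"):
    """Build a path such as 'datasets/my-username/my-dataset-repo/data/train.csv'."""
    try:
        prefix = REPO_TYPE_PREFIXES[repo_type]
    except KeyError:
        raise ValueError(f"unknown repo_type: {repo_type!r}") from None
    path = prefix + repo_id.strip("/")
    if path_in_repo:
        path += "/" + path_in_repo.lstrip("/")
    return path


print(hub_path("my-username/my-dataset-repo", "data/train.csv", repo_type="dataset"))
print(hub_path("my-username/my-model-repo", "config.json"))
```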

Some interesting integrations where BaFileSystem simplifies interacting with the Hub are listed below:

The same workflow can also be used for Dask and Polars DataFrames.

Authentication

In many cases, you must be logged in with a BOINC AI account to interact with the Hub. Refer to the Login section of the documentation to learn more about authentication methods on the Hub.

It is also possible to log in programmatically by passing your token as an argument to BaFileSystem:

If you log in this way, be careful not to accidentally leak the token when sharing your source code!
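One way to reduce the risk of leaking a token is to keep it out of the source file and read it from the environment at runtime. The sketch below makes several assumptions: the environment variable name BOINCAI_TOKEN and the helpers `load_token` and `make_filesystem` are hypothetical, and the constructor is assumed to accept the token as a `token` keyword argument.

```python
import os


def load_token(env_var="BOINCAI_TOKEN"):
    """Return the token stored in the given environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"set the {env_var} environment variable first")
    return token


def make_filesystem(env_var="BOINCAI_TOKEN"):
    """Create a BaFileSystem authenticated with the token from the environment."""
    from boincai_hub import BaFileSystem  # imported lazily; requires boincai_hub

    # Assumption: the constructor accepts the token as a `token` keyword argument.
    return BaFileSystem(token=load_token(env_var))
```

With this layout the token never appears in the code you share; only the name of the environment variable does.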
