Train with a script
Along with the Transformers notebooks, there are also example scripts demonstrating how to train a model for a task with PyTorch, TensorFlow, or JAX/Flax.
You will also find scripts we've used in our research projects and legacy examples, which are mostly community contributed. These scripts are not actively maintained and require a specific version of Transformers that will most likely be incompatible with the latest version of the library.
The example scripts are not expected to work out-of-the-box on every problem, and you may need to adapt the script to the problem you're trying to solve. To help you with this, most of the scripts fully expose how data is preprocessed, allowing you to edit it as necessary for your use case.
For any feature you'd like to implement in an example script, please discuss it on the forum or in an issue before submitting a Pull Request. While we welcome bug fixes, it is unlikely we will merge a Pull Request that adds more functionality at the cost of readability.
This guide will show you how to run an example summarization training script in PyTorch and TensorFlow. All examples are expected to work with both frameworks unless otherwise specified.
To successfully run the latest version of the example scripts, you have to install Transformers from source in a new virtual environment:
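For example (the repository URL assumes the upstream Transformers project; substitute your own fork if needed):

```bash
# clone the library and install it from source in a fresh virtual environment
git clone https://github.com/huggingface/transformers
cd transformers
pip install .
```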
For older versions of the example scripts, switch your current clone of Transformers to the matching version tag, for example v3.5.1:
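For example:

```bash
# check out the release tag that matches the library version you need
git checkout tags/v3.5.1
```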
After you've set up the correct library version, navigate to the example folder of your choice and install the example-specific requirements:
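For example, from inside the chosen example folder:

```bash
# each example folder ships its own requirements file
pip install -r requirements.txt
```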
PyTorch

The example script downloads and preprocesses a dataset from the Datasets library. Then the script fine-tunes the dataset with the Trainer on an architecture that supports summarization. The following example shows how to fine-tune T5-small on the CNN/DailyMail dataset. The T5 model requires an additional source_prefix argument due to how it was trained. This prompt lets T5 know this is a summarization task.
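A sketch of the command, mirroring the upstream example; the t5-small checkpoint, the cnn_dailymail dataset, and the output path are illustrative:

```bash
python examples/pytorch/summarization/run_summarization.py \
    --model_name_or_path t5-small \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --source_prefix "summarize: " \
    --output_dir /tmp/tst-summarization \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --do_train \
    --do_eval \
    --predict_with_generate \
    --overwrite_output_dir
```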
TensorFlow

The example script downloads and preprocesses a dataset from the Datasets library. Then the script fine-tunes the dataset using Keras on an architecture that supports summarization. The following example shows how to fine-tune T5-small on the CNN/DailyMail dataset. The T5 model requires an additional source_prefix argument due to how it was trained. This prompt lets T5 know this is a summarization task.
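The TensorFlow counterpart, with the same illustrative model and dataset:

```bash
python examples/tensorflow/summarization/run_summarization.py \
    --model_name_or_path t5-small \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --output_dir /tmp/tst-summarization \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 16 \
    --num_train_epochs 3 \
    --do_train \
    --do_eval
```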
The Trainer supports distributed training and mixed precision, which means you can also use them in a script. To enable both of these features:

- Add the fp16 argument to enable mixed precision.
- Set the number of GPUs to use with the nproc_per_node argument.
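A sketch of a multi-GPU launch that reuses the arguments from the PyTorch example above; 8 GPUs are assumed:

```bash
# run from the examples folder; torchrun spawns one process per GPU
torchrun \
    --nproc_per_node 8 pytorch/summarization/run_summarization.py \
    --fp16 \
    --model_name_or_path t5-small \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --source_prefix "summarize: " \
    --output_dir /tmp/tst-summarization \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --do_train \
    --do_eval \
    --predict_with_generate \
    --overwrite_output_dir
```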
TensorFlow scripts utilize a MirroredStrategy for distributed training, and you don't need to add any additional arguments to the training script. The TensorFlow script will use multiple GPUs by default if they are available.

PyTorch

Tensor Processing Units (TPUs) are specifically designed to accelerate performance. PyTorch supports TPUs with the XLA deep learning compiler. To use a TPU, launch the xla_spawn.py script and use the num_cores argument to set the number of TPU cores you want to use.
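A sketch, run from the examples/pytorch folder; 8 TPU cores are assumed:

```bash
python xla_spawn.py --num_cores 8 \
    summarization/run_summarization.py \
    --model_name_or_path t5-small \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --source_prefix "summarize: " \
    --output_dir /tmp/tst-summarization \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --do_train \
    --do_eval \
    --predict_with_generate \
    --overwrite_output_dir
```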
TensorFlow

Tensor Processing Units (TPUs) are specifically designed to accelerate performance. TensorFlow scripts utilize a TPUStrategy for training on TPUs. To use a TPU, pass the name of the TPU resource to the tpu argument.
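A sketch, run from the examples/tensorflow/summarization folder; name_of_tpu_resource is a placeholder for your TPU's name:

```bash
python run_summarization.py \
    --tpu name_of_tpu_resource \
    --model_name_or_path t5-small \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --output_dir /tmp/tst-summarization \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 16 \
    --num_train_epochs 3 \
    --do_train \
    --do_eval
```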
Accelerate is a PyTorch-only library that offers a unified method for training a model on several types of setups (CPU-only, multiple GPUs, TPUs) while maintaining complete visibility into the PyTorch training loop. Make sure you have Accelerate installed if you don't already have it. Note: as Accelerate is rapidly developing, the git version of Accelerate must be installed to run the scripts:
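For example (the URL assumes the upstream Accelerate repository):

```bash
pip install git+https://github.com/huggingface/accelerate
```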
Instead of the run_summarization.py script, you need to use the run_summarization_no_trainer.py script. Accelerate-supported scripts will have a task_no_trainer.py file in the folder. Begin by running the following command to create and save a configuration file:
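For example:

```bash
# answer the interactive prompts to describe your training setup
accelerate config
```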
Test your setup to make sure it is configured correctly:
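For example:

```bash
# runs a short sanity check against the saved configuration
accelerate test
```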
Now you are ready to launch the training:
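A sketch of the launch, reusing the illustrative model and dataset from earlier:

```bash
accelerate launch run_summarization_no_trainer.py \
    --model_name_or_path t5-small \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --source_prefix "summarize: " \
    --output_dir ~/tmp/tst-summarization
```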
The summarization script supports custom datasets as long as they are CSV or JSON Lines files. When you use your own dataset, you need to specify several additional arguments:
- train_file and validation_file specify the paths to your training and validation files.
- text_column is the input text to summarize.
- summary_column is the target text to output.
A summarization script using a custom dataset would look like this:
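A sketch with placeholder file paths and column names:

```bash
python examples/pytorch/summarization/run_summarization.py \
    --model_name_or_path t5-small \
    --do_train \
    --do_eval \
    --train_file path_to_csv_or_jsonlines_file \
    --validation_file path_to_csv_or_jsonlines_file \
    --text_column text_column_name \
    --summary_column summary_column_name \
    --source_prefix "summarize: " \
    --output_dir /tmp/tst-summarization \
    --overwrite_output_dir \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --predict_with_generate
```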
It is often a good idea to run your script on a smaller number of dataset examples to ensure everything works as expected before committing to an entire dataset which may take hours to complete. Use the following arguments to truncate the dataset to a maximum number of samples:
- max_train_samples
- max_eval_samples
- max_predict_samples
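For example, truncating each split to 50 samples:

```bash
python examples/pytorch/summarization/run_summarization.py \
    --model_name_or_path t5-small \
    --max_train_samples 50 \
    --max_eval_samples 50 \
    --max_predict_samples 50 \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --source_prefix "summarize: " \
    --output_dir /tmp/tst-summarization \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --do_train \
    --do_eval \
    --predict_with_generate \
    --overwrite_output_dir
```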
Not all example scripts support the max_predict_samples argument. If you aren't sure whether your script supports this argument, add the -h argument to check:
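For example:

```bash
# prints all supported arguments for the script
python examples/pytorch/summarization/run_summarization.py -h
```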
Another helpful option to enable is resuming training from a previous checkpoint. This will ensure you can pick up where you left off without starting over if your training gets interrupted. There are two methods to resume training from a checkpoint.
The first method uses the output_dir previous_output_dir argument to resume training from the latest checkpoint stored in output_dir. In this case, you should remove overwrite_output_dir:
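A sketch; output_dir points at the previous run and overwrite_output_dir is omitted:

```bash
python examples/pytorch/summarization/run_summarization.py \
    --model_name_or_path t5-small \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --source_prefix "summarize: " \
    --output_dir previous_output_dir \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --do_train \
    --do_eval \
    --predict_with_generate
```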
The second method uses the resume_from_checkpoint path_to_specific_checkpoint argument to resume training from a specific checkpoint folder:
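A sketch; path_to_specific_checkpoint is a placeholder for a checkpoint folder:

```bash
python examples/pytorch/summarization/run_summarization.py \
    --model_name_or_path t5-small \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --source_prefix "summarize: " \
    --output_dir /tmp/tst-summarization \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --do_train \
    --do_eval \
    --overwrite_output_dir \
    --resume_from_checkpoint path_to_specific_checkpoint \
    --predict_with_generate
```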
All scripts can upload your final model to the Model Hub. Make sure you are logged into BOINC AI before you begin:
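For example (this assumes the upstream huggingface-cli tool; use your hub's equivalent if it differs):

```bash
huggingface-cli login
```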
Then add the push_to_hub argument to the script. This argument will create a repository with your BOINC AI username and the folder name specified in output_dir.
To give your repository a specific name, use the push_to_hub_model_id argument. The repository will be automatically listed under your namespace.
The following example shows how to upload a model with a specific repository name:
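A sketch; the repository name finetuned-t5-cnn_dailymail is illustrative:

```bash
python examples/pytorch/summarization/run_summarization.py \
    --model_name_or_path t5-small \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --source_prefix "summarize: " \
    --push_to_hub \
    --push_to_hub_model_id finetuned-t5-cnn_dailymail \
    --output_dir /tmp/tst-summarization \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --do_train \
    --predict_with_generate
```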