TRL

🌍API

  • Model Classes
  • Trainer Classes
  • Reward Model Training
  • Supervised Fine-Tuning
  • PPO Trainer
  • Best of N Sampling
  • DPO Trainer
  • Denoising Diffusion Policy Optimization
  • Text Environments