Transformers

AUDIO MODELS

  • Audio Spectrogram Transformer
  • Bark
  • CLAP
  • EnCodec
  • Hubert
  • MCTCT
  • MMS
  • MusicGen
  • Pop2Piano
  • SEW
  • SEW-D
  • Speech2Text
  • Speech2Text2
  • SpeechT5
  • UniSpeech
  • UniSpeech-SAT
  • VITS
  • Wav2Vec2
  • Wav2Vec2-Conformer
  • Wav2Vec2Phoneme
  • WavLM
  • Whisper
  • XLS-R
  • XLSR-Wav2Vec2