Transformers

AUDIO MODELS

  • Audio Spectrogram Transformer
  • Bark
  • CLAP
  • EnCodec
  • Hubert
  • MCTCT
  • MMS
  • MusicGen
  • Pop2Piano
  • SEW
  • SEW-D
  • Speech2Text
  • Speech2Text2
  • SpeechT5
  • UniSpeech
  • UniSpeech-SAT
  • VITS
  • Wav2Vec2
  • Wav2Vec2-Conformer
  • Wav2Vec2Phoneme
  • WavLM
  • Whisper
  • XLS-R
  • XLSR-Wav2Vec2