GR0: SELF-SUPERVISED GLOBAL REPRESENTATION LEARNING FOR ZERO-SHOT VOICE CONVERSION (ICASSP 2024)

    Project Page

    Index Page for GR0 Listening Samples

    On this page, you will find a set of audio samples generated by our method and the baselines, all of which were evaluated in the listening test (MOS).

    We present a single example per experiment here. For additional examples and detailed explanations of the labels, please click on "more samples" to access the full list of audio samples.

    Voice Conversion

    DAPS (unseen dataset, more samples)

    ID Source Target Ours-TFdecoder Ours-TFdecoder+ Resemblyzer YourTTS AutovcAIC AutovcF0

    female=>male

    VCTK (unseen dataset, more samples)

    ID Source Target Ours-TFdecoder Ours-TFdecoder+ Resemblyzer YourTTS AutovcAIC AutovcF0

    female=>male

    EN2PT (unseen dataset and language, more samples)

    ID Source Target Ours-TFdecoder Ours-TFdecoder+ Resemblyzer YourTTS AutovcAIC AutovcF0

    female=>male

    PT2EN (unseen dataset and language, more samples)

    ID Source Target Ours-TFdecoder Ours-TFdecoder+ Resemblyzer YourTTS AutovcAIC AutovcF0

    female=>male