GR0: SELF-SUPERVISED GLOBAL REPRESENTATION LEARNING FOR ZERO-SHOT VOICE CONVERSION (ICASSP 2024)

Project Page

Index Page for GR0 Listening Samples

On this page, you will find a set of audio samples generated by our method and the baselines, which were benchmarked in the listening test (MOS).

We present a single example per experiment here. For additional examples and detailed explanations of the labels, click "more samples" to access the full list of audio samples.

Voice Conversion

DAPS (unseen dataset, more samples)

ID	Source	Target	Ours-TFdecoder	Ours-TFdecoder+	Resemblyzer	YourTTS	AutovcAIC	AutovcF0
female=>male

VCTK (unseen dataset, more samples)

ID	Source	Target	Ours-TFdecoder	Ours-TFdecoder+	Resemblyzer	YourTTS	AutovcAIC	AutovcF0
female=>male

EN2PT (unseen dataset and language, more samples)

ID	Source	Target	Ours-TFdecoder	Ours-TFdecoder+	Resemblyzer	YourTTS	AutovcAIC	AutovcF0
female=>male

PT2EN (unseen dataset and language, more samples)

ID	Source	Target	Ours-TFdecoder	Ours-TFdecoder+	Resemblyzer	YourTTS	AutovcAIC	AutovcF0
female=>male