Perceptually-motivated
Environment-specific
Speech Enhancement

Jiaqi Su, Adam Finkelstein, Zeyu Jin

[Paper] [Github]

[Figure: network overview]

We introduce a data-driven method to enhance speech recordings made in a specific environment. The method handles denoising, de-reverberation, and equalization matching due to recording nonlinearities in a unified framework. It relies on a new perceptual loss function that combines an adversarial loss with spectrogram features. We show that the method offers an improvement over state-of-the-art baseline methods in both subjective and objective evaluations.
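To illustrate the idea of combining an adversarial loss with a spectrogram feature distance, here is a minimal numpy sketch. The specific feature choice (L1 on log-magnitude spectrograms), the LSGAN-style adversarial term, and the weighting `alpha` are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def spectrogram(x, n_fft=512, hop=128):
    # Magnitude spectrogram via framed, windowed FFT.
    frames = [x[i:i + n_fft] for i in range(0, len(x) - n_fft + 1, hop)]
    window = np.hanning(n_fft)
    return np.abs(np.fft.rfft(np.stack(frames) * window, axis=1))

def perceptual_loss(enhanced, clean, disc_score_fake, alpha=1.0):
    # Spectrogram feature term: L1 distance between log-magnitude
    # spectrograms of the enhanced and clean signals (assumed feature).
    eps = 1e-8
    S_e = np.log(spectrogram(enhanced) + eps)
    S_c = np.log(spectrogram(clean) + eps)
    spec_loss = np.mean(np.abs(S_e - S_c))
    # Adversarial term (LSGAN-style, assumed): the generator is
    # rewarded when the discriminator scores its output as real (1).
    adv_loss = np.mean((np.asarray(disc_score_fake) - 1.0) ** 2)
    return spec_loss + alpha * adv_loss
```

In a real training loop, `disc_score_fake` would come from a discriminator network evaluated on the enhanced audio; here it is just a placeholder input.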

SAMPLES

This page contains the sentences used in the MOS test described in Section 3. Click the buttons in the grid below to play the audio. Colors (ranging from red=bad to green=good) encode the scores, which are also shown as text in the button labels.