Near-End Speech Enhancement

A speech pre-processing algorithm is presented that improves speech intelligibility in noise for the near-end listener. The algorithm optimally redistributes the speech energy over time and frequency according to a perceptual distortion measure, which is based on a spectro-temporal auditory model.

Since this auditory model takes into account short-time information, transients will receive more amplification than stationary vowels, which has been shown to be beneficial for intelligibility of speech in noise.
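The idea of redistributing speech energy over time and frequency without raising the overall level can be illustrated with a minimal Python sketch. This is not the attached Matlab implementation and does not use the perceptual distortion measure from the publications. The function name `redistribute_energy`, the `exponent` parameter, and the noise-to-speech weighting rule are all assumptions chosen for illustration. The sketch only shows the constraint that the total speech energy stays the same while individual time-frequency cells receive different gains.

```python
import math


def redistribute_energy(speech_energy, noise_energy, exponent=0.5):
    """Return per-cell amplitude gains that keep total speech energy unchanged.

    speech_energy, noise_energy: 2-D lists indexed [frame][band] holding
    non-negative energies. `exponent` is a hypothetical knob that controls
    how strongly energy is moved toward cells where noise dominates.
    """
    eps = 1e-12  # avoids division by zero in silent cells

    # Illustrative heuristic (an assumption, not the published measure):
    # weight each cell by its noise-to-speech energy ratio.
    weights = [
        [((n + eps) / (s + eps)) ** exponent for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_energy, noise_energy)
    ]

    total = sum(s for row in speech_energy for s in row)
    weighted = sum(
        w * s
        for w_row, s_row in zip(weights, speech_energy)
        for w, s in zip(w_row, s_row)
    )
    if total == 0.0 or weighted == 0.0:
        # Nothing to redistribute: leave the signal untouched.
        return [[1.0 for _ in row] for row in speech_energy]

    # Rescale so that the sum of gain^2 * speech_energy equals the original total.
    scale = total / weighted
    return [[math.sqrt(w * scale) for w in row] for row in weights]


if __name__ == "__main__":
    speech = [[4.0, 1.0], [1.0, 4.0]]
    noise = [[1.0, 1.0], [1.0, 1.0]]
    gains = redistribute_energy(speech, noise)
    processed = sum(
        g * g * s
        for g_row, s_row in zip(gains, speech)
        for g, s in zip(g_row, s_row)
    )
    print("gains:", gains)
    print("energy before:", 10.0, "energy after:", processed)
```

In this toy rule, cells with weak speech relative to the noise receive larger gains, and the final rescaling enforces the equal-energy constraint. The published algorithm instead derives the gains from the spectro-temporal auditory model, so the actual gain pattern will differ.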

The proposed method is compared to unprocessed speech and two reference methods using an intelligibility listening test. Results show that the proposed method leads to significant intelligibility gains while still preserving quality.

The attached file contains Matlab code that implements the algorithm.

Related publications

  1. Speech Energy Redistribution for Intelligibility Improvement in Noise Based on a Perceptual Distortion Measure
    C. H. Taal; R. C. Hendriks; R. Heusdens
    Computer Speech and Language,
    Volume 2013, 2013.

  2. A Speech Preprocessing Strategy For Intelligibility Improvement In Noise Based On A Perceptual Distortion Measure
    Cees H. Taal; Richard C. Hendriks; Richard Heusdens
    In Proc. IEEE Int. Conf. Acoustics, Speech, Signal Proc. (ICASSP),
    pp. 4061-4064, May 2012.

Repository data

Size: 475 kB
Modified: 18 August 2017
Type: software
Authors: C.H. Taal, Richard Hendriks, Richard Heusdens
Date: May 2012
Contact: Richard Hendriks