Optimally Modeling with Applications for Multichannel Speech Enhancement Based on Environmental Perception - Institute of Automation

Apr 18, 2016

Abstract: The main objective of speech enhancement is to suppress the noise component of noisy speech while keeping the speech component undistorted. It is widely applicable, for example in speech recognition and telecommunication systems. In recent years, multichannel speech enhancement, which uses two or more microphones, has attracted much attention. By exploiting spatial information, multichannel methods can in theory achieve better performance than single-channel ones. However, some problems remain. Beamforming and the Generalized Sidelobe Canceller require prior knowledge of the speaker's direction of arrival (DOA), but in practice the DOA is usually unknown, and estimating it is itself a difficult task. The multichannel Wiener filter avoids the DOA estimation problem, but in theory it inevitably introduces speech distortion, and its performance relies on the noise estimate. In this research, based on intelligent perception of the acoustic environment, we study the optimal modeling of the multichannel speech enhancement problem and its applications. On the one hand, the approach requires no prior knowledge of the acoustic environment; on the other hand, it reduces the speech distortion caused by noise reduction. The main contents include: noise-robust estimation of acoustic environmental knowledge, confidence measures for that knowledge, optimal modeling of multichannel speech enhancement, integration of multiple speech enhancement results based on time-frequency speech properties, and an experimental verification platform for speech enhancement and its applications. This research helps improve the practicality of speech enhancement techniques and has high social and economic value.
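
To illustrate why beamforming depends on knowing the DOA, the following is a minimal sketch of a textbook delay-and-sum beamformer. It is not the method developed in this project. The uniform linear array, the far-field plane-wave model, the speed of sound, rounding delays to whole samples, and the function names `steering_delays` and `delay_and_sum` are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def steering_delays(num_mics, spacing_m, doa_deg, fs_hz):
    """Per-microphone arrival delays in whole samples.

    Assumes a far-field plane wave hitting a uniform linear array,
    with the DOA measured from broadside (0 degrees = perpendicular
    to the array axis). Delays are rounded to the nearest sample.
    """
    tau = spacing_m * math.sin(math.radians(doa_deg)) / SPEED_OF_SOUND
    return [round(m * tau * fs_hz) for m in range(num_mics)]


def delay_and_sum(channels, delays):
    """Undo each channel's arrival delay, then average the channels.

    Samples that would be read from outside a channel are treated as zero.
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for x, d in zip(channels, delays):
            idx = t + d  # read ahead by the delay to re-align this channel
            if 0 <= idx < n:
                acc += x[idx]
        out.append(acc / len(channels))
    return out


# Hypothetical setup: 4 microphones, 3.43 cm apart, sampled at 10 kHz.
# A source at 90 degrees (endfire) then arrives one sample later at each
# successive microphone.
num_mics, spacing_m, fs_hz = 4, 0.0343, 10000
channels = [[1.0 if t == 10 + m else 0.0 for t in range(32)]
            for m in range(num_mics)]

# Steering with the true DOA re-aligns the channels.
aligned = delay_and_sum(channels, steering_delays(num_mics, spacing_m, 90.0, fs_hz))
# Steering with a wrong DOA (0 degrees) leaves them misaligned.
smeared = delay_and_sum(channels, steering_delays(num_mics, spacing_m, 0.0, fs_hz))
```

In this toy setup the impulse is recovered at full amplitude only when the steering delays match the true DOA. With the wrong DOA the channels stay misaligned and the impulse is spread over several samples at reduced amplitude, which is the practical difficulty the abstract points to.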

Keywords: robust speech recognition; speech enhancement; reverberation suppression; sound localization; sound separation

Contact:

LIU Wenju

E-mail: lwj@nlpr.ia.ac.cn

National Laboratory of Pattern Recognition
