Audio and video based speech separation for multiple moving sources within a room environment
Overview
Human beings have developed a remarkable ability to communicate in noisy environments, such as at a cocktail party. This skill depends on combining the aural and visual senses with sophisticated processing in the brain. Replicating this ability in a machine is very challenging, particularly when the speakers are moving. This project addresses major challenges in audio-visual speaker localization, tracking and separation.
Funder
Team
Principal investigators
Wenwu Wang
Josef Kittler