This paper examines whether visual speech information can be effective within audio-masking-based speaker separation to improve the quality and intelligibility of the target speech. Two visual-only methods of generating an audio mask for speaker separation are first developed. These use a deep neural network to map the visual speech features to an audio feature space, from which both visually derived binary masks and visually derived ratio masks are estimated before being applied to the speech mixture. Second, an audio ratio masking method forms a baseline approach for speaker separation, which is then extended to exploit visual speech information to form audio-visual ratio masks. Speech quality and intelligibility tests are carried out on the visual-only, audio-only, and audio-visual masking methods of speaker separation at mixing levels from -10 dB to +10 dB. These reveal substantial improvements in the target speech when applying the visual-only and audio-only masks, with the highest performance occurring when audio and visual information are combined to create the audio-visual masks.
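To make the distinction between binary and ratio masks concrete, the following is a minimal sketch rather than the method used in the paper. It assumes oracle access to the target and interferer magnitude spectrograms, whereas the paper estimates its masks from visual (and audio) features through a deep neural network. The function names, the 0 dB threshold, and the power-based ratio definition are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log of zero


def binary_mask(target_mag, interferer_mag, threshold_db=0.0):
    """Hypothetical helper: 1 where the local target-to-interferer
    ratio exceeds threshold_db, otherwise 0."""
    local_ratio_db = 20.0 * np.log10((target_mag + EPS) / (interferer_mag + EPS))
    return (local_ratio_db > threshold_db).astype(float)


def ratio_mask(target_mag, interferer_mag):
    """Hypothetical helper: soft mask in [0, 1], defined here (as an
    assumption) as target power over total power."""
    target_pow = target_mag ** 2
    interferer_pow = interferer_mag ** 2
    return target_pow / (target_pow + interferer_pow + EPS)


def apply_mask(mask, mixture_mag):
    """Element-wise masking of the mixture magnitude spectrogram."""
    return mask * mixture_mag


# Toy magnitudes: 2 frequency bins x 3 time frames (made-up values).
target = np.array([[4.0, 1.0, 0.0],
                   [2.0, 2.0, 3.0]])
interferer = np.array([[1.0, 4.0, 2.0],
                       [2.0, 1.0, 0.0]])
# Crude assumption: mixture magnitude is the sum of source magnitudes.
mixture = target + interferer

bm = binary_mask(target, interferer)
rm = ratio_mask(target, interferer)
separated_binary = apply_mask(bm, mixture)
separated_ratio = apply_mask(rm, mixture)
```

The binary mask keeps or discards each time-frequency unit outright, while the ratio mask scales each unit by the estimated share of target energy, which is the kind of soft weighting the audio-only and audio-visual methods in the abstract rely on.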