Visual Lip Activity Detection and Speaker Detection Using Mouth Region Intensities

Cited by: 29
Authors
Siatras, Spyridon [1]
Nikolaidis, Nikos [1]
Krinidis, Michail [1]
Pitas, Ioannis [1]
Affiliations
[1] Aristotle Univ Thessaloniki, Dept Informat, Thessaloniki 54124, Greece
Keywords
Speaker detection; visual speech detection; SPEECH; FEATURES
DOI
10.1109/TCSVT.2008.2009262
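Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract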
In this letter, we introduce a novel approach to lip activity detection and speaker detection that uses solely visual information. The main idea is to apply signal detection algorithms to a simple, easily extracted feature of the mouth region. We argue that the mouth region of a speaking person exhibits an increased average value and standard deviation of the number of low-intensity pixels, and that these can be used as visual cues for detecting visual speech. We then derive a statistical algorithm that exploits this fact to efficiently characterize visual speech and silence in video sequences. Furthermore, we employ the lip activity detection method to determine the active speaker(s) in a multi-person environment.
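The cue described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' statistical algorithm: the sliding window, the fixed thresholds, the "either statistic exceeds its threshold" rule, and the names count_dark_pixels, detect_lip_activity, and active_speakers are all assumptions made for illustration. Only the feature itself (the number of low-intensity pixels in the mouth region, and its mean and standard deviation over time) comes from the abstract.

```python
from statistics import mean, pstdev


def count_dark_pixels(mouth_region, intensity_threshold=60):
    """Count pixels below the threshold in a grayscale mouth region.

    mouth_region is a list of rows of 0-255 intensities; the threshold
    value is an illustrative assumption, not taken from the paper.
    """
    return sum(1 for row in mouth_region for px in row if px < intensity_threshold)


def detect_lip_activity(dark_counts, window=5, mean_threshold=10.0, std_threshold=3.0):
    """Label each frame as speech (True) or silence (False).

    A frame is labelled speech when, over the sliding window ending at
    that frame, the mean or the standard deviation of the dark-pixel
    count exceeds its (hypothetical) threshold.
    """
    labels = []
    for i in range(len(dark_counts)):
        recent = dark_counts[max(0, i - window + 1): i + 1]
        labels.append(mean(recent) > mean_threshold or pstdev(recent) > std_threshold)
    return labels


def active_speakers(counts_by_person, **kwargs):
    """Return the people whose most recent frame is labelled speech."""
    return [
        name
        for name, counts in counts_by_person.items()
        if counts and detect_lip_activity(counts, **kwargs)[-1]
    ]


# Synthetic 4x8 mouth regions: a closed mouth is all bright pixels,
# an open mouth has two dark rows (16 dark pixels).
closed = [[200] * 8 for _ in range(4)]
open_mouth = [[200] * 8, [30] * 8, [30] * 8, [200] * 8]

frames = [closed] * 5 + [open_mouth, closed, open_mouth, closed, open_mouth]
counts = [count_dark_pixels(frame) for frame in frames]
labels = detect_lip_activity(counts)

speakers = active_speakers({"person_a": counts, "person_b": [0] * 10})
```

In this synthetic example the first five frames are labelled silence and the last five speech, because the alternating open and closed mouth raises the standard deviation of the dark-pixel count within the window. Real footage would need the thresholds to be estimated from data rather than fixed by hand.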
Pages: 133-137
Number of pages: 5
Related papers
50 in total
  • [31] Replay spoof detection for speaker verification system using magnitude-phase-instantaneous frequency and energy features
    Bharath, K. P.
    Kumar, M. Rajesh
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (27) : 39343 - 39366
  • [32] Fuzzy Neural Network with Audio-Visual Data for Voice Activity Detection in Noisy Environments
    Wu, Gin-Der
    Zhu, Zhen-Wei
    2018 INTERNATIONAL CONFERENCE ON INTELLIGENT AUTONOMOUS SYSTEMS (ICOIAS), 2018, : 141 - 145
  • [33] A Framework for Speech Activity Detection Using Adaptive Auditory Receptive Fields
    Carlin, Michael A.
    Elhilali, Mounya
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (12) : 2422 - 2433
  • [34] ROBUST DETECTION OF GLOTTAL ACTIVITY USING UNWRAPPED PHASE ELECTROGLOTTOGRAPHIC SIGNAL
    Mandal, Tanumay
    Rao, K. Sreenivasa
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5584 - 5588
  • [35] Consonant-vowel unit recognition using dominant aperiodic and transition region detection
    Sarma, Biswajit D.
    Prasanna, S. R. Mahadeva
    Sarmah, Priyankoo
    SPEECH COMMUNICATION, 2017, 92 : 77 - 89
  • [36] Multi-class Object Detection with Hough Forests Using Local Histograms of Visual Words
    Muehling, Markus
    Ewerth, Ralph
    Shi, Bing
    Freisleben, Bernd
    COMPUTER ANALYSIS OF IMAGES AND PATTERNS: 14TH INTERNATIONAL CONFERENCE, CAIP 2011, PT I, 2011, 6854 : 386 - 393
  • [37] Automatic Detection of Pornographic and Gambling Websites Based on Visual and Textual Content Using a Decision Mechanism
    Chen, Yang
    Zheng, Rongfeng
    Zhou, Anmin
    Liao, Shan
    Liu, Liang
    SENSORS, 2020, 20 (14) : 1 - 21
  • [38] Enhancing detection of steady-state visual evoked potentials using channel ensemble method
    Yan, Wenqiang
    Du, Chenghang
    Luo, Dan
    Wu, YongCheng
    Duan, Nan
    Zheng, Xiaowei
    Xu, Guanghua
    JOURNAL OF NEURAL ENGINEERING, 2021, 18 (04)
  • [39] V-SIN: Visual Saliency detection in noisy Images using convolutional neural Network
    Singh, Maheep
    Govil, M. C.
    Pilli, E. S.
    2018 CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY (CICT'18), 2018,
  • [40] Visual seizure annotation and automated seizure detection using behind-the-ear electroencephalographic channels
    Vandecasteele, Kaat
    De Cooman, Thomas
    Dan, Jonathan
    Cleeren, Evy
    Van Huffel, Sabine
    Hunyadi, Borbala
    Van Paesschen, Wim
    EPILEPSIA, 2020, 61 (04) : 766 - 775