Voice Orientation Recognition: New Paradigm of Speech-Based Human-Computer Interaction

Cited by: 2
Authors
Bu, Yiyu [1]
Guo, Peng [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan, Peoples R China
Keywords
Voice orientation recognition; human-computer interaction; speech interaction; mouth radiation pattern; attention mechanism
DOI
10.1080/10447318.2023.2233128
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As one of the most preferred forms of Human-Computer Interaction (HCI) today, speech-based HCI enables people to communicate verbally with machines, leveraging technologies such as speech recognition and speech synthesis. The current paradigm of speech-based HCI focuses only on the content of speech and fails to capture the deeper pointing information in voice interaction. In particular, in scenarios with multiple smart voice devices nearby, if a person intends to interact with one particular device, the lack of extra pointing information (like the role played by the direction of eye gaze) causes unintended responses from the other devices, resulting in a poor interaction experience. Hence, an interesting problem arises: can devices become aware of the orientation of the human voice using only the acoustic speech signals? Little research has studied this topic, apart from a very few preliminary works that leave much room for improvement. The main challenge of this study lies in capturing the concealed orientation information embedded within the speech signal while simultaneously maintaining the scheme's practicality and high precision. In this paper, we propose Oriennet for identifying the orientation of the human voice. With a series of features intentionally designed in view of the indoor voice propagation model and the mouth radiation pattern, together with an attention mechanism, Oriennet achieves 95% accuracy in judging whether people are facing the device or not. Even for the fine-grained task of classifying a person's specific orientation among 8 different directions, our work achieves an accuracy of 74%, far outperforming existing works. We have validated the robustness of Oriennet under various conditions (noisy environments; different people, rooms, languages, and locations; fewer microphones), demonstrating its promising applicability in real-life scenarios.
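The abstract does not specify Oriennet's features, so the following is only an illustrative sketch of one cue that a mouth radiation pattern can provide, not the authors' method. It assumes the commonly cited property that speech radiates more directionally at higher frequencies, so a microphone facing the talker tends to receive relatively more high-frequency energy. The function name `band_energy_ratio`, the 2 kHz split, and the synthetic two-tone signals are all hypothetical choices made for illustration.

```python
import numpy as np

def band_energy_ratio(signal, sample_rate, split_hz=2000.0):
    """Return spectral energy at or above split_hz divided by energy below it.

    Hypothetical orientation cue: NOT taken from the Oriennet paper.
    """
    spectrum = np.abs(np.fft.rfft(signal)) ** 2           # power per frequency bin
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    high = spectrum[freqs >= split_hz].sum()
    low = spectrum[freqs < split_hz].sum()
    return high / (low + 1e-12)                           # guard against division by zero

# Synthetic stand-ins for recorded speech (assumption: one second at 16 kHz).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
low_tone = np.sin(2 * np.pi * 300 * t)
high_tone = np.sin(2 * np.pi * 4000 * t)

# Assumption: facing away attenuates the high-frequency component more strongly.
facing = low_tone + 0.5 * high_tone
facing_away = low_tone + 0.1 * high_tone

ratio_facing = band_energy_ratio(facing, sample_rate)
ratio_away = band_energy_ratio(facing_away, sample_rate)
print(ratio_facing > ratio_away)
```

A single ratio like this would be far too crude for the 8-direction task described above; it only illustrates why spectral shape can carry orientation information at all.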
Pages: 5259-5278
Page count: 20
Related papers
50 in total
  • [21] The Applications of Facial Expression Recognition in Human-computer Interaction
    Wang, Huan-Huan
    Gu, Jing-Wei
    PROCEEDINGS OF THE 2018 IEEE INTERNATIONAL CONFERENCE ON ADVANCED MANUFACTURING (IEEE ICAM), 2018, : 288 - 291
  • [22] Intentional Microgesture Recognition for Extended Human-Computer Interaction
    Kandoi, Chirag
    Jung, Changsoo
    Mannan, Sheikh
    VanderHoeven, Hannah
    Meisman, Quincy
    Krishnaswamy, Nikhil
    Blanchard, Nathaniel
    HUMAN-COMPUTER INTERACTION, HCI 2023, PT I, 2023, 14011 : 499 - 518
  • [23] Perceptive animated interfaces: First steps toward a new paradigm for human-computer interaction
    Cole, R
    Van Vuuren, S
    Pellom, B
    Hacioglu, K
    Ma, JY
    Movellan, J
    Schwartz, S
    Wade-Stein, D
    Ward, W
    Yan, J
    PROCEEDINGS OF THE IEEE, 2003, 91 (09) : 1391 - 1405
  • [24] Extending human-computer interaction by using computer vision and colour recognition
    de Oliveira, TB
    Schnitman, L
    Greve, FGP
    de Souza, JAMF
    Proceedings of the Eighth IASTED International Conference on Intelligent Systems and Control, 2005, : 339 - 344
  • [25] Review of constraints on vision-based gesture recognition for human-computer interaction
    Chakraborty, Biplab Ketan
    Sarma, Debajit
    Bhuyan, M. K.
    MacDorman, Karl F.
    IET COMPUTER VISION, 2018, 12 (01) : 3 - 15
  • [26] Research on Human-Computer Interaction Intention Recognition Based on EEG and Eye Movement
    Zhao, Minrui
    Gao, Hongni
    Wang, Wei
    Qu, Jue
    IEEE ACCESS, 2020, 8 : 145824 - 145832
  • [27] Vision-Based Hand Gesture Recognition for Human-Computer Interaction——A Survey
    GAO Yongqiang
    LU Xiong
    SUN Junbin
    TAO Xianglin
    HUANG Xiaomei
    YAN Yuxing
    LIU Jia
    Wuhan University Journal of Natural Sciences, 2020, 25 (02) : 169 - 184
  • [28] Design of Human-Computer Interaction Gesture Recognition System Based on a Flexible Biosensor
    Chen, Qianhui
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2024, 17 (01)
  • [30] A new evaluation model for human-computer interaction
    Xu, WX
    Liu, XM
    2004 IEEE CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS, VOLS 1 AND 2, 2004, : 1154 - 1159