Speech recognition and intelligent translation under multimodal human-computer interaction system

Cited by: 0

Authors:

Huang, Danhua ^{[1]}

Xiang, Shuaiqiu ^{[2]}

Affiliations:

[1] Zhejiang Yuexiu Univ, Sch English Studies, Shaoxing 312000, Peoples R China

[2] Shenzhen Inst Informat Technol, Sch Software, Shenzhen 518172, Peoples R China

Source:

JOURNAL OF INTELLIGENT SYSTEMS | 2024 / Vol. 33 / Issue 01

Keywords:

multimodal human-computer interaction; speech recognition; intelligent translation; attention mechanism;

DOI:

10.1515/jisys-2023-0192

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Traditional translation robots are limited to single-mode translation of text images and text videos, and their translation accuracy is low. To address this, a method for speech recognition and intelligent translation in a multimodal human-computer interaction (HCI) system is proposed. First, the network structure of the speech recognition model in the multi-channel HCI system is established, and a multi-head self-attention mechanism is constructed. Then, an artificial intelligence voice wake-up function is designed, and a multimodal machine translation model is constructed. On this basis, selective attention is added to obtain visual recognition of the perceived text, and the decoder performs multimodal gating fusion to output the encoder translation results. Experimental results show that the method achieves a high BLEU score and high translation accuracy.


Pages: 14
