Application of a Neural Network-based Visual Question Answering System in Preschool Language Education

Cited by: 0

Authors:

Cheng Y. [1]

Affiliation:

[1] School of Teachers College, Xianyang Vocational Technical College, Xianyang

Source:

IEIE Transactions on Smart Processing and Computing | 2023 / Vol. 12 / No. 5

Keywords:

Attention mechanism; Neural network; Preschool language education; Visual question answering

DOI:

10.5573/IEIESPC.2023.12.5.419

Abstract:

The continuous progress of modern science and technology has driven broad innovation in education, and teaching with information technology has become mainstream. In preschool language education, visual question answering (VQA) systems are gradually becoming a new driver of development. This research uses a recurrent neural network and a VGGNet-16 network to extract features from text and images, respectively, and applies a Hierarchical Joint Attention (HJA) model to the whole VQA system. Experimental results show that the HJA model reaches the target accuracy after 125 iterations with good convergence. On the VQAv1 dataset, accuracy stabilizes at 88% after 18 iterations; on the VQAv2 dataset, the highest and lowest overall accuracy rates are 77% and 72%, respectively. On the chosen preschool language education database for children, the three question types (Num, Y/N, and Other) are answered with accuracy rates of 90%, 94%, and 91%, respectively. The technique offers a new approach to optimizing a VQA system and significantly raises the level of children’s preschool language education. Copyright © 2023 The Institute of Electronics and Information Engineers.

Citation

Pages: 419-427

Page count: 8

25 entries in total

[1] Partika A., Johnson A. D., Phillips D. A., et al., Dual language supports for dual language learners? Exploring preschool classroom instructional supports for DLLs’ early learning outcomes, Early Childhood Research Quarterly, 56, pp. 124-138, (2021)
[2] Satagalieva S. M., The trends for modern libraries and building the strategy of library and information education in the Republic of Kazakhstan, Scientific and Technical Libraries, 3, pp. 58-70, (2021)
[3] Zhang M., Zhang M., Tian G., et al., A Home Service-Oriented Question Answering System with High Accuracy and Stability, IEEE Access, 2019, pp. 1-3, (2019)
[4] Yin C., Tang J., Xu Z., et al., Memory Augmented Deep Recurrent Neural Network for Video Question Answering, IEEE Transactions on Neural Networks and Learning Systems, 99, pp. 1-9, (2019)
[5] Cao Q., Liang X., Li B., et al., Interpretable Visual Question Answering by Reasoning on Dependency Trees, IEEE Transactions on Pattern Analysis and Machine Intelligence, 43, 3, pp. 887-901, (2021)
[6] Hong J., Fu J., Uh Y., et al., Exploiting hierarchical visual features for visual question answering, Neurocomputing, 351, pp. 187-195, (2019)
[7] Garg S., Srivastava R., Object sequences: encoding categorical and spatial information for a yes/no visual question answering task, IET Computer Vision, 12, 8, pp. 1141-1150, (2018)
[8] Yusuf A. A., Chong F., Xianling M., Evaluation of graph convolutional networks performance for visual question answering on reasoning datasets, Multimedia Tools and Applications, pp. 1-10, (2022)
[9] Le T. M., Le V., Venkatesh S., et al., Hierarchical Conditional Relation Networks for Multimodal Video Question Answering, International Journal of Computer Vision, 8, pp. 1-24, (2021)
[10] Chong F., Yusuf A. A., Xianling M., An analysis of graph convolutional networks and recent datasets for visual question answering, Artificial Intelligence Review, pp. 1-24, (2022)
