Acoustic-based LEGO recognition using attention-based convolutional neural networks

Cited by: 0
Authors
Van-Thuan Tran
Chia-Yang Wu
Wei-Ho Tsai
Affiliation
[1] National Taipei University of Technology, Department of Electronic Engineering
Source
Artificial Intelligence Review | 2024 / Vol. 57
Keywords
LEGO recognition; Acoustic-based object detection; Attention mechanism; Audio classification; Audio features; Convolutional neural networks; Time-distributed layers
DOI
Not available
Abstract
This work investigates the classification of LEGO types using deep learning-based audio classification. The investigation rests on the following assumption: if objects of the same shape fall freely from a given height and hit a fixed plane, their impact sounds will be very similar, so objects of one type can be distinguished from objects of other types by sound. Applying this idea to LEGO recognition, we collect the impact sounds of 200 LEGO objects dropped from a height of about 30 cm onto a designated plane, and design a CNN-based recognition system that processes each impact sound to determine the type of LEGO that produced it. Because a falling LEGO piece produces a main impact sound (i.e., only the sound at the moment of impact) followed by several subsequent sounds, we examine whether using only the first impact sound or all of the sounds yields better classification accuracy. We propose a compact two-dimensional CNN model, LegoNet, which is designed with a frame-level attention module at the input spectrogram and time-distributed fully-connected layers. Our experiments show that free-fall impact sounds can be used efficiently for accurate object recognition, and that the proposed LegoNet, despite its much smaller size, achieves better accuracy and robustness than the baseline models. In addition, the whole sequence of impact sounds is more informative for LEGO classification than the first impact sound alone. Moreover, using data from specific object postures helps improve the classifier's performance when training data are scarce. The proposed approach can be employed as an extra module in intelligent agents or object classification systems that require a rich understanding of the surrounding physical world.
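The two building blocks named in the abstract, frame-level attention over the input spectrogram and time-distributed fully-connected layers, can be illustrated with a minimal NumPy sketch. This is not the authors' LegoNet: the convolutional layers are omitted, the weights are random rather than trained, and every name and size here (`frame_attention`, `time_distributed_dense`, 40 frames, 64 frequency bins, 5 classes) is an assumption chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def frame_attention(spec, w, b=0.0):
    """Reweight spectrogram frames by a learned per-frame score.

    spec: (n_frames, n_bins) spectrogram, one row per time frame.
    w:    (n_bins,) scoring vector (hypothetical parameterization).
    Returns the reweighted spectrogram and the attention weights.
    """
    scores = spec @ w + b              # one scalar score per frame
    weights = softmax(scores, axis=0)  # normalize across time
    return spec * weights[:, None], weights

def time_distributed_dense(x, W, b):
    """Apply the same dense layer (with ReLU) to every time frame.

    x: (n_frames, n_in), W: (n_in, n_out), b: (n_out,).
    """
    return np.maximum(x @ W + b, 0.0)

# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(0)
n_frames, n_bins, n_hidden, n_classes = 40, 64, 16, 5

spec = rng.standard_normal((n_frames, n_bins))      # stand-in spectrogram
w_att = rng.standard_normal(n_bins) * 0.1           # untrained weights
W_td = rng.standard_normal((n_bins, n_hidden)) * 0.1
b_td = np.zeros(n_hidden)
W_out = rng.standard_normal((n_hidden, n_classes)) * 0.1
b_out = np.zeros(n_classes)

attended, weights = frame_attention(spec, w_att)
hidden = time_distributed_dense(attended, W_td, b_td)  # (n_frames, n_hidden)
pooled = hidden.mean(axis=0)                           # average over time
probs = softmax(pooled @ W_out + b_out)                # class probabilities
```

In this sketch the attention weights sum to one across frames, so frames with higher scores contribute more to the pooled representation. The time-distributed layer shares one weight matrix across all frames, which keeps the parameter count independent of clip length.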