A Cognitively Inspired Multi-granularity Model Incorporating Label Information for Complex Long Text Classification

Cited: 1
Authors
Gao, Li [1 ]
Liu, Yi [2 ]
Zhu, Jianmin [3 ]
Yu, Zhen [4 ]
Affiliations
[1] Univ Shanghai Sci & Technol, Lib & Dept Comp Sci & Engn, Shanghai 200093, Peoples R China
[2] Univ Shanghai Sci & Technol, Dept Comp Sci & Engn, Shanghai 200093, Peoples R China
[3] Univ Shanghai Sci & Technol, Sch Mech Engn, Shanghai 200093, Peoples R China
[4] Shanghai Datong High Sch, Shanghai, Peoples R China
Keywords
Text classification; Neural network; Machine learning; Multi-head attention; Gated recurrent unit; Convolutional neural network
DOI
10.1007/s12559-023-10237-1
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Because abstracts contain complex information and their labels carry no category information, it is difficult for cognitive models to extract comprehensive features that match the corresponding labels. This paper proposes a cognitively inspired multi-granularity model incorporating label information (LIMG) to address these problems. First, the model uses information from the abstracts to give the labels actual semantics, which improves the semantic representation of the word embeddings. Second, it uses a dual-channel pooling convolutional neural network (DCP-CNN) and timescale shrink gated recurrent units (TSGRU) to extract multi-granularity information from abstracts. One channel of the DCP-CNN highlights key content, while the other feeds TSGRU, which extracts context-related features of the abstracts. Finally, TSGRU adds a timescale that retains long-term dependencies by recurring past information, and a soft-thresholding algorithm that performs noise reduction. Experiments were carried out on four benchmark datasets: the Arxiv Academic Paper Dataset (AAPD), Web of Science (WOS), Amazon Review, and Yahoo! Answers. Compared with the baseline models, accuracy improves by up to 3.36%. On the AAPD (54,840 abstracts) and WOS (46,985 abstracts) datasets, the micro-F1 score reached 75.62% and 81.68%, respectively. The results show that acquiring label semantics from abstracts can enhance text representations, and that multi-granularity feature extraction can inspire the cognitive system's understanding of the complex information in abstracts.
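The abstract names two generic mechanisms inside TSGRU: a soft-thresholding step for noise reduction and a timescale that blends past state with new input to retain long-term dependencies. The paper's exact formulation is not given in this record, so the sketch below only illustrates the two standard building blocks those terms usually denote; the `tau` value, the threshold, and the way they are combined are illustrative assumptions, not the authors' definition.

```python
import numpy as np

def soft_threshold(x, tau):
    """Standard soft thresholding: shrinks each value toward zero and
    sets values with magnitude below tau exactly to zero, which is the
    usual noise-reduction operator behind 'shrink' units."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def timescale_update(h_prev, h_cand, tau=4.0, shrink=0.1):
    """Leaky-integrator update with timescale tau: a larger tau keeps
    more of the past hidden state, one common way a 'timescale' retains
    long-term dependence. The candidate state is denoised by soft
    thresholding before being mixed in (hypothetical combination)."""
    alpha = 1.0 / tau
    return (1.0 - alpha) * h_prev + alpha * soft_threshold(h_cand, shrink)
```

With `tau=4.0`, only a quarter of each step's (denoised) candidate state enters the hidden state, so information from earlier steps decays slowly rather than being overwritten.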
Pages: 740-755 (16 pages)