Active Learning for Text Classification: Using the LSI Subspace Signature Model

Authors
Zhu, Weizhong [1]
Allen, Robert B. [2]
Affiliations
[1] City Hope Med Ctr, Los Angeles, CA USA
[2] Yonsei Univ, Dept Lib & Informat Sci, Seoul, South Korea
Source
2014 INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA) | 2014
Keywords
active learning; classifiers; Latent Semantic Indexing Subspace Signature Model; text categorization; REGRESSION;
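DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract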
Supervised learning methods rely on large sets of labeled training examples. However, large training sets are rare and expensive to create. In this research, the Latent Semantic Indexing Subspace Signature Model (LSISSM) is applied to labeling for active learning of unstructured text. Based on Singular Value Decomposition (SVD), LSISSM represents terms and documents as semantic signatures, defined by the distribution of their local statistical contributions across the top-ranking LSI latent dimensions after dimension reduction. When applied to an unlabeled text corpus, LSISSM finds the most important samples and terms according to their global statistical contribution ranking in the corresponding LSI subspaces, without prior knowledge of labels or dependence on the model-loss functions of the classifiers. These sample subsets also effectively maintain the sampling distribution of the whole corpus. Furthermore, tests demonstrate that the sample subsets with the optimized term subsets substantially improve the learning accuracy across three standard classifiers.
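The abstract does not give the exact LSISSM formulas, so the following is only a minimal sketch of the general idea: truncate an SVD, describe each document by how its contribution is distributed across the retained latent dimensions, and rank documents by total contribution without using labels. The function names `lsi_signatures` and `select_samples`, and the choice of squared singular-value-weighted loadings as the "statistical contribution", are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def lsi_signatures(X, k):
    """X is a term-document matrix (terms x documents).

    Returns (signatures, totals) for the documents, using the k
    top-ranking latent dimensions of the SVD.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_k, Vt_k = s[:k], Vt[:k, :]
    # Assumption: contribution of document j to dimension i is (s_i * V_ji)^2.
    contrib = (s_k[:, None] * Vt_k) ** 2          # shape (k, n_docs)
    totals = contrib.sum(axis=0)                  # global contribution per document
    # Signature: each document's contribution normalized across dimensions.
    safe = np.where(totals > 0, totals, 1.0)      # avoid division by zero
    signatures = (contrib / safe).T               # shape (n_docs, k)
    return signatures, totals

def select_samples(X, k, n_select):
    """Pick the n_select documents with the largest total contribution."""
    _, totals = lsi_signatures(X, k)
    return np.argsort(-totals)[:n_select]

# Toy corpus: 3 terms x 3 documents, no labels involved.
X = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.1]])
chosen = select_samples(X, k=2, n_select=2)
```

In an active-learning loop, the documents in `chosen` would be the ones sent to an annotator first. The same computation applied to `U` instead of `Vt` would rank terms rather than documents, again under the assumed definition of contribution.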
Pages: 149-155
Page count: 7