Locally Embedding Autoencoders: A Semi-Supervised Manifold Learning Approach of Document Representation

Cited by: 13
Authors
Wei, Chao [1 ]
Luo, Senlin [1 ]
Ma, Xincheng [1 ]
Ren, Hao [1 ]
Zhang, Ji [1 ]
Pan, Limin [1 ]
Affiliation
[1] Beijing Inst Technol, Beijing 10081, Peoples R China
Source
PLOS ONE | 2016 / Vol. 11 / Issue 01
Keywords
NETWORK;
DOI
10.1371/journal.pone.0146672
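Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract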
Topic models and neural networks can discover meaningful low-dimensional latent representations of text corpora; as such, they have become a key technology of document representation. However, such models presume all documents are non-discriminatory, resulting in latent representation dependent upon all other documents and an inability to provide discriminative document representation. To address this problem, we propose a semi-supervised manifold-inspired autoencoder to extract meaningful latent representations of documents, taking the local perspective that the latent representation of nearby documents should be correlative. We first determine the discriminative neighbors set with Euclidean distance in observation spaces. Then, the autoencoder is trained by joint minimization of the Bernoulli cross-entropy error between input and output and the sum of the square error between neighbors of input and output. The results of two widely used corpora show that our method yields at least a 15% improvement in document clustering and a nearly 7% improvement in classification tasks compared to comparative methods. The evidence demonstrates that our method can readily capture more discriminative latent representation of new documents. Moreover, some meaningful combinations of words can be efficiently discovered by activating features that promote the comprehensibility of latent representation.
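The joint objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single-layer autoencoder with tied weights, the plain k-nearest-neighbor selection (which ignores label information), the weighting factor `lam`, and all function names are assumptions made for illustration. The loss adds the Bernoulli cross-entropy between each input and its reconstruction to the summed squared error between that reconstruction and the input's Euclidean neighbors.

```python
import numpy as np

def nearest_neighbors(X, k):
    """Return, for each row of X, the indices of its k nearest rows
    under Euclidean distance in the observation space (self excluded)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # never pick the row itself
    return np.argsort(d2, axis=1)[:, :k]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def joint_loss(X, W, b, c, neighbors, lam=0.5, eps=1e-12):
    """Mean per-document loss: Bernoulli cross-entropy(input, output)
    plus lam * squared error between the output and the input's neighbors.
    Tied encoder/decoder weights and lam are assumptions of this sketch."""
    H = sigmoid(X @ W + b)        # latent representations
    Y = sigmoid(H @ W.T + c)      # reconstructions in [0, 1]
    Yc = np.clip(Y, eps, 1.0 - eps)
    cross_entropy = -np.sum(X * np.log(Yc) + (1.0 - X) * np.log(1.0 - Yc))
    # X[neighbors] has shape (n, k, d); compare each neighbor with the output.
    neighbor_error = np.sum((X[neighbors] - Y[:, None, :]) ** 2)
    return (cross_entropy + lam * neighbor_error) / X.shape[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((6, 5))                 # toy "documents" scaled to [0, 1]
    W = rng.normal(scale=0.1, size=(5, 3))
    b, c = np.zeros(3), np.zeros(5)
    nbrs = nearest_neighbors(X, k=2)
    print(joint_loss(X, W, b, c, nbrs))
```

In practice the parameters `W`, `b`, and `c` would be fitted by gradient descent on this loss; the sketch only evaluates it for fixed parameters.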
Pages: 20