Locally Embedding Autoencoders: A Semi-Supervised Manifold Learning Approach of Document Representation

Cited by: 13
Authors
Wei, Chao [1]
Luo, Senlin [1]
Ma, Xincheng [1]
Ren, Hao [1]
Zhang, Ji [1]
Pan, Limin [1]
Affiliations
[1] Beijing Inst Technol, Beijing 10081, Peoples R China
Source
PLOS ONE | 2016 / Vol. 11 / Issue 01
Keywords
NETWORK;
DOI
10.1371/journal.pone.0146672
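Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code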
07 ; 0710 ; 09 ;
Abstract
Topic models and neural networks can discover meaningful low-dimensional latent representations of text corpora; as such, they have become a key technology for document representation. However, such models treat all documents as non-discriminatory, so the latent representation of each document depends on all other documents and the models cannot provide discriminative document representations. To address this problem, we propose a semi-supervised manifold-inspired autoencoder to extract meaningful latent representations of documents, taking the local perspective that the latent representations of nearby documents should be correlated. We first determine the discriminative neighbor set using Euclidean distance in the observation space. Then, the autoencoder is trained by jointly minimizing the Bernoulli cross-entropy error between input and output and the sum of the squared errors between the neighbors of the input and the output. Results on two widely used corpora show that our method yields at least a 15% improvement in document clustering and a nearly 7% improvement in classification tasks compared with the comparison methods. The evidence demonstrates that our method can readily capture more discriminative latent representations of new documents. Moreover, some meaningful combinations of words can be efficiently discovered by activating features that promote the comprehensibility of the latent representation.
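The two-part objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how labels enter the neighbor selection, so the sketch uses plain k-nearest neighbors, and the names `k_nearest`, `joint_loss`, and the trade-off parameter `weight` are hypothetical. Inputs are assumed to be scaled to [0, 1], and the reconstruction `output` is a hand-picked stand-in for what a trained autoencoder would produce.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def k_nearest(docs, i, k):
    """Indices of the k documents closest to docs[i], excluding i itself.

    Assumption: plain Euclidean k-NN in the observation space; any use of
    labels to make the neighbor set "discriminative" is omitted here.
    """
    others = [j for j in range(len(docs)) if j != i]
    others.sort(key=lambda j: squared_distance(docs[i], docs[j]))
    return others[:k]

def bernoulli_cross_entropy(x, y, eps=1e-12):
    """Bernoulli cross-entropy between input x and reconstruction y."""
    total = 0.0
    for xi, yi in zip(x, y):
        yi = min(max(yi, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= xi * math.log(yi) + (1.0 - xi) * math.log(1.0 - yi)
    return total

def joint_loss(x, neighbors, output, weight=1.0):
    """Reconstruction term plus summed squared error to each neighbor.

    `weight` is a hypothetical trade-off factor, not taken from the paper.
    """
    recon = bernoulli_cross_entropy(x, output)
    local = sum(squared_distance(n, output) for n in neighbors)
    return recon + weight * local

# Toy corpus of three 2-dimensional "documents" with values in [0, 1].
docs = [[0.0, 1.0], [0.0, 0.9], [1.0, 0.0]]
neighbor_ids = k_nearest(docs, 0, 1)
output = [0.1, 0.8]  # stand-in for the autoencoder's reconstruction of docs[0]
loss = joint_loss(docs[0], [docs[j] for j in neighbor_ids], output)
```

In this toy setup the second term pulls the reconstruction of a document toward its neighbors as well as toward the document itself, which is one way to read the abstract's claim that nearby documents should have correlated latent representations.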
Pages: 20
Related papers
50 in total
  • [31] Semi-Supervised Dual-Manifold Regularized Fuzzy Broad Learning for ICU Admission Prediction in Post-COVID Transplant Recipients
    Zhang, Xiao
    Nebot, Angela
    2024 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, FUZZ-IEEE 2024, 2024
  • [32] Semi-supervised learning advances species recognition for aquatic biodiversity monitoring
    Ma, Dongliang
    Wei, Jine
    Zhu, Likai
    Zhao, Fang
    Wu, Hao
    Chen, Xi
    Li, Ye
    Liu, Min
    FRONTIERS IN MARINE SCIENCE, 2024, 11
  • [33] Semi-supervised Deep Convolutional Transform Learning for Hyperspectral Image Classification
    Singh, Shikha
    Majumdar, Angshul
    Chouzenoux, Emilie
    Chierchia, Giovanni
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 206 - 210
  • [34] Semi-supervised Domain Adaptive Retrieval via Discriminative Hashing Learning
    Xia, Haifeng
    Jing, Taotao
    Chen, Chen
    Ding, Zhengming
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 3853 - 3861
  • [35] Semi-supervised Learning via Multiple Layer Graph Regularized Perception
    Xu, Haiyun
    Huang, Lili
    Jiang, Bo
    Tang, Jin
    Zhang, Shaojie
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 3112 - 3118
  • [36] A hierarchical semi-supervised extreme learning machine method for EEG recognition
    She, Qingshan
    Hu, Bo
    Luo, Zhizeng
    Thinh Nguyen
    Zhang, Yingchun
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2019, 57 (01) : 147 - 157
  • [37] Semi-Supervised Domain Adaption Classifier via Broad Learning System
    Xuan, Zehua
    Ren, Chang-E
    Shi, Zhiping
    Guan, Yong
    2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 2743 - 2748
  • [38] Online detection of bearing incipient fault with semi-supervised architecture and deep feature representation
    Mao, Wentao
    Tian, Siyu
    Fan, Jingjing
    Liang, Xihui
    Safian, Ali
    JOURNAL OF MANUFACTURING SYSTEMS, 2020, 55 : 179 - 198
  • [39] Transductive active learning - A new semi-supervised learning approach based on iteratively refined generative models to capture structure in data
    Reitmaier, Tobias
    Calma, Adrian
    Sick, Bernhard
    INFORMATION SCIENCES, 2015, 293 : 275 - 298
  • [40] Detecting While Accessing: A Semi-Supervised Learning-Based Approach for Malicious Traffic Detection in Internet of Things
    Luo, Yantian
    Sun, Hancun
    Chen, Xu
    Ge, Ning
    Feng, Wei
    Lu, Jianhua
    CHINA COMMUNICATIONS, 2023, 20 (04) : 302 - 314