Exploiting covariate embeddings for classification using Gaussian processes

Cited by: 1
Authors
Andrade, Daniel [1 ]
Tamura, Akihiro [2 ]
Tsuchida, Masaaki [3 ]
Affiliations
[1] NEC Corp Ltd, Secur Res Labs, Tokyo, Japan
[2] Ehime Univ, Grad Sch Sci & Engn, Matsuyama, Ehime, Japan
[3] DeNA Co Ltd, Tokyo, Japan
Keywords
Logistic regression; Auxiliary information of covariates; Gaussian process; Text classification
DOI
10.1016/j.patrec.2018.01.011
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In many logistic regression tasks, auxiliary information about the covariates is available. For example, a user might be able to specify a similarity measure between the covariates, or an embedding (feature vector) for each covariate, created from unlabeled data. For text classification in particular, the covariates (words) can be described by word embeddings or by similarity measures from lexical resources such as WordNet. We propose a new method to use such embeddings of covariates for logistic regression. Our method consists of two main components. The first component is a Gaussian process (GP) with a covariance function that models the correlations between covariates and returns a noise-free estimate of the covariates. The second component is a logistic regression model that uses these noise-free estimates. One advantage of our model is that the covariance function can be adjusted to the training data using maximum likelihood. Another advantage is that new covariates that never occurred in the training data can be incorporated at test time, while the run time increases only linearly in the number of new covariates. Our experiments demonstrate the usefulness of our method in situations where only a small amount of training data is available. (c) 2018 Elsevier B.V. All rights reserved.
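The two-component idea in the abstract can be sketched roughly as follows. This is an illustrative assumption-laden sketch, not the paper's implementation: it assumes an RBF covariance built from word embeddings, uses the standard GP posterior mean as the "noise-free estimate" step, and feeds the smoothed covariates into an off-the-shelf logistic regression. The embeddings, documents, and noise level below are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical 2-d embeddings for 4 "words" (covariates); words 0/1 and
# words 2/3 form two similar pairs. Not values from the paper.
emb = np.array([[1.0, 0.0],
                [0.9, 0.1],
                [0.0, 1.0],
                [0.1, 0.9]])

def rbf_covariance(E, length_scale=1.0):
    # Squared-exponential covariance between covariates, computed from
    # the pairwise distances of their embeddings.
    sq_dists = ((E[:, None, :] - E[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * length_scale ** 2))

K = rbf_covariance(emb)
noise = 0.1  # assumed observation-noise variance (a free hyperparameter)

# GP posterior mean as a linear denoising operator on the observed
# covariate values: x_clean = K (K + noise * I)^{-1} x.
S = K @ np.linalg.inv(K + noise * np.eye(len(K)))

# Toy document-term matrix (documents x words) and labels.
X = np.array([[2, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 2, 1],
              [0, 1, 1, 2]], dtype=float)
y = np.array([0, 0, 1, 1])

# Smooth each document's covariates, then fit plain logistic regression.
X_smooth = X @ S.T
clf = LogisticRegression().fit(X_smooth, y)
```

Because `S` is defined purely by the embeddings, a new word unseen in training only extends the kernel matrix, which is consistent with the abstract's claim that test-time cost grows linearly in the number of new covariates.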
Pages: 8 - 14
Number of pages: 7