Semi-supervised Naive Hubness Bayesian k-Nearest Neighbor for Gene Expression Data

Times cited: 1
Authors
Buza, Krisztian [1]
Affiliations
[1] Semmelweis Univ, BioIntelligence Lab, Inst Genom Med & Rare Disorders, Budapest, Hungary
Source
PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON COMPUTER RECOGNITION SYSTEMS, CORES 2015 | 2016 / Vol. 403
Funding
Hungarian Scientific Research Fund;
Keywords
Semi-supervised classification; Gene expression data; High dimensionality; CLASSIFICATION;
DOI
10.1007/978-3-319-26227-7_10
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Classification of gene expression data is the common denominator of various biomedical recognition tasks. However, obtaining class labels for large training samples may be difficult or even impossible in many cases. Therefore, semi-supervised classification techniques are required, as semi-supervised classifiers take advantage of the unlabeled data. Furthermore, gene expression data is high-dimensional, which gives rise to the phenomena known under the umbrella of the curse of dimensionality, one of its recently explored aspects being the presence of hubs, or hubness for short. Therefore, hubness-aware classifiers were developed recently, such as Naive Hubness Bayesian k-Nearest Neighbor (NHBNN). In this paper, we propose a semi-supervised extension of NHBNN and show in experiments on publicly available gene expression data that the proposed classifier outperforms all its examined competitors.
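As a rough illustration of the hubness mentioned in the abstract, the sketch below counts how often each instance appears among the k nearest neighbors of the other instances (its k-occurrence); a strongly skewed distribution of these counts is what is usually meant by hubness. This is not NHBNN or the paper's semi-supervised extension. The function name `k_occurrence`, the Euclidean distance, the synthetic Gaussian data, and the choice k = 5 are all assumptions made for the example.

```python
import numpy as np

def k_occurrence(X, k):
    """Count how often each row of X is among the k nearest neighbors
    (Euclidean distance) of the other rows."""
    X = np.asarray(X, dtype=float)
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = (X ** 2).sum(axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    # Column indices of the k nearest neighbors of every row
    knn = np.argsort(dist, axis=1)[:, :k]
    # How many neighbor lists each instance appears in
    return np.bincount(knn.ravel(), minlength=len(X))

# Synthetic stand-in for a high-dimensional data set (assumption, not real
# gene expression data): 200 instances, 500 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))
counts = k_occurrence(X, k=5)
print("max k-occurrence:", counts.max(), "mean:", counts.mean())
```

The mean k-occurrence always equals k, so instances whose count lies far above k act as hubs, while instances with a count of zero never appear in any neighbor list.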
Pages: 101-110
Page count: 10