PUGSVM: a caBIG™ analytical tool for multiclass gene selection and predictive classification

Times cited: 7
Authors
Yu, Guoqiang [1]
Li, Huai [2]
Ha, Sook [1]
Shih, Ie-Ming [3, 4, 5]
Clarke, Robert [6]
Hoffman, Eric P. [7]
Madhavan, Subha [6]
Xuan, Jianhua [1]
Wang, Yue [1]
Affiliations
[1] Virginia Polytech Inst & State Univ, Bradley Dept Elect & Comp Engn, Arlington, VA 22203 USA
[2] NIA, Bioinformat Unit, RRB, NIH, Baltimore, MD 21224 USA
[3] Johns Hopkins Univ, Sch Med, Dept Gynecol & Obstet, Baltimore, MD 21231 USA
[4] Johns Hopkins Univ, Sch Med, Dept Pathol, Baltimore, MD 21231 USA
[5] Johns Hopkins Univ, Sch Med, Dept Oncol, Baltimore, MD 21231 USA
[6] Georgetown Univ, Dept Oncol, Lombardi Comprehens Canc Ctr, Washington, DC 20057 USA
[7] Childrens Natl Med Ctr, Med Genet Res Ctr, Washington, DC 20010 USA
Funding
National Institutes of Health (USA)
Keywords
HIGH-DIMENSIONAL DATA; MOLECULAR CLASSIFICATION; EXPRESSION DATA; CANCER; DISCOVERY; SPACES
DOI
10.1093/bioinformatics/btq721
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Phenotypic Up-regulated Gene Support Vector Machine (PUGSVM) is a cancer Biomedical Informatics Grid (caBIG™) analytical tool for multiclass gene selection and classification. PUGSVM addresses the problems of imbalanced class separability, small sample size and high gene space dimensionality. Multiclass gene markers are defined as the union of one-versus-everyone phenotypic upregulated genes and are used by a well-matched one-versus-rest support vector machine. PUGSVM provides a simple yet more accurate strategy to identify statistically reproducible mechanistic marker genes for characterization of heterogeneous diseases.
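The gene selection step described in the abstract can be illustrated with a minimal sketch. It is not the PUGSVM implementation. It assumes that a gene counts as a phenotypic upregulated gene for a class when its mean expression in that class exceeds its mean in every other class, and that genes are ranked by the ratio of that mean to the highest rival class mean. The function name `select_pugs`, the parameter `per_class`, the ratio score and the toy data are all hypothetical.

```python
from statistics import mean


def select_pugs(expr, labels, per_class=2):
    """Return the union of per-class upregulated gene indices (sorted).

    expr   : list of samples, each a list of positive expression values
    labels : class label of each sample (at least two distinct classes)
    Assumption: the score is the class mean divided by the highest mean
    among all other classes (a one-versus-everyone comparison).
    """
    classes = sorted(set(labels))
    n_genes = len(expr[0])
    # Mean expression of every gene within every class.
    class_means = {
        c: [mean(s[g] for s, y in zip(expr, labels) if y == c)
            for g in range(n_genes)]
        for c in classes
    }
    selected = set()
    for c in classes:
        scored = []
        for g in range(n_genes):
            # Strongest competing class for this gene.
            rival = max(class_means[o][g] for o in classes if o != c)
            ratio = class_means[c][g] / rival
            if ratio > 1.0:  # upregulated relative to every other class
                scored.append((ratio, g))
        scored.sort(reverse=True)
        selected.update(g for _, g in scored[:per_class])
    return sorted(selected)


# Toy data (fabricated for illustration): 6 samples, 4 genes, 3 classes.
expr = [
    [9.0, 1.0, 1.0, 2.0],
    [11.0, 1.0, 1.0, 2.0],
    [1.0, 8.0, 1.0, 2.0],
    [1.0, 12.0, 1.0, 2.0],
    [1.0, 1.0, 10.0, 2.0],
    [1.0, 1.0, 10.0, 2.0],
]
labels = ["A", "A", "B", "B", "C", "C"]
markers = select_pugs(expr, labels, per_class=1)
print(markers)
```

On the toy data each class contributes one marker and the flat fourth gene is excluded. The selected columns could then be passed to any one-versus-rest support vector machine, which this sketch leaves out.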
Pages: 736-738
Page count: 3