Outlier detection and ambiguity detection for microarray data in probabilistic discriminant partial least squares regression

Cited by: 7
Authors
Botella, C. [1]
Ferre, J. [1]
Boque, R. [1]
Affiliations
[1] Univ Rovira & Virgili, Dept Analyt Chem & Organ Chem, Tarragona 43007, Spain
Keywords
outlier detection; ambiguous samples; discriminant partial least squares; reject option; EXPRESSION; CLASSIFICATION; PREDICTION; ERROR; CALIBRATION; DESIGN
DOI
10.1002/cem.1304
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The reject option plays an important role in the classification of microarray data. In this work, a reject option is implemented in the probabilistic discriminant partial least squares (p-DPLS) method so that both outliers and ambiguous samples are rejected rather than classified. Microarray data are highly prone to outliers because of the many steps involved in the experimental process. During the development of the classifier, outliers in the training data may strongly influence the model and degrade its performance. Some future samples to be classified may also be outliers, and these will most probably be misclassified. Ambiguous samples are samples that cannot be assigned to any of the classes with high confidence. In this work, outlier detection and ambiguity detection are implemented taking into account the x-residuals, the leverage and the predicted ŷ. The method was applied to oligonucleotide microarray data and cDNA microarray data. For the first dataset (prostate cancer dataset), the outlier detection criteria allowed us to remove nine samples from the training set. The model without those samples had better classification ability, with a decrease in the classification cost per sample from 0.10 to 0.07. The method was also applied to a second dataset (small round blue cell tumours of childhood) to detect prediction outliers, so that most of the outliers were rejected rather than classified and misclassifications were reduced from 100% to 5%. Copyright (C) 2010 John Wiley & Sons, Ltd.
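The three quantities named in the abstract (x-residuals, leverage and the predicted ŷ) can be illustrated with a minimal sketch. It is not the authors' p-DPLS procedure. It assumes the scores `T` and loadings `P` of an already fitted latent-variable model are available, that classes are coded 0 and 1, and that the limits `h_lim`, `q_lim` and the ambiguity `band` are chosen by the user. All function and parameter names are hypothetical.

```python
import numpy as np

def leverage(T_train, T_new):
    """Leverage of each row of T_new with respect to the training scores."""
    n = T_train.shape[0]
    G_inv = np.linalg.inv(T_train.T @ T_train)
    # h_i = 1/n + t_i (T'T)^-1 t_i'
    return 1.0 / n + np.einsum("ij,jk,ik->i", T_new, G_inv, T_new)

def x_residual(X, T, P):
    """Sum of squared x-residuals after reconstructing X as T P'."""
    E = X - T @ P.T
    return np.sum(E ** 2, axis=1)

def classify_with_reject(y_hat, h, q, h_lim, q_lim, band=(0.3, 0.7)):
    """Assign a class from y_hat unless the sample is an outlier or ambiguous.

    Assumption: classes are coded 0 and 1, and a fixed band around 0.5
    stands in for the probabilistic ambiguity criterion of p-DPLS.
    """
    labels = np.where(y_hat >= 0.5, "class 1", "class 0").astype(object)
    ambiguous = (y_hat > band[0]) & (y_hat < band[1])
    outlier = (h > h_lim) | (q > q_lim)
    labels[ambiguous] = "rejected: ambiguous"
    labels[outlier] = "rejected: outlier"  # outlier status takes precedence
    return labels
```

A sample is rejected as an outlier when its leverage or x-residual exceeds the chosen limit, and as ambiguous when its ŷ falls inside the band. Only the remaining samples receive a class label.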
Pages: 434-443
Page count: 10