A comparative study of cell classifiers for image-based high-throughput screening

Cited by: 9
Authors
Abbas, Syed Saiden [1]
Dijkstra, Tjeerd M. H. [1, 2]
Heskes, Tom [1]
Affiliations
[1] Radboud Univ Nijmegen, Inst Comp & Informat Sci, NL-6525 ED Nijmegen, Netherlands
[2] Eindhoven Univ Technol, Dept Elect Engn, NL-5600 MB Eindhoven, Netherlands
Source
BMC BIOINFORMATICS | 2014 / Vol. 15
Keywords
MICROSCOPE IMAGES; CLASSIFICATION; RNAI; MULTICLASS; LOCATION; LIBRARY
DOI
10.1186/1471-2105-15-342
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Millions of cells are present in thousands of images created in high-throughput screening (HTS). Biologists could classify each of these cells into a phenotype by visual inspection. But in the presence of millions of cells this visual classification task becomes infeasible. Biologists train classification models on a few thousand visually classified example cells and iteratively improve the training data by visual inspection of the important misclassified phenotypes. Classification methods differ in performance and performance evaluation time. We present a comparative study of computational performance of gentle boosting, joint boosting CellProfiler Analyst (CPA), support vector machines (linear and radial basis function) and linear discriminant analysis (LDA) on two data sets of HT29 and HeLa cancer cells.
Results: For the HT29 data set we find that gentle boosting, SVM (linear) and SVM (RBF) are close in performance but SVM (linear) is faster than gentle boosting and SVM (RBF). For the HT29 data set the average performance difference between SVM (RBF) and SVM (linear) is 0.42%. For the HeLa data set we find that SVM (RBF) outperforms other classification methods and is on average 1.41% better in performance than SVM (linear).
Conclusions: Our study proposes SVM (linear) for iterative improvement of the training data and SVM (RBF) for the final classifier to classify all unlabeled cells in the whole data set.
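The train, evaluate, and inspect loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, a hand-written two-class LDA with two features and equal priors, and synthetic Gaussian "cells". The names `fit_lda`, `predict`, and `make_cells` are hypothetical, and the data are made up for illustration. It records accuracy, elapsed time, and the indices of misclassified cells that a biologist might review in the next iteration.

```python
import random
import time


def fit_lda(X, y):
    """Two-class LDA, two features, equal priors (illustrative assumption)."""
    groups = {c: [x for x, t in zip(X, y) if t == c] for c in (0, 1)}
    mu = {c: [sum(r[j] for r in rows) / len(rows) for j in range(2)]
          for c, rows in groups.items()}
    # Pooled within-class covariance matrix (2x2), shared by both classes.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for c, rows in groups.items():
        for r in rows:
            d = [r[0] - mu[c][0], r[1] - mu[c][1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(X) - 2
    s = [[v / dof for v in row] for row in s]
    # Closed-form inverse of the 2x2 covariance matrix.
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    # Discriminant direction w = S^-1 (mu1 - mu0); boundary through the midpoint.
    diff = [mu[1][0] - mu[0][0], mu[1][1] - mu[0][1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(mu[0][0] + mu[1][0]) / 2, (mu[0][1] + mu[1][1]) / 2]
    b = -(w[0] * mid[0] + w[1] * mid[1])
    return w, b


def predict(model, X):
    """Assign class 1 when the linear discriminant score is positive."""
    w, b = model
    return [1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0 for x in X]


def make_cells(n, rng):
    """Synthetic stand-in for per-cell feature vectors (not real HTS data)."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 2.0 if label else 0.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_cells(400, rng)
X_test, y_test = make_cells(400, rng)

# Time training plus prediction, since both costs matter in an iterative loop.
start = time.perf_counter()
model = fit_lda(X_train, y_train)
pred = predict(model, X_test)
elapsed = time.perf_counter() - start

accuracy = sum(p == t for p, t in zip(pred, y_test)) / len(y_test)
# Misclassified cells are the candidates for visual re-inspection.
misclassified = [i for i, (p, t) in enumerate(zip(pred, y_test)) if p != t]
print(f"accuracy={accuracy:.3f} time={elapsed:.4f}s to_review={len(misclassified)}")
```

In a comparison of the kind the abstract reports, the same timing and accuracy bookkeeping would be repeated for each classifier, with the faster model preferred inside the loop and the more accurate one reserved for the final pass.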
Pages: 10