Determining the subcellular location of new proteins from microscope images using local features

Cited by: 50
Authors
Coelho, Luis Pedro [1, 2]
Kangas, Joshua D. [1, 2]
Naik, Armaghan W. [1, 2]
Osuna-Highley, Elvira [3]
Glory-Afshar, Estelle [3]
Fuhrman, Margaret [4]
Simha, Ramanuja [5]
Berget, Peter B. [4]
Jarvik, Jonathan W. [4]
Murphy, Robert F. [1, 2, 3, 4, 6]
Affiliations
[1] Carnegie Mellon Univ, Lane Ctr Computat Biol, Pittsburgh, PA 15213 USA
[2] Joint Carnegie Mellon Univ Univ Pittsburgh PhD Pr, Pittsburgh, PA 15213 USA
[3] Carnegie Mellon Univ, Dept Biomed Engn, Pittsburgh, PA 15213 USA
[4] Carnegie Mellon Univ, Dept Biol Sci, Pittsburgh, PA 15213 USA
[5] Univ Delaware, Dept Comp & Informat Sci, Newark, DE 19716 USA
[6] Carnegie Mellon Univ, Dept Machine Learning, Pittsburgh, PA 15213 USA
Keywords
AUTOMATED CLASSIFICATION; PATTERNS; RECOGNITION; PROTEOMICS; FRAMEWORK; PROBES; ATLAS
DOI
10.1093/bioinformatics/btt392
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Evaluation of previous systems for automated determination of subcellular location from microscope images has been done using datasets in which each location class consisted of multiple images of the same representative protein. Here, we frame a more challenging and useful problem where previously unseen proteins are to be classified.
Results: Using CD-tagging, we generated two new image datasets for evaluation of this problem, which contain several different proteins for each location class. Evaluation of previous methods on these new datasets showed that it is much harder to train a classifier that generalizes across different proteins than one that simply recognizes a protein it was trained on. We therefore developed and evaluated additional approaches, incorporating novel modifications of local features techniques. These extended the notion of local features to exploit both the protein image and any reference markers that were imaged in parallel. With these, we obtained a large accuracy improvement in our new datasets over existing methods. Additionally, these features help achieve classification improvements for other previously studied datasets.
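The evaluation setting described in the abstract, where test proteins are never seen during training, can be illustrated with a protein-wise split. The following is only a minimal sketch under stated assumptions, not the authors' code: the function name `leave_one_protein_out` and the toy protein labels are hypothetical, and the abstract does not state the exact cross-validation protocol used in the paper.

```python
from collections import defaultdict


def leave_one_protein_out(protein_ids):
    """Yield (held_out_protein, train_idx, test_idx) for each protein.

    Every image of the held-out protein goes to the test fold, so the
    classifier is always evaluated on a protein it was not trained on.
    """
    by_protein = defaultdict(list)
    for idx, pid in enumerate(protein_ids):
        by_protein[pid].append(idx)
    for held_out in sorted(by_protein):
        test_idx = by_protein[held_out]
        train_idx = [i for i, pid in enumerate(protein_ids) if pid != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical toy data: one protein identifier per image.
proteins = ["A", "A", "B", "C", "C", "B"]
folds = list(leave_one_protein_out(proteins))
```

By contrast, a plain per-image random split could place images of the same protein in both the training and test folds, which corresponds to the easier setting the abstract says earlier evaluations used.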
Pages: 2343-2349
Number of pages: 7
Related papers
36 in total
  • [1] NEW LOOK AT STATISTICAL-MODEL IDENTIFICATION
    AKAIKE, H
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1974, AC19 (06) : 716 - 723
  • [2] Subcellular localization of mammalian type II membrane proteins
    Aturaliya, RN
    Fink, JL
    Davis, MJ
    Teasdale, MS
    Hanson, KA
    Miranda, KC
    Forrest, ARR
    Grimmond, SM
    Suzuki, H
    Kanamori, M
    Kai, C
    Kawai, J
    Carninci, P
    Hayashizaki, Y
    Teasdale, RD
    [J]. TRAFFIC, 2006, 7 (05) : 613 - 625
  • [3] Toward a confocal subcellular atlas of the human proteome
    Barbe, Laurent
    Lundberg, Emma
    Oksvold, Per
    Stenius, Anna
    Lewin, Erland
    Bjorling, Erik
    Asplund, Anna
    Ponten, Fredrik
    Brismar, Hjalmar
    Uhlen, Mathias
    Andersson-Svahn, Helene
    [J]. MOLECULAR & CELLULAR PROTEOMICS, 2008, 7 (03) : 499 - 508
  • [4] Speeded-Up Robust Features (SURF)
    Bay, Herbert
    Ess, Andreas
    Tuytelaars, Tinne
    Van Gool, Luc
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2008, 110 (03) : 346 - 359
  • [5] Boland MV, 1998, CYTOMETRY, V33, P366
  • [6] A neural network classifier capable of recognizing the patterns of all major subcellular structures in fluorescence microscope images of HeLa cells
    Boland, MV
    Murphy, RF
    [J]. BIOINFORMATICS, 2001, 17 (12) : 1213 - 1223
  • [7] LIBSVM: A Library for Support Vector Machines
    Chang, Chih-Chung
    Lin, Chih-Jen
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
  • [8] A multiresolution approach to automated classification of protein subcellular location images
    Chebira, Amina
    Barbotin, Yann
    Jackson, Charles
    Merryman, Thomas
    Srinivasa, Gowri
    Murphy, Robert F.
    Kovacevic, Jelena
    [J]. BMC BIOINFORMATICS, 2007, 8 (1)
  • [9] COELHO LP, 2013, J OPEN RES SOFTW, V1
  • [10] Structured Literature Image Finder: Extracting Information from Text and Images in Biomedical Literature
    Coelho, Luis Pedro
    Ahmed, Amr
    Arnold, Andrew
    Kangas, Joshua
    Sheikh, Abdul-Saboor
    Xing, Eric P.
    Cohen, William W.
    Murphy, Robert F.
    [J]. LINKING LITERATURE, INFORMATION, AND KNOWLEDGE FOR BIOLOGY, 2010, 6004 : 23 - +