MCAM: Multiple Clustering Analysis Methodology for Deriving Hypotheses and Insights from High-Throughput Proteomic Datasets

Cited by: 25
Authors
Naegle, Kristen M. [1, 2]
Welsch, Roy E. [3]
Yaffe, Michael B. [1, 2, 4]
White, Forest M. [1, 2]
Lauffenburger, Douglas A. [1]
Affiliations
[1] MIT, Dept Biol Engn, Cambridge, MA 02139 USA
[2] MIT, Koch Inst Integrat Canc Res, Cambridge, MA 02139 USA
[3] MIT, Alfred P Sloan Sch Management, Cambridge, MA 02139 USA
[4] MIT, Dept Biol, Cambridge, MA USA
Keywords
TYROSINE PHOSPHORYLATION; MASS-SPECTROMETRY; GENE-EXPRESSION; PROTEIN-KINASES; IN-VIVO; SITES; PAXILLIN; ACTIVATION; REVEALS; SHC;
DOI
10.1371/journal.pcbi.1002119
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Advances in proteomic technologies continue to substantially accelerate our capability to generate experimental data on protein levels, states, and activities in biological samples. For example, studies of receptor tyrosine kinase signaling networks can now capture the phosphorylation state of hundreds to thousands of proteins across multiple conditions. However, little is known about the function of many of these protein modifications, or about the enzymes responsible for them. To address this challenge, we have developed an approach that enhances the power of clustering techniques to infer the functional and regulatory meaning of protein states in cell signaling networks. We have created a new computational framework for applying clustering to biological data that overcomes the typical dependence on specific a priori assumptions and on expert knowledge of the technical aspects of clustering. Multiple clustering analysis methodology ('MCAM') employs an array of diverse data transformations, distance metrics, set sizes, and clustering algorithms, in a combinatorial fashion, to create a suite of clustering sets. These sets are then evaluated on their ability to produce biological insights through statistical enrichment of metadata on protein functions, kinase substrates, and sequence motifs. We applied MCAM to a set of dynamic phosphorylation measurements of the ERBB network to explore the relationships between algorithmic parameters and the biological meaning that could be inferred, and we report on interesting biological predictions. Further, we applied MCAM to multiple phosphoproteomic datasets for the ERBB network, which allowed us to compare independent, incompletely overlapping measurements of phosphorylation sites in the network. We report specific and global differences in the ERBB network when it is stimulated with different ligands and when HER2 expression changes.
Overall, we offer MCAM as a broadly applicable approach for the analysis of proteomic data, which may help increase the current understanding of molecular networks in a variety of biological problems.
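The combinatorial idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the specific transformations, distance metrics, linkage methods, set sizes, the synthetic data, and the use of a hypergeometric test for enrichment are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
import itertools

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist
from scipy.stats import hypergeom

# Assumed parameter choices; the abstract names the categories, not these values.
TRANSFORMS = {
    "raw": lambda x: x,
    "zscore": lambda x: (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True),
    "max_norm": lambda x: x / np.abs(x).max(axis=1, keepdims=True),
}
METRICS = ["euclidean", "correlation", "cityblock"]
LINKAGES = ["average", "complete"]
SET_SIZES = [2, 3, 4]


def build_clustering_sets(data):
    """Cluster the rows of `data` under every parameter combination."""
    sets = {}
    for (t_name, t_fn), metric, method, k in itertools.product(
        TRANSFORMS.items(), METRICS, LINKAGES, SET_SIZES
    ):
        dists = pdist(t_fn(data), metric=metric)
        tree = linkage(dists, method=method)
        # Cut the tree into at most k clusters.
        sets[(t_name, metric, method, k)] = fcluster(tree, t=k, criterion="maxclust")
    return sets


def enrichment_pvalue(labels, cluster_id, annotated):
    """Hypergeometric P(at least the observed number of annotated members)."""
    in_cluster = labels == cluster_id
    population = labels.size
    n_annotated = int(annotated.sum())
    n_drawn = int(in_cluster.sum())
    n_hits = int((in_cluster & annotated).sum())
    return hypergeom.sf(n_hits - 1, population, n_annotated, n_drawn)


def best_enrichment(labels, annotated):
    """Smallest enrichment p-value over all clusters in one clustering set."""
    return min(enrichment_pvalue(labels, c, annotated) for c in np.unique(labels))


# Synthetic stand-in for phosphorylation time courses: 20 sites, 6 time points.
rng = np.random.default_rng(0)
rising = np.linspace(0.0, 1.0, 6) + rng.normal(0.0, 0.05, size=(10, 6))
falling = np.linspace(1.0, 0.0, 6) + rng.normal(0.0, 0.05, size=(10, 6))
data = np.vstack([rising, falling])

# Hypothetical metadata label: the first 10 sites share an annotation.
annotated = np.array([True] * 10 + [False] * 10)

sets = build_clustering_sets(data)
scores = {params: best_enrichment(labels, annotated) for params, labels in sets.items()}
best_params = min(scores, key=scores.get)
print(len(sets), "clustering sets; most enriched:", best_params)
```

Ranking parameter combinations by an enrichment score, rather than fixing one transformation and metric up front, is what removes the dependence on a priori choices in this sketch. A real analysis would also need to correct for the many tests performed across clusters and clustering sets.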
Pages: 15