Massively parallel unsupervised single-particle cryo-EM data clustering via statistical manifold learning

Times cited: 29
Authors
Wu, Jiayi [1, 2]
Ma, Yong-Bei [2]
Congdon, Charles [3]
Brett, Bevin [3]
Chen, Shuobing [1, 2]
Xu, Yaofang [2, 4]
Ouyang, Qi [1, 5]
Mao, Youdong [1, 2, 6]
Affiliations
[1] Peking Univ, Sch Phys, State Key Lab Artificial Microstruct & Mesoscop P, Inst Condensed Matter Phys,Ctr Quantitat Biol, Beijing, Peoples R China
[2] Dana Farber Canc Inst, Intel Parallel Comp Ctr Struct Biol, Boston, MA 02115 USA
[3] Intel Corp, Software & Serv Grp, Santa Clara, CA USA
[4] Peking Univ, Hlth Sci Ctr, Dept Biophys, Beijing, Peoples R China
[5] Peking Univ, Peking Tsinghua Joint Ctr Life Sci, Beijing, Peoples R China
[6] Harvard Med Sch, Dept Microbiol & Immunobiol, Boston, MA USA
Funding
National Natural Science Foundation of China; U.S. National Science Foundation;
Keywords
NONLINEAR DIMENSIONALITY REDUCTION; MICROSCOPY; CLASSIFICATION; PROJECTION; MACROMOLECULES; IMAGES; SPARX; SUITE; XMIPP;
DOI
10.1371/journal.pone.0182130
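Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code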
07 ; 0710 ; 09 ;
Abstract
Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes with decreasing signal-to-noise-ratio (SNR) in the image data, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization over a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement on ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization.
Pages: 25
Related papers
50 in total
[21]   Disentangling conformational states of macromolecules in 3D-EM through likelihood optimization [J].
Scheres, Sjors H. W. ;
Gao, Haixiao ;
Valle, Mikel ;
Herman, Gabor T. ;
Eggermont, Paul P. B. ;
Frank, Joachim ;
Carazo, Jose-Maria .
NATURE METHODS, 2007, 4 (01) :27-29
[22]   RELION: Implementation of a Bayesian approach to cryo-EM structure determination [J].
Scheres, Sjors H. W. .
JOURNAL OF STRUCTURAL BIOLOGY, 2012, 180 (03) :519-530
[23]   A Bayesian View on Cryo-EM Structure Determination [J].
Scheres, Sjors H. W. .
JOURNAL OF MOLECULAR BIOLOGY, 2012, 415 (02) :406-418
[24]   Conformations of macromolecules and their complexes from heterogeneous datasets [J].
Schwander, P. ;
Fung, R. ;
Ourmazd, A. .
PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2014, 369 (1647)
[25]   SPIDER image processing for single-particle reconstruction of biological macromolecules from electron micrographs [J].
Shaikh, Tanvir R. ;
Gao, Haixiao ;
Baxter, William T. ;
Asturias, Francisco J. ;
Boisset, Nicolas ;
Leith, Ardean ;
Frank, Joachim .
NATURE PROTOCOLS, 2008, 3 (12) :1941-1974
[26]   A maximum-likelihood approach to single-particle image refinement [J].
Sigworth, FJ .
JOURNAL OF STRUCTURAL BIOLOGY, 1998, 122 (03) :328-339
[27]  
Silva V., 2002, Advances in Neural Information Processing Systems 15 (NIPS 2002)
[28]   Viewing Angle Classification of Cryo-Electron Microscopy Images Using Eigenvectors [J].
Singer, A. ;
Zhao, Z. ;
Shkolnisky, Y. ;
Hadani, R. .
SIAM JOURNAL ON IMAGING SCIENCES, 2011, 4 (02) :723-759
[29]   A clustering approach to multireference alignment of single-particle projections in electron microscopy [J].
Sorzano, C. O. S. ;
Bilbao-Castro, J. R. ;
Shkolnisky, Y. ;
Alcorlo, M. ;
Melero, R. ;
Caffarena-Fernandez, G. ;
Li, M. ;
Xu, G. ;
Marabini, R. ;
Carazo, J. M. .
JOURNAL OF STRUCTURAL BIOLOGY, 2010, 171 (02) :197-206
[30]   XMIPP: a new generation of an open-source image processing package for electron microscopy [J].
Sorzano, COS ;
Marabini, R ;
Velázquez-Muriel, J ;
Bilbao-Castro, JR ;
Scheres, SHW ;
Carazo, JM ;
Pascual-Montano, A .
JOURNAL OF STRUCTURAL BIOLOGY, 2004, 148 (02) :194-204