On bias, variance, overfitting, gold standard and consensus in single-particle analysis by cryo-electron microscopy
Cited by: 14
Authors:
Sorzano, C. O. S. [1]
Jimenez-Moreno, A. [1]
Maluenda, D. [1]
Martinez, M. [1]
Ramirez-Aportela, E. [1]
Krieger, J. [1]
Melero, R. [1]
Cuervo, A. [1]
Conesa, J. [1]
Filipovic, J. [2]
Conesa, P. [1]
del Cano, L. [1]
Fonseca, Y. C. [1]
Jimenez-de la Morena, J. [1]
Losana, P. [1]
Sanchez-Garcia, R. [1]
Strelak, D. [1]
Fernandez-Gimenez, E. [1]
de Isidro-Gomez, F. P. [1]
Herreros, D. [1]
Vilas, J. L. [3]
Marabini, R. [4]
Carazo, J. M. [1]
Affiliations:
[1] Ctr Nacl Biotecnol CNB CSIC, Biocomp Unit, Calle Darwin 3, Madrid 28049, Spain
[2] Masaryk Univ, Brno, Czech Republic
[3] Yale Univ, Sch Engn & Appl Sci, New Haven, CT 06520 USA
[4] Univ Autonoma Madrid, Escuela Politecn Super, Madrid 28049, Spain
Cryo-electron microscopy (cryoEM) has become a well-established technique for elucidating the 3D structures of biological macromolecules. Projection images from thousands of macromolecules that are assumed to be structurally identical are combined into a single 3D map representing the Coulomb potential of the macromolecule under study. This article discusses possible caveats along the image-processing path and how to avoid them to obtain a reliable 3D structure. Some of these problems are well known in the community and may be referred to as sample-related (such as specimen denaturation at interfaces or non-uniform projection geometry leading to underrepresented projection directions). The rest are related to the algorithms used. While some of these have been discussed in depth in the literature, such as the use of an incorrect initial volume, others have received much less attention even though they are fundamental in any data-analysis approach. Chief among them are instabilities in the estimation of many of the key parameters required for a correct 3D reconstruction; these instabilities occur all along the processing workflow and may significantly affect the reliability of the whole process. In the field, the term overfitting has been coined to refer to some particular kinds of artifacts. It is argued that overfitting is a statistical bias in key parameter-estimation steps in the 3D reconstruction process, including intrinsic algorithmic bias. It is also shown that common tools (Fourier shell correlation) and strategies (gold standard) that are normally used to detect or prevent overfitting do not fully protect against it. Alternatively, it is proposed that detecting the bias that leads to overfitting is much easier when addressed at the level of parameter estimation, rather than detecting it once the particle images have been combined into a 3D map. Comparing the results from multiple algorithms (or, at least, independent executions of the same algorithm) can detect parameter bias. These multiple executions could then be averaged to give a lower-variance estimate of the underlying parameters.
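The consensus idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are simulated, and the function names, the single in-plane angle per particle, the 2-degree noise level, and the 5-degree spread threshold are all assumptions chosen for illustration. Each particle's angle is estimated by several independent runs; particles whose estimates disagree across runs are flagged as unstable, and the remaining estimates are averaged on the circle.

```python
import numpy as np

def circular_mean_deg(angles_deg, axis=0):
    """Mean direction (degrees, in [0, 360)) computed on the unit circle."""
    rad = np.deg2rad(angles_deg)
    mean = np.arctan2(np.sin(rad).mean(axis=axis), np.cos(rad).mean(axis=axis))
    return np.rad2deg(mean) % 360.0

def circular_spread_deg(angles_deg, axis=0):
    """Circular standard deviation (degrees) across the given axis."""
    rad = np.deg2rad(angles_deg)
    r = np.hypot(np.sin(rad).mean(axis=axis), np.cos(rad).mean(axis=axis))
    r = np.clip(r, 1e-12, 1.0)  # guard against log(0) and rounding above 1
    return np.rad2deg(np.sqrt(-2.0 * np.log(r)))

def angular_error_deg(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return np.abs((a - b + 180.0) % 360.0 - 180.0)

def consensus(estimates_deg, max_spread_deg=5.0):
    """Combine per-particle estimates from several runs.

    estimates_deg has shape (n_runs, n_particles). The threshold
    max_spread_deg is a hypothetical choice, not a recommended value.
    """
    mean = circular_mean_deg(estimates_deg)
    spread = circular_spread_deg(estimates_deg)
    stable = spread <= max_spread_deg
    return mean, spread, stable

# Simulated data (assumption): 5 independent runs over 1000 particles.
rng = np.random.default_rng(0)
n_runs, n_particles = 5, 1000
true_angle = rng.uniform(0.0, 360.0, n_particles)
estimates = true_angle + rng.normal(0.0, 2.0, (n_runs, n_particles))

# The first 100 particles are made unstable: every run returns a random angle.
unstable = np.arange(n_particles) < 100
estimates[:, unstable] = rng.uniform(0.0, 360.0, (n_runs, int(unstable.sum())))

mean, spread, stable = consensus(estimates)

# Compare the error of one run against the averaged estimate on stable particles.
single_run_error = angular_error_deg(estimates[0], true_angle)[stable].mean()
consensus_error = angular_error_deg(mean, true_angle)[stable].mean()
print(f"flagged as unstable: {(~stable).sum()} particles")
print(f"mean error, single run: {single_run_error:.2f} deg")
print(f"mean error, consensus:  {consensus_error:.2f} deg")
```

In this toy setting, disagreement across runs reveals unstable parameter estimates, and averaging the stable ones reduces their variance. Errors shared by every run would not show up as disagreement, which is one reason to compare different algorithms rather than only repeated executions of the same one.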