Machine Learning Methods for X-Ray Scattering Data Analysis from Biomacromolecular Solutions

Cited by: 76
Authors
Franke, Daniel [1]
Jeffries, Cy M. [1]
Svergun, Dmitri I. [1]
Affiliations
[1] European Molecular Biology Laboratory, Hamburg, Germany
Funding
EU Horizon 2020
Keywords
SMALL-ANGLE SCATTERING; MACROMOLECULES; PROTEINS; BEAMLINE; PROGRAM;
DOI
10.1016/j.bpj.2018.04.018
Chinese Library Classification number
Q6 [Biophysics]
Discipline classification code
071011
Abstract
Small-angle x-ray scattering (SAXS) of biological macromolecules in solution is a widely employed method in structural biology. SAXS patterns include information about the overall shape and low-resolution structure of dissolved particles. Here, we describe how to transform experimental SAXS patterns to feature vectors and how a simple k-nearest neighbor approach is able to retrieve information on overall particle shape and maximal diameter (Dmax) as well as molecular mass directly from experimental scattering data. Based on this transformation, we develop a rapid multiclass shape classification ranging from compact, extended, and flat categories to hollow and random-chain-like objects. This classification may be employed, e.g., as a decision block in automated data analysis pipelines. Further, we map protein structures from the Protein Data Bank into the classification space and, in a second step, use this mapping as a data source to obtain accurate estimates for the structural parameters (Dmax, molecular mass) of the macromolecule under study based on the experimental scattering pattern alone, without inverse Fourier transform for Dmax. All methods presented are implemented in a Fortran binary DATCLASS, part of the ATSAS data analysis suite, available on Linux, Mac, and Windows and free for academic use.
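The k-nearest neighbor idea described in the abstract can be illustrated with a minimal Python sketch. This is not the DATCLASS implementation: the three-component feature vectors, the reference values, the helper names (`k_nearest`, `classify_shape`, `estimate_dmax`), and the choice of k = 3 are all hypothetical, and the neighbor search is a plain brute-force scan. It only shows how one labeled reference set could serve both shape classification (majority vote) and Dmax estimation (neighbor average).

```python
import math
from collections import Counter

# Hypothetical reference set: (feature vector, shape class, Dmax in nm).
# All numbers are made up for illustration; none are taken from the paper.
REFERENCE = [
    ((0.10, 0.20, 0.30), "compact", 5.0),
    ((0.12, 0.22, 0.28), "compact", 6.0),
    ((0.11, 0.18, 0.33), "compact", 5.5),
    ((0.60, 0.70, 0.80), "extended", 15.0),
    ((0.62, 0.68, 0.79), "extended", 16.0),
    ((0.58, 0.73, 0.82), "extended", 14.0),
]


def k_nearest(query, reference, k):
    """Return the k reference entries closest to `query` (Euclidean distance)."""
    return sorted(reference, key=lambda entry: math.dist(query, entry[0]))[:k]


def classify_shape(query, reference=REFERENCE, k=3):
    """Majority vote over the shape labels of the k nearest neighbors."""
    votes = Counter(label for _, label, _ in k_nearest(query, reference, k))
    return votes.most_common(1)[0][0]


def estimate_dmax(query, reference=REFERENCE, k=3):
    """Unweighted average of the Dmax values of the k nearest neighbors."""
    neighbors = k_nearest(query, reference, k)
    return sum(dmax for _, _, dmax in neighbors) / len(neighbors)
```

Calling `classify_shape(v)` and `estimate_dmax(v)` on a feature vector `v` derived from a measured pattern would return a shape label and a Dmax estimate. For a large reference set, the brute-force scan could be replaced by a spatial index such as a k-d tree.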
Pages: 2485-2492
Number of pages: 8