New opportunities for materials informatics: Resources and data mining techniques for uncovering hidden relationships

Cited by: 185
Authors
Jain, Anubhav [1]
Hautier, Geoffroy [2]
Ong, Shyue Ping [3]
Persson, Kristin [1, 4]
Affiliations
[1] Univ Calif Berkeley, Lawrence Berkeley Natl Lab, Energy & Environm Technol Div, Berkeley, CA 94720 USA
[2] Catholic Univ Louvain, Inst Condensed Matter & Nanosci IMCN, B-1348 Louvain La Neuve, Belgium
[3] Univ Calif San Diego, Dept NanoEngn, La Jolla, CA 92093 USA
[4] Univ Calif Berkeley, Mat Sci & Engn, Berkeley, CA 94720 USA
Keywords
DENSITY-FUNCTIONAL THEORY; CRYSTAL-STRUCTURE; NEURAL-NETWORKS; OXIDE COMPOUNDS; DESIGN; CATHODES; INFRASTRUCTURE; SEMICONDUCTORS; PRINCIPLES; PREDICTION;
DOI
10.1557/jmr.2016.80
Chinese Library Classification code
T [Industrial Technology]
Discipline classification code
08
Abstract
Data mining has revolutionized sectors as diverse as pharmaceutical drug discovery, finance, medicine, and marketing, and has the potential to similarly advance materials science. In this paper, we describe advances in simulation-based materials databases, open-source software tools, and machine learning algorithms that are converging to create new opportunities for materials informatics. We discuss the data mining techniques of exploratory data analysis, clustering, linear models, kernel ridge regression, tree-based regression, and recommendation engines. We present these techniques in the context of several materials application areas, including compound prediction, Li-ion battery design, piezoelectric materials, photocatalysts, and thermoelectric materials. Finally, we demonstrate how new data and tools are making it easier and more accessible than ever to perform data mining through a new analysis that learns trends in the valence and conduction band character of compounds in the Materials Project database using data on over 2500 compounds.
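As an illustration of one technique the abstract names, kernel ridge regression, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian (RBF) kernel and solves the regularized system (K + lambda*I) alpha = y directly. The function names, the `gamma` and `lam` parameters, and the toy numbers are all hypothetical and are not taken from the paper or from the Materials Project database.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] without mutating the inputs.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def krr_fit(X, y, gamma=1.0, lam=1e-3):
    """Return dual coefficients alpha solving (K + lam * I) alpha = y."""
    n = len(X)
    K = [
        [rbf_kernel(X[i], X[j], gamma) + (lam if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return solve_linear_system(K, y)


def krr_predict(X_train, alpha, x_new, gamma=1.0):
    """Predict as a kernel-weighted sum over the training points."""
    return sum(a * rbf_kernel(xi, x_new, gamma) for a, xi in zip(alpha, X_train))


# Made-up one-dimensional toy data (a hypothetical descriptor and target).
X_toy = [[0.0], [1.0], [2.0], [3.0]]
y_toy = [0.0, 1.0, 4.0, 9.0]
alpha_toy = krr_fit(X_toy, y_toy, gamma=1.0, lam=1e-6)
prediction = krr_predict(X_toy, alpha_toy, [1.5], gamma=1.0)
```

With a very small `lam` the model nearly interpolates the training targets. Larger values trade training accuracy for smoother predictions. In practice a library implementation would normally be preferred over this hand-rolled solver.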
Pages: 977-994
Page count: 18