BLASTNet: A call for community-involved big data in combustion machine learning

Cited by: 13
Authors
Chung, Wai Tong [1]
Jung, Ki Sung [2]
Chen, Jacqueline H. [2]
Ihme, Matthias [1,3]
Affiliations
[1] Stanford Univ, Dept Mech Engn, Stanford, CA 94305 USA
[2] Sandia Natl Labs, Combust Res Facil, Livermore, CA 94550 USA
[3] SLAC Natl Accelerator Lab, Dept Photon Sci, Menlo Pk, CA 94025 USA
Source
APPLICATIONS IN ENERGY AND COMBUSTION SCIENCE, 2022, Vol. 12
Keywords
Big data; Deep learning; Direct numerical simulation; BLASTNet; CHARACTERISTIC BOUNDARY-CONDITIONS; SIMULATIONS; NETWORKS; NOISE;
DOI
10.1016/j.jaecs.2022.100087
Chinese Library Classification
O414.1 [Thermodynamics];
Abstract
Many state-of-the-art machine learning (ML) fields rely on large datasets and massive deep learning models (with O(10^9) trainable parameters) to predict target variables accurately without overfitting. Within combustion, a wealth of data exists in the form of high-fidelity simulation data and detailed measurements that have accumulated over the past decade. Yet, this data remains distributed and can be difficult to access. In this work, we present a realistic and feasible framework that combines (i) community involvement, (ii) public data repositories, and (iii) lossy compression algorithms to enable broad access to high-fidelity data via a network-of-datasets approach. This Bearable Large Accessible Scientific Training Network-of-Datasets (BLASTNet) is consolidated on a community-hosted web platform (at https://blastnet.github.io/) and is targeted towards improving accessibility to diverse scientific data for deep learning algorithms. For datasets that exceed the storage limits of public ML repositories, we propose employing lossy compression algorithms on high-fidelity data, at the cost of introducing controllable amounts of error. This framework leverages the well-known robustness of modern deep learning methods to noisy data, which we demonstrate also applies in combustion by training deep learning models on lossy direct numerical simulation (DNS) data in two distinct ML problems: one in combustion regime classification and the other in filtered reaction rate regression. Our results show that combustion DNS data can be compressed at least 10-fold without degrading deep learning models, and that the resulting lossy errors can even improve their training. We thus call on the research community to help open a bearable pathway towards accessible big data in combustion.
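The lossy-compression trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' actual pipeline (which relies on dedicated lossy compressors); here, simple precision truncation of a synthetic DNS-like scalar field stands in for a lossy compression algorithm, showing how a fixed storage reduction comes with a small, measurable reconstruction error. The field statistics and array sizes are illustrative assumptions.

```python
import numpy as np

# Synthetic stand-in for a DNS field (e.g., a 2-D temperature slice in K).
rng = np.random.default_rng(0)
field = rng.normal(loc=1500.0, scale=300.0, size=(128, 128)).astype(np.float64)

# Lossy "compression" via precision truncation: float64 -> float16.
compressed = field.astype(np.float16)
restored = compressed.astype(np.float64)

# Storage reduction and reconstruction error, normalized by the field's peak value.
ratio = field.nbytes / compressed.nbytes
rel_err = np.max(np.abs(restored - field)) / np.max(np.abs(field))

print(f"compression ratio: {ratio:.0f}x")      # 4x for float64 -> float16
print(f"max relative error: {rel_err:.2e}")    # bounded by float16 precision (~5e-4)
```

Dedicated floating-point compressors achieve much higher ratios at a prescribed error tolerance, but the principle is the same: a known, bounded perturbation of the data in exchange for storage that fits public repository limits.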
Pages: 15