Compositional design of multicomponent alloys using reinforcement learning

Cited by: 6
Authors
Xian, Yuehui [1 ]
Dang, Pengfei [1 ]
Tian, Yuan [1 ]
Jiang, Xue [2 ]
Zhou, Yumei [1 ]
Ding, Xiangdong [1 ]
Sun, Jun [1 ]
Lookman, Turab [1 ,2 ,3 ]
Xue, Dezhen [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, State Key Lab Mech Behav Mat, Xian 710049, Peoples R China
[2] Univ Sci & Technol Beijing, Beijing Adv Innovat Ctr Mat Genome Engn, Beijing 100083, Peoples R China
[3] AiMat Res LLC, Santa Fe, NM 87501 USA
Funding
National Natural Science Foundation of China;
Keywords
Compositional design; Reinforcement learning; Multicomponent alloys; Transformational enthalpy; Phase change materials; PHASE-CHANGE MATERIALS; HIGH ENTROPY ALLOYS; TEMPERATURES; STORAGE;
DOI
10.1016/j.actamat.2024.120017
Chinese Library Classification
T [Industrial Technology];
Subject Classification Code
08;
Abstract
The design of alloys has typically involved adaptive experimental synthesis and characterization guided by machine learning models fitted to available data. A bottleneck for sequential design, be it for self-driven or manual synthesis, by Bayesian Global Optimization (BGO) for example, is that the search space becomes intractable as the number of alloy elements and their compositions exceeds a threshold. Here we investigate how reinforcement learning (RL) performs in the compositional design of alloys within a closed loop with manual synthesis and characterization. We demonstrate this strategy by designing a phase change multicomponent alloy (Ti27.2Ni47Hf13.8Zr12) with the highest transformation enthalpy (ΔH) of -37.1 J/g (-39.0 J/g with further calibration) within the TiNi-based family of alloys, selected from a space of over 2 × 10^8 candidates, although the initial training is only on a compact dataset of 112 alloys. We show how the training efficiency is increased by employing acquisition functions containing uncertainties, such as expected improvement (EI), as the reward itself. Existing alloy data are often limited; however, if the agent is pretrained on experimental results prior to the training process, it can access regions of higher reward values more frequently. In addition, the experimental feedback enables the agent to gradually explore new regions with higher rewards, compositionally different from the initial dataset. Our approach applies directly to processing conditions, where the actions would be performed in a given order. We also compare the performance of RL to BGO and the genetic algorithm on several test functions to gain insight into their relative strengths in materials design.
Pages: 9
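As an illustration of the reward construction described in the abstract (an uncertainty-aware acquisition function such as expected improvement serving as the RL reward), the following is a minimal sketch, not the authors' implementation. The Gaussian-process surrogate, the toy (Ti, Ni, Hf, Zr) composition encoding, the synthetic objective standing in for the transformation enthalpy, and the random proposal policy are all assumptions made for demonstration only.

```python
# Minimal sketch (assumptions throughout, not the paper's code): expected
# improvement (EI) from a Gaussian-process surrogate used as the reward an
# RL-style agent receives for a proposed alloy composition.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

# Toy dataset: (Ti, Ni, Hf, Zr) fractions summing to 1, with a synthetic
# objective standing in for the magnitude of the transformation enthalpy.
X_train = rng.dirichlet(np.ones(4), size=30)
y_train = -np.sum((X_train - np.array([0.27, 0.47, 0.14, 0.12])) ** 2, axis=1)

# Surrogate model fitted to the "measured" data.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X_train, y_train)

def expected_improvement(x, best_y, xi=0.01):
    """EI of candidate composition x under the surrogate; used as the reward."""
    mu, sigma = gp.predict(x.reshape(1, -1), return_std=True)
    mu, sigma = mu[0], max(sigma[0], 1e-9)
    z = (mu - best_y - xi) / sigma
    return (mu - best_y - xi) * norm.cdf(z) + sigma * norm.pdf(z)

# Stand-in for the RL agent: here a random proposal policy; a real agent
# would update its policy from the EI rewards it accumulates.
best_y = y_train.max()
for step in range(5):
    candidate = rng.dirichlet(np.ones(4))   # proposed composition (fractions)
    reward = expected_improvement(candidate, best_y)
    print(f"step {step}: x={np.round(candidate, 3)}, EI reward={reward:.4f}")
```

In this construction the reward already balances predicted improvement against model uncertainty, which is why, as the abstract notes, such acquisition-function rewards can increase training efficiency relative to rewarding the raw surrogate prediction alone.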