Predicting the maximum absorption wavelength of azo dyes using an interpretable machine learning strategy

Times cited: 26
Authors
Mai, Jiaqi [1]
Lu, Tian [2]
Xu, Pengcheng [2]
Lian, Zhengheng [2]
Li, Minjie [1]
Lu, Wencong [1, 2, 3]
Affiliations
[1] Shanghai Univ, Coll Sci, Dept Chem, Shanghai 200444, Peoples R China
[2] Shanghai Univ, Mat Genome Inst, Shanghai 200444, Peoples R China
[3] Zhejiang Lab, Hangzhou 311100, Peoples R China
Keywords
Azo dyes; Machine learning; Maximum absorption wavelength; SHAP; HOLOGRAPHIC DISPLAY; ORGANIC-MOLECULES; TD-DFT; DEFINITION; SPECTRA
DOI
10.1016/j.dyepig.2022.110647
Chinese Library Classification (CLC) number
O69 [Applied Chemistry]
Discipline classification code
081704
Abstract
The maximum absorption wavelength (λmax) is one of the most important properties of azo dyes, and obtaining it quickly is essential for developing new molecules. Herein, the machine learning algorithm XGBoost was used to establish a model for predicting the λmax of azo dyes. The coefficients of determination (R²) for leave-one-out cross-validation (LOOCV) and for the test set were 0.87 and 0.73, respectively. According to SHapley Additive exPlanations (SHAP) analysis, the number of sulfur atoms in the R2 group has a strong positive correlation with λmax, and the more C-N pairs at topological distance 4 appear in the R1 group, the more likely the molecule's λmax is to be red-shifted. Further, a high-throughput screening strategy was adopted to screen out 26 azo molecules with larger λmax from nearly 20,000 virtual samples. The λmax values of these molecules are expected to be red-shifted relative to the 610 nm value in the dataset. Our study provides a convenient way to search for azo dyes with larger λmax.
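The LOOCV R² quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: a one-variable least-squares line stands in for XGBoost so that the example needs only the Python standard library, and the descriptor values and wavelengths are made-up toy numbers. The function names (fit_line, loocv_predictions, r_squared) are hypothetical. Each sample is predicted by a model trained on all the other samples, and R² is then computed once over the pooled predictions.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in model, NOT XGBoost)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def loocv_predictions(xs, ys):
    """Predict each sample from a model trained on all the other samples."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        preds.append(a * xs[i] + b)
    return preds


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy, made-up data: one hypothetical descriptor vs. lambda_max in nm.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
lambda_max = [420.0, 447.0, 455.0, 490.0, 502.0, 531.0]

preds = loocv_predictions(descriptor, lambda_max)
print(round(r_squared(lambda_max, preds), 3))
```

Because every prediction comes from a model that never saw the sample being predicted, this R² estimates out-of-sample performance, which is why it can differ from the R² on a separate test set.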
Pages: 9