Prediction of the taxonomical classification of the Ranunculaceae family using a machine learning method

Times cited: 4
Authors
Chen, Jiao [1]
Yang, Wenlu [2]
Tan, Guodong [1]
Tian, Chunyao [1]
Wang, Hongjun [2]
Zhou, Jiayu [1]
Liao, Hai [1]
Affiliations
[1] Southwest Jiaotong Univ, Sch Life Sci & Engn, Chengdu 610031, Sichuan, Peoples R China
[2] Southwest Jiaotong Univ, Inst Artificial Intelligence, Chengdu 610031, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
EVOLUTION; PLANTS
DOI
10.1039/d1nj03632g
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Ranunculaceae is a botanical source of various pharmaceutically active compounds and has been commonly used in traditional Chinese medicine. Increasing interest in Ranunculaceae pharmaceutical resources has led to taxonomical study of this family, which might provide new insight into its diversification, relationships and phylogenetic position, and further help to find new medicinal resources and promising compounds. In this study, we used machine learning methods to explore the classification of the medicinal Ranunculaceae family. A total of 204 species representing 17 genera of the Ranunculaceae family were collected from the TCMID, together with their 1280 active compounds, which were encoded as structure-based fingerprints. After the construction of species-compound and genus-compound matrices, CNNs and Ext fingerprints were determined to be the best machine learning method and fingerprint type, respectively, using ACC and F-score as clustering criteria. We found that taxonomical classification within the Ranunculaceae family could be accurately predicted, especially at the genus level, with a top ACC of 0.86 and an F-score of 0.85. The top compound features that were important for the classification of the 17 genera were also identified, and some genera with high medicinal value were thereby associated with characteristic cis and/or trans features. As far as we know, this is the first time that some genera have been found to be associated with the structural features of compounds.
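The workflow summarized in the abstract (a species-compound matrix built from fingerprint bits, then scoring predictions with ACC and F-score) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the OR-based pooling of compound bits into a species row, and the macro-averaged F-score are all assumptions made for illustration, since the abstract does not state how rows were aggregated or how the F-score was averaged.

```python
def build_matrix(species_to_compounds, compound_to_bits, n_bits):
    """Build a binary species-by-bit matrix (hypothetical helper).

    Assumption: a species row has bit i set if ANY of its compounds
    has fingerprint bit i set. The abstract does not specify this.
    """
    names = sorted(species_to_compounds)
    rows = []
    for name in names:
        row = [0] * n_bits
        for compound in species_to_compounds[name]:
            for bit in compound_to_bits[compound]:
                row[bit] = 1
        rows.append(row)
    return names, rows


def accuracy(y_true, y_pred):
    """ACC: fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f_score(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy, made-up data: two species, two compounds, a 4-bit fingerprint.
    names, rows = build_matrix(
        {"species_1": ["c1", "c2"], "species_2": ["c2"]},
        {"c1": [0, 2], "c2": [1]},
        n_bits=4,
    )
    print(names, rows)

    # Toy genus labels (true vs. predicted) for four species.
    y_true = ["GenusA", "GenusA", "GenusB", "GenusB"]
    y_pred = ["GenusA", "GenusB", "GenusB", "GenusB"]
    print(accuracy(y_true, y_pred), macro_f_score(y_true, y_pred))
```

In practice the rows of such a matrix would be passed to a classifier (the abstract reports CNNs as the best performer), and the two scores would be computed on held-out species.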
Pages: 5150-5161
Number of pages: 12