PlantMine: A Machine-Learning Framework to Detect Core SNPs in Rice Genomics

Times cited: 4
Authors
Tong, Kai [1]
Chen, Xiaojing [2,3]
Yan, Shen [4]
Dai, Liangli [1]
Liao, Yuxue [1]
Li, Zhaoling [1]
Wang, Ting [5,6]
Affiliations
[1] Sichuan Univ Sci & Engn, Sch Biol Engn, Yibin 644000, Peoples R China
[2] Chinese Acad Agr Sci, Agr Informat Inst, Natl Agr Sci Data Ctr, Beijing 100081, Peoples R China
[3] Chinese Acad Agr Sci, Natl Nanfan Res Inst, Sanya 572024, Peoples R China
[4] Chinese Acad Agr Sci, Inst Crop Sci, State Key Lab Crop Gene Resources & Breeding, Beijing 100081, Peoples R China
[5] Chinese Acad Agr Sci, Agr Informat Inst, Beijing 100081, Peoples R China
[6] Minist Agr & Rural Areas, Key Lab Big Agridata, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
feature selection; genomic prediction; machine learning; rice breeding; SNP; ENSEMBLE;
DOI
10.3390/genes15050603
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
As a fundamental global staple crop, rice plays a pivotal role in human nutrition and agricultural production systems. However, its complex genetic architecture and extensive trait variability pose challenges for breeders and researchers in optimizing yield and quality. For breeding methods such as genomic selection in particular, isolating the core SNPs related to target traits from genome-wide data reduces noise from irrelevant mutations and improves computational precision and efficiency. Efficient computational approaches for mining core SNPs are therefore of great importance. This study introduces PlantMine, a computational framework that integrates feature selection and machine learning techniques to identify core SNPs critical for the improvement of rice traits. Using the dataset from the 3000 Rice Genomes Project, we applied different algorithms for analysis. The findings underscore the effectiveness of combining feature selection with machine learning in accurately identifying core SNPs, offering a promising avenue to expedite rice breeding efforts and improve crop productivity and resilience to stress.
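The abstract does not say which algorithms were used, so the following is only a minimal sketch of the general idea of a filter-style feature selection step that could precede a machine learning model. It is not PlantMine's method. The toy data, the genotype coding (0/1/2), the causal SNP positions, and the function names `pearson`, `select_core_snps`, and `make_toy_data` are all assumptions made for illustration.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def select_core_snps(genotypes, trait, k):
    """Rank SNP columns by |correlation| with the trait; return the top-k column indices."""
    n_snps = len(genotypes[0])
    scores = [
        abs(pearson([row[j] for row in genotypes], trait)) for j in range(n_snps)
    ]
    return sorted(range(n_snps), key=lambda j: scores[j], reverse=True)[:k]


def make_toy_data(n_samples=300, n_snps=200, causal=(3, 57, 120), seed=0):
    """Hypothetical data: random 0/1/2 genotypes; trait = sum of causal SNPs + noise."""
    rng = random.Random(seed)
    genotypes = [
        [rng.choice((0, 1, 2)) for _ in range(n_snps)] for _ in range(n_samples)
    ]
    trait = [sum(row[j] for j in causal) + rng.gauss(0, 0.5) for row in genotypes]
    return genotypes, trait


genotypes, trait = make_toy_data()
core = select_core_snps(genotypes, trait, k=10)
print(sorted(core))
```

In a fuller pipeline, the reduced SNP set would then be passed to a predictive model and scored on held-out samples; that step is omitted here because the record does not specify it.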
Pages: 12