The Prediction of Recombination Hotspot Based on Automated Machine Learning

Cited by: 1
Authors
Ye, Dong-Xin [1]
Yu, Jun-Wen [1]
Li, Rui [1]
Hao, Yu-Duo [1]
Wang, Tian-Yu [1]
Yang, Hui [2]
Ding, Hui [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Life Sci & Technol, Chengdu 610054, Peoples R China
[2] Univ Elect Sci & Technol China, Yangtze Delta Reg Inst Quzhou, Quzhou, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
recombination hotspot; sequence property; automated machine learning; interpretability; DINUCLEOTIDE COMPOSITION; DNA-SEQUENCE; PROMOTERS; FEATURES; SPOTS; SKEW;
DOI
10.1016/j.jmb.2024.168653
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Meiotic recombination plays a pivotal role in genetic evolution. Genetic variation induced by recombination is a crucial factor in generating biodiversity and a driving force for evolution. At present, the development of recombination hotspot prediction methods faces challenges related to insufficient feature extraction and limited generalization capability. This paper focuses on recombination hotspot prediction methods. We explored deep learning-based recombination hotspot prediction and examined the shortcomings of prevalent models on this task. To address these deficiencies, an automated machine learning approach was used to construct a recombination hotspot prediction model. The model combines sequence information with physicochemical properties by employing TF-IDF-Kmer and DNA composition components to obtain more effective feature data. Experimental results validate the effectiveness of the feature extraction method and the automated machine learning technology used in this study. The final model was validated on three distinct datasets and yielded accuracy rates of 97.14%, 79.71%, and 98.73%, surpassing the current leading models by 2%, 2.56%, and 4%, respectively. In addition, we used tools such as SHAP and AutoGluon to analyze the interpretability of black-box models, examined the impact of individual features on the results, and investigated the reasons behind the misclassification of samples. Finally, an application for recombination hotspot prediction was established to give researchers easy access to the necessary information and tools. The research outcomes of this paper underscore the enormous potential of automated machine learning methods in gene sequence prediction. © 2024 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
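The abstract names TF-IDF-Kmer as one of the feature extraction components but does not define it. As a rough illustration only, the sketch below applies a textbook TF-IDF weighting to overlapping k-mers, treating each DNA sequence as a document and each k-mer as a term. The function names `kmers` and `tfidf_kmer`, the choice k=2, and the smoothed IDF formula are all assumptions made here for illustration; the paper's actual formulation may differ, and the DNA composition features and the automated machine learning step are not shown.

```python
import math
from collections import Counter
from itertools import product


def kmers(seq, k):
    """Return the overlapping k-mers of a DNA sequence, upper-cased."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf_kmer(seqs, k=2):
    """Textbook TF-IDF over k-mers (hypothetical helper, not the paper's code).

    Returns (vocabulary, rows), where rows[i][j] is the TF-IDF weight of
    vocabulary[j] in seqs[i].
    """
    # Fixed vocabulary of all 4**k k-mers over A, C, G, T.
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = [Counter(kmers(s, k)) for s in seqs]
    n = len(seqs)
    # Document frequency: number of sequences containing each k-mer.
    df = {w: sum(1 for c in counts if c[w] > 0) for w in vocab}
    # Smoothed IDF (an assumption) so unseen k-mers do not divide by zero.
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    rows = []
    for c in counts:
        # Term frequency is normalized by the number of in-vocabulary k-mers.
        total = sum(c[w] for w in vocab) or 1
        rows.append([c[w] / total * idf[w] for w in vocab])
    return vocab, rows


# Toy sequences, made up for illustration; not data from the paper.
vocab, X = tfidf_kmer(["ACGTACGT", "AAAAAAAA", "GCGCGCGC"], k=2)
print(len(vocab), len(X), len(X[0]))
```

Each row of `X` is a fixed-length numeric vector, so it could be concatenated with other sequence descriptors and passed to a tabular learner. K-mers containing characters other than A, C, G, and T are simply ignored in this sketch.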
Pages: 16