Validation of machine learning approach for direct mutation rate estimation

Times cited: 3
Authors
Burda, Katarzyna [1]
Konczal, Mateusz [1, 2]
Affiliations
[1] Adam Mickiewicz Univ, Fac Biol, Evolutionary Biol Grp, Poznan, Poland
[2] Adam Mickiewicz Univ, Fac Biol, Evolutionary Biol Grp, PL-60614 Poznan, Poland
Keywords
guppy; machine learning; mutation rate; teleost; whole-genome sequencing; DE-NOVO MUTATIONS; GERMLINE MUTATION; POPULATION HISTORY; METABOLIC-RATE; EVOLUTION; SELECTION; DYNAMICS; GENETICS; INSIGHTS; FORMAT;
DOI
10.1111/1755-0998.13841
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Mutations are the primary source of all genetic variation. Knowledge about their rates is critical for any evolutionary genetic analysis, but for a long time that knowledge remained elusive and was inferred only indirectly. In recent years, parent-offspring comparisons have yielded the first direct mutation rate estimates. The analyses are, however, challenging due to a high rate of false positives and the lack of consensus on standardized filtering of candidate de novo mutations. Here, we validate the application of a machine learning approach for such a task and estimate the mutation rate for the guppy (Poecilia reticulata), a model species in eco-evolutionary studies. We sequenced 4 parents and 20 offspring and screened their genomes for de novo mutations. The initial large number of candidate de novo mutations was hard-filtered to remove false-positive results. These results were compared with the mutation rate estimated with a supervised machine learning approach. Both approaches were followed by molecular validation of all candidate de novo mutations and yielded similar results. The ML method uniquely identified three mutations, but overall required more hands-on curation and had higher rates of false positives and false negatives. Both methods concordantly showed no difference in mutation rates between families. The guppy mutation rate estimated here is among the lowest directly estimated mutation rates in vertebrates; however, previous research has also found low estimated rates in other teleost fishes. We discuss potential explanations for such a pattern, as well as the future utility and limitations of machine learning approaches.
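The parent-offspring comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the genotype coding, the candidate rule (both parents homozygous reference, offspring heterozygous), the names `TrioSite`, `is_candidate_dnm` and `mutation_rate`, and all numbers are hypothetical and chosen only for illustration. The rate uses the commonly applied form: the number of de novo mutations divided by twice the number of callable sites times the number of offspring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrioSite:
    """Genotypes at one site, coded as the count of non-reference alleles (0, 1 or 2)."""
    mother: int
    father: int
    offspring: int


def is_candidate_dnm(site: TrioSite) -> bool:
    # Assumed candidate rule: both parents homozygous reference,
    # offspring heterozygous. Real pipelines add depth, quality and
    # allele-balance filters on top of this.
    return site.mother == 0 and site.father == 0 and site.offspring == 1


def mutation_rate(n_mutations: int, callable_sites: int, n_offspring: int) -> float:
    # Per-site, per-generation rate. The factor 2 accounts for the two
    # haploid genomes each diploid offspring inherits.
    if callable_sites <= 0 or n_offspring <= 0:
        raise ValueError("callable_sites and n_offspring must be positive")
    return n_mutations / (2 * callable_sites * n_offspring)


# Toy data for a single offspring (hypothetical, not from the study).
sites = [
    TrioSite(mother=0, father=0, offspring=1),  # candidate
    TrioSite(mother=0, father=1, offspring=1),  # inherited from father
    TrioSite(mother=0, father=0, offspring=0),  # no variant
    TrioSite(mother=0, father=0, offspring=1),  # candidate
]
n_candidates = sum(is_candidate_dnm(s) for s in sites)
rate = mutation_rate(n_candidates, callable_sites=1_000_000, n_offspring=1)
print(n_candidates, rate)
```

In practice the candidate count would be taken after filtering and molecular validation, and the callable-site count would reflect only positions where a mutation could have been detected.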
Pages: 1757-1771
Page count: 15
Related papers
50 items in total
  • [21] Efficient hyperparameter-tuned machine learning approach for estimation of supercapacitor performance attributes
    Ahmed, Syed Ishtiyaq
    Radhakrishnan, Sreevatsan
    Nair, Binoy B.
    Thiruvengadathan, Rajagopalan
    JOURNAL OF PHYSICS COMMUNICATIONS, 2021, 5 (11):
  • [22] Machine-Learning Approach to Non-Destructive Biomass and Relative Growth Rate Estimation in Aeroponic Cultivation
    Astrom, Oskar
    Hedlund, Henrik
    Sopasakis, Alexandros
    AGRICULTURE-BASEL, 2023, 13 (04):
  • [23] Machine Learning and Data Fusion Approach for Elastic Rock Properties Estimation and Fracturability Evaluation
    Gong, Yiwen
    El-Monier, Ilham
    Mehana, Mohamed
    ENERGY AND AI, 2024, 16
  • [24] Revisiting the Risk Factors for Endometriosis: A Machine Learning Approach
    Blass, Ido
    Sahar, Tali
    Shraibman, Adi
    Ofer, Dan
    Rappoport, Nadav
    Linial, Michal
    JOURNAL OF PERSONALIZED MEDICINE, 2022, 12 (07):
  • [25] A Machine Learning Regression Approach for Throughput Estimation in an IoT Environment
    Hameed, Aroosa
    Violos, John
    Santi, Nina
    Leivadeas, Aris
    Mitton, Nathalie
    IEEE CONGRESS ON CYBERMATICS / 2021 IEEE INTERNATIONAL CONFERENCES ON INTERNET OF THINGS (ITHINGS) / IEEE GREEN COMPUTING AND COMMUNICATIONS (GREENCOM) / IEEE CYBER, PHYSICAL AND SOCIAL COMPUTING (CPSCOM) / IEEE SMART DATA (SMARTDATA), 2021, : 29 - 36
  • [26] Ten Year Cardiovascular Risk Estimation: A Machine Learning Approach
    Babic, Dejan
    Filipovic, Luka
    Tinaj, Sandra
    Katnic, Ivana
    Cakic, Stevan
    MEDICON 2023 AND CMBEBIH 2023, VOL 1, 2024, 93 : 604 - 612
  • [27] Rate of penetration estimation downhole with machine learning for drilling position control
    Keller, Alexander Mathew
    Feng, Tianheng
    Demirer, Nazli
    Darbe, Robert
    Chen, Dongmei
    GEOENERGY SCIENCE AND ENGINEERING, 2023, 224
  • [28] Wind estimation using quadcopter motion: A machine learning approach
    Allison, Sam
    Bai, He
    Jayaraman, Balaji
    AEROSPACE SCIENCE AND TECHNOLOGY, 2020, 98
  • [29] BioBodyComp: A Machine Learning Approach for Estimation of Percentage Body Fat
    Kirar, Vishnu Pratap Singh
    Burse, Kavita
    Burse, Abhishek
    MACHINE LEARNING, IMAGE PROCESSING, NETWORK SECURITY AND DATA SCIENCES, MIND 2022, PT I, 2022, 1762 : 240 - 251
  • [30] Machine learning based parametric estimation approach for poll prediction
    Koli A.M.
    Ahmed M.
    Recent Advances in Computer Science and Communications, 2021, 14 (04) : 1287 - 1299