Missing genotype imputation in non-model species using self-organizing maps

Times cited: 1
Authors
Mora-Marquez, Fernando [1]
Nuno, Juan Carlos [2]
Soto, Alvaro [1]
de Heredia, Unai Lopez [1]
Affiliations
[1] Univ Politecn Madrid, Dept Sistemas & Recursos Nat, GI Especies Lenosas WooSp, ETSI Montes Forestal & Medio Nat, Jose Antonio Novais 10,Ciudad Univ, Madrid 28040, Spain
[2] Univ Politecn Madrid, Dept Matemat Aplicada, GI Especies Lenosas WooSp, ETSI Montes Forestal & Medio Nat, Ciudad Univ, Madrid, Spain
Keywords
imputation; machine learning; missing data; SNP genotyping; SOM; ASSOCIATION; ALGORITHM; INFERENCE; RADSEQ; PCA
DOI
10.1111/1755-0998.13992
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification codes
071010; 081704
Abstract
Current methodologies for genome-wide single-nucleotide polymorphism (SNP) genotyping produce large amounts of missing data that may affect statistical inference and bias the outcome of experiments. Genotype imputation is routinely used in well-studied species to buffer the impact of missing data on downstream analyses, and several algorithms are available to fill in missing genotypes. However, most of these algorithms rely on reference haplotype panels, and the lack of such panels precludes their use in genomic studies on non-model organisms. As an alternative, machine learning algorithms can be used to explore the genotype data and estimate the missing genotypes. Here, we propose an imputation method based on self-organizing maps (SOM), a widely used type of neural network formed by spatially distributed neurons that map similar inputs to nearby neurons. The method explores the genotype dataset to select SNP loci, builds binary vectors from the genotypes at those loci, and initializes and trains neural networks for each queried missing SNP genotype. The SOM-derived clustering is then used to impute the best genotype. To automate the imputation process, we have implemented gtImputation, an open-source application written in Python 3 with a user-friendly GUI. The method's performance was validated by comparing its accuracy, precision and sensitivity with those of other available imputation algorithms on several benchmark genotype datasets. Our approach produced highly accurate and precise genotype imputations, even for SNPs with low-frequency alleles, and outperformed other algorithms, especially for datasets from mixed populations with unrelated individuals.
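The workflow outlined in the abstract (select loci, build binary vectors, train a SOM per query, impute from the resulting clustering) can be illustrated with a minimal NumPy sketch. This is not the gtImputation code and does not reproduce the authors' algorithm. Every specific choice here is an assumption made for illustration: genotypes coded as 0/1/2 alternate-allele counts with -1 for missing, the covariance-based locus ranking, the one-hot encoding, the 3 x 3 grid, the decay schedules, and the majority vote within the best-matching neuron. The function names `one_hot`, `train_som`, `best_unit` and `impute_one` are hypothetical.

```python
import numpy as np

MISSING = -1  # assumed missing-genotype code; 0/1/2 = alternate-allele count


def one_hot(genos):
    """Encode a (samples x loci) 0/1/2 matrix as binary vectors, 3 bits per locus."""
    return np.eye(3)[genos].reshape(genos.shape[0], -1)


def best_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x."""
    return int(np.argmin(((weights - x) ** 2).sum(axis=1)))


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns (weights, grid coordinates)."""
    rng = np.random.default_rng(seed)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise each neuron with a randomly drawn training vector.
    weights = data[rng.integers(0, len(data), size=rows * cols)].astype(float)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)               # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.1   # shrinking neighbourhood radius
            bmu = best_unit(weights, x)
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood on the grid
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights, grid


def impute_one(G, sample, locus, n_loci=4):
    """Impute G[sample, locus] from samples whose genotype at `locus` is known."""
    known = G[:, locus] != MISSING
    target = G[known, locus]
    # Candidate predictor loci: complete in the query and in all training samples.
    cand = [j for j in range(G.shape[1])
            if j != locus and G[sample, j] != MISSING and np.all(G[known, j] != MISSING)]
    # Hypothetical ranking: absolute covariance with the target locus.
    score = [abs(np.mean((G[known, j] - G[known, j].mean()) * (target - target.mean())))
             for j in cand]
    chosen = [cand[i] for i in np.argsort(score)[::-1][:n_loci]]
    train = one_hot(G[known][:, chosen])
    query = one_hot(G[[sample]][:, chosen])[0]
    weights, grid = train_som(train)
    bmus = np.array([best_unit(weights, x) for x in train])
    q_bmu = best_unit(weights, query)
    # Use the occupied neuron closest (on the grid) to the query's best-matching unit.
    occupied = np.unique(bmus)
    nearest = occupied[np.argmin(((grid[occupied] - grid[q_bmu]) ** 2).sum(axis=1))]
    votes = np.bincount(target[bmus == nearest], minlength=3)
    return int(np.argmax(votes))  # majority genotype among co-clustered samples


# Toy usage: two groups of samples with perfectly linked loci, one masked genotype.
G = np.zeros((20, 6), dtype=int)
G[10:] = 2
G[0, 0] = MISSING
print(impute_one(G, sample=0, locus=0))
```

Training a separate small map for each query keeps the sketch close to the per-genotype training described above, at the cost of repeated training. A practical tool would also need to handle predictor loci with missing values and ties in the vote, which this sketch ignores.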
Pages: 16