Missing genotype imputation in non-model species using self-organizing maps

Cited by: 1
Authors
Mora-Marquez, Fernando [1]
Nuno, Juan Carlos [2]
Soto, Alvaro [1]
de Heredia, Unai Lopez [1]
Affiliations
[1] Univ Politecn Madrid, Dept Sistemas & Recursos Nat, GI Especies Lenosas WooSp, ETSI Montes Forestal & Medio Nat, Jose Antonio Novais 10,Ciudad Univ, Madrid 28040, Spain
[2] Univ Politecn Madrid, Dept Matemat Aplicada, GI Especies Lenosas WooSp, ETSI Montes Forestal & Medio Nat, Ciudad Univ, Madrid, Spain
Keywords
imputation; machine learning; missing data; SNP genotyping; SOM; ASSOCIATION; ALGORITHM; INFERENCE; RADSEQ; PCA;
DOI
10.1111/1755-0998.13992
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010; 081704;
Abstract
Current methods for genome-wide single-nucleotide polymorphism (SNP) genotyping produce large amounts of missing data that may affect statistical inference and bias the outcome of experiments. Genotype imputation is routinely used in well-studied species to buffer the impact on downstream analyses, and several algorithms are available to fill in missing genotypes. However, the lack of reference haplotype panels precludes the use of these methods in genomic studies of non-model organisms. As an alternative, machine learning algorithms can be employed to explore the genotype data and estimate the missing genotypes. Here, we propose an imputation method based on self-organizing maps (SOM), a widely used type of neural network formed by spatially distributed neurons that map similar inputs to nearby neurons. The method explores genotype datasets to select the SNP loci from which binary vectors are built, and initializes and trains neural networks for each queried missing SNP genotype. The SOM-derived clustering is then used to impute the best genotype. To automate the imputation process, we have implemented gtImputation, an open-source application written in Python 3 with a user-friendly GUI. The performance of the method was validated by comparing its accuracy, precision and sensitivity with those of other available imputation algorithms on several benchmark genotype datasets. Our approach produced highly accurate and precise genotype imputations, even for SNPs with low-frequency alleles, and outperformed other algorithms, especially for datasets from mixed populations with unrelated individuals.
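The workflow summarized in the abstract (select loci, build binary vectors, train a SOM per query genotype, impute from the resulting clustering) can be illustrated with a minimal sketch. This is not the gtImputation implementation. The function names (`encode`, `train_som`, `impute_one`), the 0/1/2 genotype coding with -1 for missing, the one-hot encoding, the rule of using only fully genotyped loci as predictors, the 3 x 3 grid, the decay schedules, and the majority vote within the best-matching neuron are all assumptions made for illustration.

```python
import numpy as np

def encode(genotypes):
    """One-hot encode genotypes coded 0/1/2 into binary vectors (assumed coding)."""
    g = np.asarray(genotypes)
    return np.eye(3)[g].reshape(g.shape[0], -1)

def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns one weight vector per neuron."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows * cols, data.shape[1]))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1 - frac)                  # linearly decaying learning rate
            sigma = sigma0 * (1 - frac) + 0.1      # shrinking neighbourhood radius
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)         # grid distance to BMU
            h = np.exp(-d2 / (2 * sigma ** 2))                 # Gaussian neighbourhood
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def impute_one(genos, sample, locus, missing=-1):
    """Impute genos[sample, locus] by majority vote in the closest occupied neuron."""
    genos = np.asarray(genos)
    # Assumed selection rule: predictor loci are those with no missing data.
    complete = (genos != missing).all(axis=0)
    complete[locus] = False
    known = genos[:, locus] != missing             # samples usable for training
    if not known.any():
        raise ValueError("no sample has a known genotype at the query locus")
    X = encode(genos[:, complete])
    weights = train_som(X[known])
    labels = genos[known, locus]
    bmus = np.array([np.argmin(((weights - x) ** 2).sum(axis=1)) for x in X[known]])
    # Visit neurons from nearest to farthest; use the first one holding training samples.
    for neuron in np.argsort(((weights - X[sample]) ** 2).sum(axis=1)):
        votes = labels[bmus == neuron]
        if votes.size:
            return int(np.bincount(votes, minlength=3).argmax())

# Toy data: two artificial populations, one missing genotype to impute.
demo = np.array([[0] * 6] * 10 + [[2] * 6] * 10)
demo[0, 5] = -1
print(impute_one(demo, sample=0, locus=5))
```

Training a separate map per query genotype, as the abstract describes, keeps each network focused on the loci informative for that query, at the cost of more training runs than a single shared map.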
Pages: 16