Low-Dimensional Genotype Embeddings for Predictive Models

Authors
Sultan, Syed Fahad [1]
Guo, Xingzhi [2]
Skiena, Steven [2]
Affiliations
[1] Furman Univ, Greenville, SC 29613 USA
[2] SUNY Stony Brook, Stony Brook, NY USA
Source
13TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND HEALTH INFORMATICS, BCB 2022 | 2022
Keywords
genotype; embeddings; privacy-preserving
DOI
10.1145/3535508.3545507
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We develop methods for constructing low-dimensional vector representations (embeddings) of large-scale genotyping data, capable of reducing genotypes of hundreds of thousands of SNPs to 100-dimensional embeddings that retain substantial predictive power for inferring medical phenotypes. We demonstrate that embedding-based models yield an average F-score of 0.605 on a test of ten phenotypes (including BMI prediction, genetic relatedness, and depression) versus 0.339 for baseline models. Genotype embeddings also hold promise for sharing data while preserving subject anonymity: we show that they retain substantial predictive power even after anonymization by adding Gaussian noise to each dimension.
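The anonymization step mentioned in the abstract, adding Gaussian noise to each embedding dimension, might look roughly like the sketch below. This is only an illustration under stated assumptions, not the authors' implementation: the function name `add_gaussian_noise`, the noise scale `sigma=0.1`, and the randomly generated toy embeddings are all hypothetical, and the abstract does not say what noise level was used or how the embeddings themselves were built.

```python
import random


def add_gaussian_noise(embeddings, sigma, seed=None):
    """Return a noisy copy of `embeddings`.

    Independent zero-mean Gaussian noise with standard deviation `sigma`
    is added to every coordinate of every vector. `sigma` is an assumed
    tuning parameter; the abstract does not specify its value.
    """
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, sigma) for x in vec] for vec in embeddings]


# Toy stand-in for real genotype embeddings: 5 subjects x 100 dimensions.
# Real embeddings would come from a trained model, not a random generator.
toy_rng = random.Random(0)
embeddings = [[toy_rng.uniform(-1.0, 1.0) for _ in range(100)] for _ in range(5)]

# Hypothetical noise level chosen only for illustration.
noisy = add_gaussian_noise(embeddings, sigma=0.1, seed=1)
```

A larger `sigma` hides more of each subject's original vector but would be expected to cost more predictive power, so in practice the noise level would need to be chosen by checking downstream model performance on the noisy embeddings.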
Pages: 4