Hypergraph factorization for multi-tissue gene expression imputation

Cited by: 0
Authors
Ramon Viñas
Chaitanya K. Joshi
Dobrik Georgiev
Phillip Lin
Bianca Dumitrascu
Eric R. Gamazon
Pietro Liò
Affiliations
[1] University of Cambridge, Department of Computer Science and Technology
[2] Vanderbilt University Medical Center, Division of Genetic Medicine
[3] Columbia University, Department of Statistics and Irving Institute for Cancer Dynamics
[4] University of Cambridge, Vanderbilt Genetics Institute and Data Science Institute, MRC Epidemiology Unit
Source
Nature Machine Intelligence | 2023 / Vol. 5
DOI
Not available
Abstract
Integrating gene expression across tissues and cell types is crucial for understanding the coordinated biological mechanisms that drive disease and characterize homoeostasis. However, traditional multi-tissue integration methods either cannot handle uncollected tissues or rely on genotype information, which is often unavailable and subject to privacy concerns. Here we present HYFA (hypergraph factorization), a parameter-efficient graph representation learning approach for joint imputation of multi-tissue and cell-type gene expression. HYFA is genotype agnostic, supports a variable number of collected tissues per individual, and imposes strong inductive biases to leverage the shared regulatory architecture of tissues and genes. In performance comparison on Genotype–Tissue Expression project data, HYFA achieves superior performance over existing methods, especially when multiple reference tissues are available. The HYFA-imputed dataset can be used to identify replicable regulatory genetic variations (expression quantitative trait loci), with substantial gains over the original incomplete dataset. HYFA can accelerate the effective and scalable integration of tissue and cell-type transcriptome biorepositories.
Pages: 739-753
Page count: 14