omicsGAT: Graph Attention Network for Cancer Subtype Analyses

Cited by: 9
Authors
Baul, Sudipto [1, 2]
Ahmed, Khandakar Tanvir [1, 2]
Filipek, Joseph [1, 2]
Zhang, Wei [1, 2]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
[2] Univ Cent Florida, Genom & Bioinformat Cluster, Orlando, FL 32816 USA
Keywords
graph attention network; single-cell RNA-seq; patient stratification; cancer outcome prediction; RNA-SEQ; BREAST
DOI
10.3390/ijms231810220
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
High-throughput omics technologies are increasingly used across biomedical science. mRNA sequencing (RNA-seq) reports quantitative measurements for tens of thousands of biological features and so provides a more comprehensive molecular view of cancer mechanisms than traditional approaches. Graph-based learning models have been proposed to learn hidden representations from gene expression data and network structure in order to improve cancer outcome prediction, patient stratification, and cell clustering. However, these graph-based methods cannot rank the importance of a sample's different neighbors in downstream cancer subtype analyses. In this study, we introduce omicsGAT, a graph attention network (GAT) model that integrates graph-based learning with an attention mechanism for RNA-seq data analysis. The multi-head attention mechanism in omicsGAT captures information about a particular sample more effectively by assigning different attention coefficients to its neighbors. Comprehensive experiments on The Cancer Genome Atlas (TCGA) breast cancer and bladder cancer bulk RNA-seq data and on two single-cell RNA-seq datasets validate that (1) the proposed model can effectively integrate the neighborhood information of a sample and learn an embedding vector that improves disease phenotype prediction, cancer patient stratification, and cell clustering, and (2) the attention matrix generated from the multi-head attention coefficients provides more useful information than the sample correlation-based adjacency matrix. Based on the attention coefficients, we conclude that some neighbors play a more important role than others in the cancer subtype analyses of a particular sample.
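The attention idea in the abstract can be illustrated with a minimal NumPy sketch of a generic multi-head graph attention layer. This is not the omicsGAT implementation: the function names (`gat_head`, `multi_head_gat`), the random weights, the toy expression matrix, and the correlation threshold used to build the adjacency matrix are all assumptions made for illustration. It only shows how each sample's neighbors receive different, normalized attention coefficients and how the per-head coefficients can be averaged into an attention matrix.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_head(X, A, W, a):
    """One generic GAT head (illustrative, not the omicsGAT code).

    X: (n, d) sample features, A: (n, n) adjacency with self-loops,
    W: (d, k) projection, a: (2k,) attention vector.
    """
    H = X @ W                                   # projected features, (n, k)
    k = H.shape[1]
    # e_ij = LeakyReLU(a^T [h_i || h_j]), split into a source and a target term
    e = leaky_relu((H @ a[:k])[:, None] + (H @ a[k:])[None, :])
    e = np.where(A > 0, e, -np.inf)             # keep only graph neighbors
    e = e - e.max(axis=1, keepdims=True)        # stabilise the softmax
    w = np.exp(e)
    alpha = w / w.sum(axis=1, keepdims=True)    # attention coefficients per row
    return alpha @ H, alpha

def multi_head_gat(X, A, heads):
    """Concatenate head embeddings and average the head attention matrices."""
    outs, alphas = zip(*(gat_head(X, A, W, a) for W, a in heads))
    return np.concatenate(outs, axis=1), np.mean(alphas, axis=0)

# Toy data: 6 samples x 20 genes (random values, purely hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 20))

# Assumed adjacency: connect samples whose expression correlation is positive.
A = (np.corrcoef(X) > 0.0).astype(float)
np.fill_diagonal(A, 1.0)                        # self-loops

heads = [(rng.normal(size=(20, 4)), rng.normal(size=8)) for _ in range(3)]
embedding, attention = multi_head_gat(X, A, heads)
```

Here `embedding` holds one learned vector per sample and `attention` plays the role of the attention matrix that the abstract contrasts with the correlation-based adjacency matrix: unlike `A`, its nonzero entries differ from neighbor to neighbor.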
Pages: 16