Transfer learning with convolutional neural networks for cancer survival prediction using gene-expression data

Cited by: 55

Authors:

Lopez-Garcia, Guillermo [1]

Jerez, Jose M. [1]

Franco, Leonardo [1]

Veredas, Francisco J. [1]

Affiliations:

[1] Univ Malaga, Dept Lenguajes & Ciencias Comp, ETSI Informat, Malaga, Spain

Source:

PLOS ONE | 2020 / Volume 15 / Issue 03

Keywords:

RNA-SEQ; DEEP; GENOME; RECURRENCE; SIGNATURE; KEGG;

DOI:

10.1371/journal.pone.0230536

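Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];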

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Precision medicine in oncology aims to gather data from heterogeneous sources in order to estimate a given patient's state and prognosis precisely. In advancing toward a personalized medicine framework, accurate diagnoses allow the prescription of more effective treatments adapted to the specifics of each individual case. In recent years, next-generation sequencing has driven cancer research by providing physicians with an overwhelming amount of gene-expression data from RNA-seq high-throughput platforms. In this scenario, data mining and machine learning techniques have contributed widely to gene-expression data analysis by supplying computational models that support decision-making on real-world data. Nevertheless, existing public gene-expression databases are characterized by an unfavorable imbalance between the huge number of genes (on the order of tens of thousands) and the small number of samples (on the order of a few hundred) available. Although diverse feature selection and extraction strategies have traditionally been applied to overcome the resulting over-fitting issues, the efficacy of standard machine learning pipelines is far from satisfactory for the prediction of relevant clinical outcomes such as follow-up endpoints or patient survival. Using the public Pan-Cancer dataset, in this study we pre-train convolutional neural network architectures for survival prediction on a subset composed of thousands of gene-expression samples from thirty-one tumor types. The resulting architectures are subsequently fine-tuned to predict lung cancer progression-free interval. The application of convolutional networks to gene-expression data has many limitations, which derive from the unstructured nature of these data. In this work we propose a methodology to rearrange RNA-seq data by transforming RNA-seq samples into gene-expression images, from which convolutional networks can extract high-level features. As an additional objective, we investigate whether leveraging the information extracted from samples of other tumor types contributes to the extraction of high-level features that improve lung cancer progression prediction, compared to other machine learning approaches.
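
The idea of turning an RNA-seq sample into a gene-expression image can be illustrated with a minimal sketch. This is not the authors' procedure, which this record does not describe; it only assumes that genes are placed in some fixed order (the `gene_order` argument is hypothetical) and that the vector is zero-padded to fill a square grid that a convolutional network could consume.

```python
import math
import numpy as np


def expression_to_image(expr, gene_order=None):
    """Rearrange a 1-D gene-expression vector into a square 2-D array.

    expr: expression values for one sample (one value per gene).
    gene_order: optional permutation of gene indices (hypothetical; for
        example, an ordering that places related genes next to each other).
        The original order is kept when it is omitted.
    """
    expr = np.asarray(expr, dtype=np.float32)
    if expr.ndim != 1:
        raise ValueError("expr must be a 1-D vector")
    if gene_order is not None:
        expr = expr[np.asarray(gene_order)]

    n = expr.size
    # Smallest square side whose area can hold all n genes.
    side = math.isqrt(n)
    if side * side < n:
        side += 1

    # Zero-pad the tail so the vector fills the square exactly.
    padded = np.zeros(side * side, dtype=np.float32)
    padded[:n] = expr
    return padded.reshape(side, side)


# Example: 10 genes are laid out row by row on a 4 x 4 grid.
image = expression_to_image(np.arange(10))
```

In such a layout the choice of gene ordering determines which genes end up as neighbours in the image, which is what gives convolutional filters local structure to work with.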

Pages: 24
