Patterns of Gene Expression Profiles Associated with Colorectal Cancer in Colorectal Mucosa by Using Machine Learning Methods

Cited by: 16
Authors
Ren, Jing Xin [1]
Chen, Lei [2]
Guo, Wei [3, 4]
Feng, Kai Yan [5]
Cai, Yu-Dong [1]
Huang, Tao [6, 7]
Affiliations
[1] Shanghai Univ, Sch Life Sci, Shanghai 200444, Peoples R China
[2] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China
[3] Shanghai Jiao Tong Univ Sch Med SJTUSM, Key Lab Stem Cell Biol, Shanghai 200030, Peoples R China
[4] Chinese Acad Sci, Shanghai Inst Biol Sci SIBS, Shanghai 200030, Peoples R China
[5] Guangdong AIB Polytech Coll, Dept Comp Sci, Guangzhou 510507, Peoples R China
[6] Univ Chinese Acad Sci, Chinese Acad Sci, Shanghai Inst Nutr & Hlth, Biomed Big Data Ctr,CAS Key Lab Computat Biol, Shanghai 200031, Peoples R China
[7] Univ Chinese Acad Sci, Chinese Acad Sci, Shanghai Inst Nutr & Hlth, CAS Key Lab Tissue Microenvironm & Tumor, Shanghai 200031, Peoples R China
Funding
National Key R&D Program of China;
Keywords
Colorectal cancer; mucosa; machine learning; biomarker; gene expression; feature selection; ULCERATIVE-COLITIS; TUMOR PROGRESSION; FEATURE-SELECTION; CELL INVASION; IMMUNE-SYSTEM; HOX GENES; COLON; IDENTIFICATION; SOMATOSTATIN; S100A12;
DOI
10.2174/0113862073266300231026103844
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Colorectal cancer (CRC) has a very high incidence and lethality rate and is one of the most dangerous cancer types. Timely diagnosis can effectively reduce the incidence of colorectal cancer. Changes in para-cancerous tissues may serve as an early signal for tumorigenesis. Comparison of the differences in gene expression between para-cancerous and normal mucosa can help in the diagnosis of CRC and understanding the mechanisms of development.

Objectives: This study aimed to identify specific genes at the level of gene expression, which are expressed in normal mucosa and may be predictive of CRC risk.

Methods: A machine learning approach was used to analyze transcriptomic data in 459 samples of normal colonic mucosal tissue from 322 CRC cases and 137 non-CRC, in which each sample contained 28,706 gene expression levels. The genes were ranked using four ranking methods based on importance estimation (LASSO, LightGBM, MCFS, and mRMR) and four classification algorithms (decision tree [DT], K-nearest neighbor [KNN], random forest [RF], and support vector machine [SVM]) were combined with incremental feature selection [IFS] methods to construct a prediction model with excellent performance.

Results: The top-ranked genes, namely, HOXD12, CDH1, and S100A12, were associated with tumorigenesis based on previous studies.

Conclusion: This study summarized four sets of quantitative classification rules based on the DT algorithm, providing clues for understanding the microenvironmental changes caused by CRC. According to the rules, the effect of CRC on normal mucosa can be determined.
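
The rank-then-select workflow named in the Methods (feature ranking followed by incremental feature selection) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a stand-in ranker (absolute difference of class means, in place of LASSO, LightGBM, MCFS, or mRMR), a stand-in classifier (nearest centroid, in place of DT, KNN, RF, or SVM), leave-one-out evaluation, and synthetic toy data. All function names are hypothetical.

```python
import random
from statistics import mean


def make_toy_data(n_samples=40, n_noise=5, seed=0):
    """Synthetic data (assumption): feature 0 is informative, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        informative = 10.0 * label + rng.gauss(0, 1)
        X.append([informative] + [rng.gauss(0, 1) for _ in range(n_noise)])
        y.append(label)
    return X, y


def rank_features(X, y):
    """Stand-in ranker: order features by absolute difference of class means."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on `features`."""
    correct = 0
    for i, (row, label) in enumerate(zip(X, y)):
        best_label, best_dist = None, float("inf")
        for cls in sorted(set(y)):
            # Centroid of class `cls`, computed without the held-out sample i.
            train = [X[k] for k in range(len(X)) if k != i and y[k] == cls]
            centroid = [mean(r[j] for r in train) for j in features]
            dist = sum((row[j] - c) ** 2 for j, c in zip(features, centroid))
            if dist < best_dist:
                best_label, best_dist = cls, dist
        correct += best_label == label
    return correct / len(X)


def incremental_feature_selection(X, y, ranking, step=1):
    """Evaluate the top-k ranked features for growing k; keep the best subset."""
    best_k, best_acc = 0, -1.0
    for k in range(step, len(ranking) + 1, step):
        acc = loo_accuracy(X, y, ranking[:k])
        if acc > best_acc:  # strict '>' prefers the smaller subset on ties
            best_k, best_acc = k, acc
    return ranking[:best_k], best_acc


if __name__ == "__main__":
    X, y = make_toy_data()
    ranking = rank_features(X, y)
    selected, acc = incremental_feature_selection(X, y, ranking)
    print("ranking:", ranking)
    print("selected features:", selected, "LOO accuracy:", acc)
```

For brevity, the ranking here is computed once on all samples. A stricter evaluation would repeat the ranking inside each cross-validation fold so that held-out samples do not influence feature order.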
Pages: 2921-2934
Page count: 14