Optimisation Models for Pathway Activity Inference in Cancer

Times cited: 0
Authors
Chen, Yongnan [1]
Liu, Songsong [2]
Papageorgiou, Lazaros G. [3]
Theofilatos, Konstantinos [4]
Tsoka, Sophia [1]
Affiliations
[1] Kings Coll London, Fac Nat Math & Engn Sci, Dept Informat, London WC2B 4BG, England
[2] Harbin Inst Technol, Sch Management, Harbin 150001, Peoples R China
[3] UCL, Sargent Ctr Proc Syst Engn, Dept Chem Engn, Torrington Pl, London WC1E 7JE, England
[4] Kings Coll London, British Heart Fdn Ctr, Sch Cardiovasc & Metab Med & Sci, London SE1 7EH, England
Funding
National Natural Science Foundation of China; Engineering and Physical Sciences Research Council (UK)
Keywords
pathway activity; RNA sequencing; optimisation; breast cancer; colorectal cancer; GENE-EXPRESSION PATTERNS; RISK; KEGG; INTEGRATION; REGARDLESS; SUBTYPES; ROBUST;
DOI
10.3390/cancers15061787
Chinese Library Classification
R73 [Oncology]
Discipline classification code
100214
Abstract
Simple Summary: Subtype classification and prognostic prediction are key research targets in complex diseases such as cancer. In this work, an optimisation model was designed to infer the activity of biological pathways from gene expression values. The model determines pathway activity values that separate the sample subtypes to the greatest extent, thereby improving sample classification accuracy. The proposed model was evaluated on cancer molecular subtype classification, robustness to noisy data and survival prediction, and allowed the identification of disease-important genes and pathways.

Background: With advances in high-throughput technologies, there has been an enormous increase in data related to profiling the activity of molecules in disease. While such data provide more comprehensive information on cellular actions, their large volume and complexity make accurate classification of disease phenotypes difficult. Therefore, novel modelling methods that can improve accuracy while offering interpretable means of analysis are required. Biological pathways can be used to incorporate a priori knowledge of biological interactions to decrease data dimensionality and increase the biological interpretability of machine learning models.

Methodology: A mathematical optimisation model is proposed for pathway activity inference towards precise disease phenotype prediction and is applied to RNA-Seq datasets. The model is formulated as a mixed-integer linear programme (MILP) and infers pathway activity as a linear combination of the expression of pathway member genes: expression values are multiplied by model-determined gene weights, which are optimised to maximise discrimination of phenotype classes and minimise incorrect sample allocation.

Results: The model is evaluated on the transcriptome of breast and colorectal cancer, and yields solutions of good optimality as well as good prediction performance on the related cancer subtypes. Two baseline pathway activity inference methods and three advanced methods are used for comparison. Sample prediction accuracy, robustness to noisy expression data, and survival analysis suggest competitive prediction performance of our model while providing interpretability and insight on key pathways and genes. Overall, our work demonstrates that the flexible nature of mathematical programming lends itself well to developing efficient computational strategies for pathway activity inference and disease subtype prediction.
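
Illustrative note: the abstract describes pathway activity as a weighted linear combination of member gene expression. The sketch below shows only that final combination step on toy data. The gene names, expression values and weights are hypothetical, and the weights are fixed by hand here, whereas in the paper they are determined by the MILP, which is not reproduced.

```python
from typing import Dict, List


def pathway_activity(
    expression: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return one activity value per sample.

    Activity of a sample = sum over pathway member genes of
    (gene weight * expression of that gene in the sample).
    """
    genes = list(weights)
    n_samples = len(expression[genes[0]])
    return [
        sum(weights[g] * expression[g][s] for g in genes)
        for s in range(n_samples)
    ]


# Hypothetical toy data: three member genes measured in three samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [0.5, 0.5, 2.0],
    "GENE_C": [4.0, 1.0, 0.0],
}

# Hand-picked weights standing in for the model-determined ones (assumption).
weights = {"GENE_A": 0.5, "GENE_B": -1.0, "GENE_C": 0.25}

activity = pathway_activity(expression, weights)
print(activity)
```

Each sample is reduced from one value per gene to a single value per pathway, which is how this kind of approach lowers data dimensionality before classification.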
Pages: 19