Perspectives on making big data analytics work for oncology

Cited by: 27
Author
El Naqa, Issam [1]
Affiliation
[1] Univ Michigan, Dept Radiat Oncol, Ann Arbor, MI 48109 USA
Keywords
Big data; Oncology; Machine learning; Clinical decision support; PREDICT RADIATION PNEUMONITIS; DOSE-VOLUME; BAYESIAN NETWORK; NEURAL-NETWORK; RADIOTHERAPY OUTCOMES; TEXTURAL FEATURES; PROSTATE-CANCER; TUMOR RESPONSE; NECK-CANCER; FDG-PET;
DOI
10.1016/j.ymeth.2016.08.010
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Oncology, with its unique combination of clinical, physical, technological, and biological data, provides an ideal case study for applying big data analytics to improve cancer treatment safety and outcomes. An oncology treatment course such as chemoradiotherapy can generate a large pool of information carrying the 5 Vs hallmarks of big data. This data comprises a heterogeneous mixture of patient demographics, radiation/chemo dosimetry, multimodality imaging features, and biological markers generated over a treatment period that can span a few days to several weeks. Efforts using commercial and in-house tools are underway to facilitate data aggregation, ontology creation, sharing, visualization and varying analytics in a secure environment. However, open questions related to proper data structure representation and effective analytics tools to support oncology decision-making need to be addressed. It is recognized that oncology data constitutes a mix of structured (tabulated) and unstructured (electronic documents) data that need to be processed to facilitate searching and subsequent knowledge discovery from relational or NoSQL databases. In this context, methods based on advanced analytics and image feature extraction for oncology applications will be discussed. On the other hand, the classical p (variables) >> n (samples) inference problem of statistical learning is challenged in the big data realm, and this is particularly true for oncology applications, where p-omics is witnessing exponential growth while the number of cancer incidences has generally plateaued over the past 5 years, leading to a quasi-linear growth in samples per patient. Within the big data paradigm, this kind of phenomenon may yield undesirable effects such as echo chamber anomalies, the Yule-Simpson reversal paradox, or misleading ghost analytics. In this work, we will present these effects as they pertain to oncology and engage small thinking methodologies to counter these effects, ranging from incorporating prior knowledge and using information-theoretic techniques to modern ensemble machine learning approaches, or a combination of these. We will particularly discuss the pros and cons of different approaches to improve mining of big data in oncology. (C) 2016 Elsevier Inc. All rights reserved.
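The Yule-Simpson reversal named in the abstract can be illustrated with a small sketch. This is not taken from the paper: the counts below follow the widely used textbook kidney-stone example rather than any oncology dataset, and the names `counts`, `rate`, and `pooled_rate` are hypothetical. The point is only that a treatment can look better within every subgroup yet worse once the subgroups are pooled.

```python
from fractions import Fraction

# Illustrative (successes, patients) counts per treatment and subgroup.
# Assumption: textbook kidney-stone example, NOT data from the paper.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, total):
    """Exact success rate for one subgroup."""
    return Fraction(successes, total)


def pooled_rate(groups):
    """Success rate after pooling all subgroups of one treatment."""
    successes = sum(s for s, _ in groups.values())
    total = sum(n for _, n in groups.values())
    return Fraction(successes, total)


# Within each subgroup, treatment A has the higher success rate...
within = {
    g: rate(*counts["A"][g]) > rate(*counts["B"][g])
    for g in ("small", "large")
}

# ...but after pooling, treatment B appears better (the reversal).
pooled_a = pooled_rate(counts["A"])
pooled_b = pooled_rate(counts["B"])

for g, a_wins in within.items():
    print(f"{g}: A better than B? {a_wins}")
print(f"pooled: A={float(pooled_a):.3f}, B={float(pooled_b):.3f}")
```

The reversal arises because the subgroup sizes differ sharply between treatments, so the pooled rates weight the subgroups differently; stratifying by the confounding variable is one simple guard against it.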
Pages: 32-44
Page count: 13