Proteomic prediction of diverse incident diseases: a machine learning-guided biomarker discovery study using data from a prospective cohort study

Cited by: 0
Authors
Carrasco-Zanini, Julia [1 ,3 ]
Pietzner, Maik [1 ,2 ,3 ]
Koprulu, Mine [1 ]
Wheeler, Eleanor [1 ]
Kerrison, Nicola [1 ]
Wareham, Nicholas J. [1 ]
Langenberg, Claudia [1 ,2 ,3 ,4 ]
Affiliations
[1] Univ Cambridge, Sch Clin Med, Inst Metab Sci, MRC Epidemiol Unit, Cambridge, England
[2] Charite Univ Med Berlin, Berlin Inst Hlth, Computat Med, Berlin, Germany
[3] Queen Mary Univ London, Precis Healthcare Univ Res Inst, London, England
[4] Charite Univ Med Berlin, Berlin Inst Hlth, Computat Med, D-10117 Berlin, Germany
Source
LANCET DIGITAL HEALTH | 2024 / Volume 6 / Issue 07
Funding
UK Medical Research Council; Wellcome Trust (UK); UK Research and Innovation;
Keywords
PLASMA PROTEOME; LUNG-FUNCTION; CXCL17; RISK;
DOI
Not available
Chinese Library Classification number
R-058 [];
Subject classification code
Abstract
Background: Broad-capture proteomic technologies have the potential to improve disease prediction, enabling targeted prevention and management, but studies have so far been limited to very few selected diseases and have not evaluated predictive performance across multiple conditions. We aimed to evaluate the potential of serum proteins to improve risk prediction over and above health-derived information and polygenic risk scores across a diverse set of 24 outcomes.
Methods: We designed multiple case-cohorts nested in the EPIC-Norfolk prospective study, from participants with available serum samples and genome-wide genotype data, with more than 32 974 person-years of follow-up. Participants were middle-aged individuals (aged 40-79 years at baseline) of European ancestry who were recruited from the general population of Norfolk, England, between March, 1993, and December, 1997. We selected participants who developed one of ten less common diseases within 10 years of follow-up; we also subsampled a randomly drawn control subcohort, which also served to investigate 14 more common outcomes (n>70), including all-cause premature mortality (death before the age of 75 years; case numbers 71-437; controls 608-1556). Individuals were excluded from the current study owing to failed genotyping or proteomic quality control, relatedness, or missing information on age, sex, BMI, or smoking status. We used a machine learning framework to derive sparse predictive protein models for the onset of the 23 individual diseases and all-cause premature mortality, and to derive a single common sparse multimorbidity signature that was predictive across multiple diseases from 2923 serum proteins.
Findings: Participants who developed one of ten less common diseases within 10 years of follow-up included 482 women and 507 men, with a mean age at baseline of 64·56 years (8·08). The random subcohort included 990 women and 769 men, with a mean age of 58·79 years (9·31). As few as five proteins alone outperformed polygenic risk scores for 17 of 23 outcomes (median difference in concordance index [C-index] 0·13 [0·10-0·17]) and improved predictive performance when added over basic patient-derived information models for seven outcomes, achieving a median C-index of 0·82 (IQR 0·77-0·82). This included diseases with poor prognosis such as lung cancer (C-index 0·85 [± cross-validation error 0·83-0·87]), for which we identified unreported biomarkers such as C-X-C motif chemokine ligand 17. A sparse multimorbidity signature of ten proteins improved prediction across seven outcomes over patient-derived information models, achieving performances (median C-index 0·81 [IQR 0·80-0·82]) similar to those of disease-specific signatures.
Interpretation: We show the value of broad-capture proteomic biomarker discovery studies across multiple diseases of diverse causes, pointing to those that might benefit the most from proteomic approaches, and the potential to derive common sparse biomarker panels for prediction of multiple diseases at once. This framework could enable follow-up studies to explore the generalisability of proteomic models and to benchmark these against clinical assays, which are required to understand the translational potential of these findings.
Funding: Medical Research Council, Health Data Research UK, UK Research and Innovation-National Institute for Health and Care Research, Cancer Research UK, and Wellcome Trust.
Copyright © 2024 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.
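Illustrative note on the metric: the abstract reports performance as a concordance index (C-index). As a minimal sketch, and not the authors' implementation, the following Python function computes an unweighted Harrell-style C-index for right-censored data. The function name, argument names, and toy data are assumptions made for illustration; the study's case-cohort design may call for sampling weights, which this sketch omits.

```python
from itertools import combinations


def concordance_index(times, events, scores):
    """Unweighted Harrell-style C-index (illustrative sketch).

    times  : follow-up time for each participant
    events : 1 if the outcome was observed, 0 if censored
    scores : predicted risk (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # 'early' is the participant with the shorter follow-up time
        early, late = (i, j) if times[i] < times[j] else (j, i)
        if not events[early]:
            continue  # earlier participant was censored: pair not comparable
        comparable += 1
        if scores[early] > scores[late]:
            concordant += 1.0  # higher risk assigned to the earlier event
        elif scores[early] == scores[late]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study)
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.7, 0.8, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported median C-index values should be read.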
Pages: e470-e479
Page count: 10