Integration of datasets for individual prediction of DNA methylation-based biomarkers

Cited by: 1
Authors
Merzbacher, Charlotte [1]
Ryan, Barry [1]
Goldsborough, Thibaut [1]
Hillary, Robert F. [2]
Campbell, Archie [2]
Murphy, Lee [3]
McIntosh, Andrew M. [2, 4]
Liewald, David [5]
Harris, Sarah E. [5]
McRae, Allan F. [6]
Cox, Simon R. [5]
Cannings, Timothy I. [7]
Vallejos, Catalina A. [8, 9]
McCartney, Daniel L. [2]
Marioni, Riccardo E. [2]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh EH8 9AB, Scotland
[2] Univ Edinburgh, Inst Genet & Canc, Ctr Genom & Expt Med, Edinburgh EH4 2XU, Scotland
[3] Univ Edinburgh, Edinburgh Clin Res Facil, Edinburgh EH4 2XU, Scotland
[4] Univ Edinburgh, Ctr Clin Brain Sci, Div Psychiat, Edinburgh, Scotland
[5] Univ Edinburgh, Dept Psychol, Lothian Birth Cohorts, Edinburgh EH8 9JZ, Scotland
[6] Univ Queensland, Inst Mol Biosci, Brisbane, Australia
[7] Univ Edinburgh, Maxwell Inst Math Sci, Sch Math, Edinburgh EH9 3FD, Scotland
[8] Univ Edinburgh, Inst Genet & Canc, MRC Human Genet Unit, Edinburgh EH4 2XU, Scotland
[9] Alan Turing Inst, London, England
Funding
Wellcome Trust (UK)
Keywords
DNA methylation; Prediction; Biomarker; Quantile normalization; Package; Design
DOI
10.1186/s13059-023-03114-5
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005; 0836; 090102; 100705
Abstract
Background: Epigenetic scores (EpiScores) can provide biomarkers of lifestyle and disease risk. Projecting new datasets onto a reference panel is challenging because technical and biological variation are difficult to separate in array data. Normalisation can standardise data distributions but may also remove population-level biological variation.
Results: We compare two birth cohorts, the Lothian Birth Cohorts of 1921 and 1936 (n = 387 for LBC1921 and n = 498 for LBC1936), with blood-based DNA methylation assessed at the same chronological age (79 years) and processed in the same lab but in different years and experimental batches. We examine the effect of 16 normalisation methods on a novel BMI EpiScore (trained in an external cohort, n = 18,413) and on Horvath's pan-tissue DNA methylation age, when the cohorts are normalised separately and together. The BMI EpiScore explains at most R² = 24.5% of the variance in BMI in LBC1936 (SWAN normalisation). Although R² differs between cohorts, the choice of normalisation method makes a minimal difference to within-cohort estimates. Conversely, individual-level EpiScore estimates for BMI and age show a range of absolute differences when cohorts are normalised separately versus together. Within-array methods return identical EpiScores whether a cohort is normalised on its own or together with the second dataset, whereas between-array methods show a range of differences.
Conclusions: Normalisation methods that return similar EpiScores, whether cohorts are analysed separately or together, will minimise technical variation when projecting new data onto a reference panel. These methods are important for cases where raw data is unavailable and joint normalisation of cohorts is computationally expensive.
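The contrast between within-array and between-array normalisation can be illustrated with a small simulation. The sketch below is not the authors' pipeline: it uses synthetic beta-distributed values in place of real methylation data, random weights in place of a trained EpiScore, a per-sample standardisation as a toy stand-in for a within-array method, and plain quantile normalisation as an example of a between-array method. All function and variable names are hypothetical.

```python
import numpy as np


def quantile_normalise(x):
    """Between-array method: force every sample (column) onto a shared
    reference distribution built from all samples passed in together."""
    order = np.argsort(x, axis=0)
    ranks = np.argsort(order, axis=0)            # rank of each CpG within its sample
    reference = np.sort(x, axis=0).mean(axis=1)  # mean value at each rank across samples
    return reference[ranks]


def within_array_scale(x):
    """Toy within-array method: each sample is standardised using only
    its own values, so other samples cannot influence the result."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def epi_score(x, weights):
    """EpiScore as a weighted sum of CpG values (one score per sample)."""
    return weights @ x


rng = np.random.default_rng(0)
n_cpg = 200
cohort_a = rng.beta(2.0, 5.0, size=(n_cpg, 6))  # rows: CpGs, columns: samples
cohort_b = rng.beta(2.5, 4.0, size=(n_cpg, 4))  # shifted distribution (assumed batch effect)
weights = rng.normal(size=n_cpg)                # assumed weights, not a trained model


def separate_vs_joint_gap(normalise):
    """Largest absolute change in cohort A's scores between normalising
    cohort A alone and normalising it jointly with cohort B."""
    separate = epi_score(normalise(cohort_a), weights)
    joint = epi_score(normalise(np.hstack([cohort_a, cohort_b])), weights)
    return np.max(np.abs(separate - joint[: cohort_a.shape[1]]))


within_gap = separate_vs_joint_gap(within_array_scale)
between_gap = separate_vs_joint_gap(quantile_normalise)
print("within-array gap:", within_gap)
print("between-array gap:", between_gap)
```

Under these assumptions, the within-array scores for cohort A agree up to floating-point precision whether or not cohort B is present, while the quantile-normalised scores shift because the shared reference distribution changes when the second cohort is added.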
Pages: 12