Investigating data-driven biological subtypes of psychiatric disorders using specification-curve analysis

Cited by: 6
Authors
Beijers, Lian [1]
van Loo, Hanna M. [1]
Romeijn, Jan-Willem [2]
Lamers, Femke [3, 4]
Schoevers, Robert A. [1, 5]
Wardenaar, Klaas J. [1]
Affiliations
[1] Univ Groningen, Univ Med Ctr Groningen, Interdisciplinary Ctr Psychopathol & Emot Regulat, Dept Psychiat, Groningen, Netherlands
[2] Univ Groningen, Fac Philosophy, Groningen, Netherlands
[3] Vrije Univ Amsterdam Med Ctr, Amsterdam Publ Hlth Res Inst, GGZ inGeest, Amsterdam, Netherlands
[4] Vrije Univ Amsterdam Med Ctr, Amsterdam Publ Hlth Res Inst, Dept Psychiat, Amsterdam, Netherlands
[5] Univ Groningen, Univ Med Ctr Groningen, Res Sch Behav & Cognit Neurosci, Dept Psychiat, Groningen, Netherlands
Keywords
biochemistry; cluster analysis; complexity; heterogeneity; psychiatry; specification-curve analysis; subtyping; PERSONALIZED MEDICINE; HETEROGENEITY; DEPRESSION; ANXIETY; CLUSTERS; NUMBER;
DOI
10.1017/S0033291720002846
Chinese Library Classification number
B849 [Applied Psychology]
Subject classification code
040203
Abstract
Background: Cluster analyses have become popular tools for data-driven classification in biological psychiatric research. However, these analyses are known to be sensitive to the chosen methods and/or modelling options, which may hamper generalizability and replicability of findings. To gain more insight into this problem, we used Specification-Curve Analysis (SCA) to investigate the influence of methodological variation on biomarker-based cluster-analysis results.
Methods: Proteomics data (31 biomarkers) were used from patients (n = 688) and healthy controls (n = 426) in the Netherlands Study of Depression and Anxiety. In SCAs, consistency of results was evaluated across 1200 k-means and hierarchical clustering analyses, each with a unique combination of the clustering algorithm, fit-index, and distance metric. Next, SCAs were run in simulated datasets with varying cluster numbers and noise/outlier levels to evaluate the effect of data properties on SCA outcomes.
Results: The real data SCA showed no robust patterns of biological clustering in either the MDD or a combined MDD/healthy dataset. The simulation results showed that the correct number of clusters could be identified quite consistently across the 1200 model specifications, but that correct cluster identification became harder when the number of clusters and noise levels increased.
Conclusion: SCA can provide useful insights into the presence of clusters in biomarker data. However, SCA is likely to show inconsistent results in real-world biomarker datasets that are complex and contain considerable levels of noise. Here, the number and nature of the observed clusters may depend strongly on the chosen model-specification, precluding conclusions about the existence of biological clusters among psychiatric patients.
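The idea of running one clustering analysis per combination of algorithm, fit index, and distance metric can be illustrated with a small sketch. This is not the authors' pipeline: the abstract reports 1200 specifications spanning k-means and hierarchical clustering on proteomics data, whereas the code below assumes a toy grid of 12 specifications (3 linkage rules x 2 distance metrics x 2 fit indices) using only a naive hierarchical clustering on simulated 2-D points. All function and variable names, the choice of silhouette and Dunn as fit indices, and the simulated data are illustrative assumptions.

```python
import itertools
import random

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def mean(values):
    return sum(values) / len(values)

# Hypothetical specification options (not the grid used in the paper).
METRICS = {"euclidean": euclidean, "manhattan": manhattan}
LINKAGES = {"single": min, "complete": max, "average": mean}

def agglomerate(points, k, dist, link):
    """Naive agglomerative clustering: merge the closest pair until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        pairs = itertools.combinations(range(len(clusters)), 2)
        a, b = min(pairs, key=lambda p: link(
            [dist(points[i], points[j]) for i in clusters[p[0]] for j in clusters[p[1]]]))
        clusters[a].extend(clusters.pop(b))  # a < b, so index a is unaffected by the pop
    return clusters

def silhouette(points, clusters, dist):
    """Mean silhouette width; higher means tighter, better separated clusters."""
    widths = []
    for own in clusters:
        for i in own:
            if len(own) == 1:
                widths.append(0.0)  # convention for singleton clusters
                continue
            a = mean([dist(points[i], points[j]) for j in own if j != i])
            b = min(mean([dist(points[i], points[j]) for j in other])
                    for other in clusters if other is not own)
            widths.append((b - a) / max(a, b))
    return mean(widths)

def dunn(points, clusters, dist):
    """Dunn index: smallest between-cluster distance over largest cluster diameter."""
    between = min(dist(points[i], points[j])
                  for c1, c2 in itertools.combinations(clusters, 2) for i in c1 for j in c2)
    diameter = max(dist(points[i], points[j]) for c in clusters for i in c for j in c)
    return between / diameter

FIT_INDICES = {"silhouette": silhouette, "dunn": dunn}

# Simulated data (an assumption): three well-separated 2-D clusters of 10 points each.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in centers for _ in range(10)]

# One specification = (linkage, metric, fit index); record the k each one prefers.
chosen = {}
for (lname, link), (mname, dist) in itertools.product(LINKAGES.items(), METRICS.items()):
    partitions = {k: agglomerate(points, k, dist, link) for k in range(2, 6)}
    for fname, fit in FIT_INDICES.items():
        chosen[(lname, mname, fname)] = max(
            partitions, key=lambda k: fit(points, partitions[k], dist))

for spec, k in sorted(chosen.items()):
    print(spec, "->", k)
```

Listing the preferred number of clusters per specification is the toy analogue of a specification curve: agreement across rows suggests a robust cluster structure, while disagreement suggests the result depends on modelling choices. Adding noise or more clusters to the simulated points would be one way to explore how quickly that agreement breaks down.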
Pages: 1089 - 1100
Number of pages: 12
Related papers
16 in total
  • [1] Probing Birth-Order Effects on Narrow Traits Using Specification-Curve Analysis
    Rohrer, Julia M.
    Egloff, Boris
    Schmukle, Stefan C.
    PSYCHOLOGICAL SCIENCE, 2017, 28 (12) : 1821 - 1832
  • [2] Data-driven biological subtypes of depression: systematic review of biological approaches to depression subtyping
    Beijers, Lian
    Wardenaar, Klaas J.
    van Loo, Hanna M.
    Schoevers, Robert A.
    MOLECULAR PSYCHIATRY, 2019, 24 (06) : 888 - 900
  • [3] Motor and psychiatric features in idiopathic blepharospasm: A data-driven cluster analysis
    Defazio, Giovanni
    Gigante, Angelo F.
    Hallett, Mark
    Berardelli, Alfredo
    Perlmutter, Joel S.
    Berman, Brian D.
    Jankovic, Joseph
    Baumer, Tobias
    Comella, Cynthia
    Ercoli, Tommaso
    Ferrazzano, Gina
    Fox, Susan H.
    Kim, Han-Joon
    Moukheiber, Emile Sami
    Richardson, Sarah Pirio
    Weissbach, Anne
    Jinnah, Hyder A.
    PARKINSONISM & RELATED DISORDERS, 2022, 104 : 94 - 98
  • [4] Which Data to Meta-Analyze, and How? A Specification-Curve and Multiverse-Analysis Approach to Meta-Analysis
    Voracek, Martin
    Kossmeier, Michael
    Tran, Ulrich S.
    ZEITSCHRIFT FUR PSYCHOLOGIE-JOURNAL OF PSYCHOLOGY, 2019, 227 (01) : 64 - 82
  • [5] A data-driven transdiagnostic analysis of white matter integrity in young adults with major psychiatric disorders
    Hermens, Daniel F.
    Hatton, Sean N.
    White, Django
    Lee, Rico S. C.
    Guastella, Adam J.
    Scott, Elizabeth M.
    Naismith, Sharon L.
    Hickie, Ian B.
    Lagopoulos, Jim
    PROGRESS IN NEURO-PSYCHOPHARMACOLOGY & BIOLOGICAL PSYCHIATRY, 2019, 89 : 73 - 83
  • [6] Uncovering psychiatric phenotypes using unsupervised machine learning: A data-driven symptoms approach
    Hofman, Amy
    Lier, Isabelle
    Ikram, M. Arfan
    van Wingerden, Marijn
    Luik, Annemarie I.
    EUROPEAN PSYCHIATRY, 2023, 66 (01)
  • [7] Data-driven subtypes of late-life depression-secondary analysis of a cluster-randomized RCT
    van Diepen, Judith
    Hendriks, Gert-Jan
    Zuidersma, Marij
    Oude Voshaar, Richard
    Janssen, Noortje
    AGING & MENTAL HEALTH, 2025
  • [8] Deep Clinical Phenotyping of Schizophrenia Spectrum Disorders Using Data-Driven Methods: Marching towards Precision Psychiatry
    Habtewold, Tesfa Dejenie
    Hao, Jiasi
    Liemburg, Edith J.
    Basturk, Nalan
    Bruggeman, Richard
    Alizadeh, Behrooz Z.
    JOURNAL OF PERSONALIZED MEDICINE, 2023, 13 (06)
  • [9] Latent class analysis of psychotic-affective disorders with data-driven plasma proteomics
    Rhee, Sang Jin
    Shin, Dongyoon
    Shin, Daun
    Song, Yoojin
    Joo, Eun-Jeong
    Jung, Hee Yeon
    Roh, Sungwon
    Lee, Sang-Hyuk
    Kim, Hyeyoung
    Bang, Minji
    Lee, Kyu Young
    Kim, Se Hyun
    Kim, Minah
    Lee, Jihyeon
    Kim, Jaenyeon
    Kim, Yeongshin
    Kwon, Jun Soo
    Ha, Kyooseob
    Kim, Youngsoo
    Ahn, Yong Min
    TRANSLATIONAL PSYCHIATRY, 2023, 13 (01)
  • [10] Big Data-driven for Fuel Quality using NIR Spectrometry Analysis
    Almanjahie, Ibrahim M.
    Kaid, Zoulikha
    Assiri, Khlood A.
    Laksaci, Ali
    CHIANG MAI JOURNAL OF SCIENCE, 2021, 48 (04) : 1161 - 1172