Unveiling the impact of dataset size on machine learning models for anxiety and depression prediction amid the COVID-19 pandemic: determining optimal data collection thresholds

Cited by: 0
Authors
Arora, Priyanka [1]
Dahiya, Sonika [1, 2]
Affiliations
[1] Delhi Technol Univ, Dept Software Engn, Delhi, India
[2] Delhi Technol Univ, Off Software Engn Dept, AB4-101,Main Bawana Rd, Delhi 110042, India
Keywords
Machine learning; Depression; Anxiety; Dataset size; Classification; Performance analysis;
DOI
10.1007/s12144-025-07432-8
Chinese Library Classification number
B84 [Psychology];
Discipline classification code
04; 0402;
Abstract
Emotional, psychological, and social well-being are all parts of mental health. Stress, despair, and anxiety can disrupt an individual's routine and affect their mental health. Preserving and restoring mental health is essential for each person as well as for communities and society as a whole. In addition to causing a global health emergency, the COVID-19 pandemic has triggered a strong emotional and psychological reaction in many people. The pandemic's uncertainty, disruptions, and social changes have amplified stress, fear, and depression, which are common responses to crises. Because of the ongoing pandemic, data collection for the COVID-19-related depression and anxiety assessment was limited to online methods. In mental health evaluation, machine learning has emerged as a promising strategy for identifying and understanding symptoms of anxiety and depression. This paper applies K-Nearest Neighbors (kNN), Random Forest (RF), Decision Tree (DT), and Support Vector Machine (SVM) techniques to data on the prevalence of anxiety and depression among Bangladeshi university students during the COVID-19 pandemic. It examines how the size of the dataset affects the prediction accuracy of the various machine learning models. The findings illuminate the scalability and generalizability of the different machine learning methods, and confirm that the accuracy of the models improves consistently and significantly as the dataset size increases. The performance of the classification models is further assessed using the F1 score, precision, and recall.
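The kind of experiment the abstract describes (train a classifier on progressively larger subsets, then compare accuracy, precision, recall, and F1 on a fixed test set) can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: it uses a hand-written kNN classifier in place of the four models named above, and the data generator `make_synthetic`, the subset sizes, and all parameter values are hypothetical stand-ins, since the record does not include the survey data or the settings used in the paper.

```python
import random
from collections import Counter


def make_synthetic(n, seed=0):
    """Hypothetical stand-in for the survey data: two overlapping Gaussian clusters."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.0 if label else -1.0
        features = [rng.gauss(centre, 1.5) for _ in range(4)]
        data.append((features, label))
    return data


def knn_predict(train, x, k=5):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def scores(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (positive class = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def learning_curve(sizes, n_test=200, k=5, seed=0):
    """Evaluate kNN on one fixed test set while the training subset grows."""
    data = make_synthetic(max(sizes) + n_test, seed)
    test, pool = data[:n_test], data[n_test:]
    y_true = [label for _, label in test]
    results = {}
    for n in sizes:
        train = pool[:n]  # nested subsets: each larger set contains the smaller ones
        y_pred = [knn_predict(train, x, k) for x, _ in test]
        results[n] = scores(y_true, y_pred)
    return results


if __name__ == "__main__":
    for n, s in learning_curve([25, 50, 100, 200, 400]).items():
        print(n, {name: round(value, 3) for name, value in s.items()})
```

Holding the test set fixed and nesting the training subsets means that differences between rows reflect the amount of training data rather than a different test sample. On synthetic data the scores need not rise at every step, so a single run like this should not be read as evidence for or against the paper's findings.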
Pages: 8106-8119
Number of pages: 14