Symptom-based drug prediction of lifestyle-related chronic diseases using unsupervised machine learning techniques

Cited by: 2
Authors
Bhattacharjee S. [1]
Saha B. [1]
Saha S. [2]
Affiliations
[1] Department of Computer Science and Engineering, University of Calcutta, JD-2, Sector-III, Salt Lake, Kolkata
[2] Department of Biological Sciences, Bose Institute, EN 80, Sector V, Bidhan Nagar, Kolkata
Keywords
Clustering; Drugs; Lifestyle-related diseases; Machine learning; Symptoms
DOI
10.1016/j.compbiomed.2024.108413
Abstract
Background and objectives: Lifestyle-related diseases (LSDs) impose a substantial economic burden on patients and health care services. LSDs are chronic in nature and can directly affect the heart and lungs. Therapeutic interventions based only on symptoms can be crucial for prompt treatment initiation in LSDs, because symptoms are the first information available to clinicians. This work therefore aims to apply unsupervised machine learning (ML) techniques to develop models that predict drugs from symptoms for LSDs, with a specific focus on pulmonary and heart diseases.
Methods: The drug-disease and disease-symptom associations of 143 LSDs, 1271 drugs, and 305 symptoms were used to compute direct associations between drugs and symptoms. ML models based on four algorithms (K-Means, Bisecting K-Means, Mean Shift, and Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH)) were developed to cluster the drugs using symptoms as features. The optimal model was saved on a server, and a web application was developed to perform predictions based on it.
Results: The Bisecting K-Means model showed the best performance, with a silhouette coefficient of 0.647, and generated 138 drug clusters. The drugs within the optimal clusters showed good similarity based on i) gene ontology annotations of the gene targets, ii) chemical ontology annotations, and iii) maximum common substructure of the drugs. In the web application, the model also provides a confidence score for each drug predicted from a new set of input symptoms.
Conclusion: Direct associations between drugs and symptoms were computed and used to develop a symptom-based drug prediction tool for LSDs with unsupervised ML models. The ML-based prediction can provide a second opinion to clinicians to aid their decision-making for early treatment of LSD patients. The web application (URL - http://bicresources.jcbose.ac.in/ssaha4/sdldpred) provides a simple interface for all end-users to perform the ML-based prediction. © 2024
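The two quantitative steps named in the abstract, deriving drug-symptom associations and scoring a clustering with the silhouette coefficient, can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: a drug is linked to a symptom whenever the two share at least one disease, the features are binary, and distances are Euclidean. The toy drug, disease, and symptom names and the hand-assigned cluster labels are hypothetical and are not data or output from the paper.

```python
from math import sqrt

# Hypothetical toy associations (not data from the paper).
drug_disease = {
    "drug_a": {"asthma"},
    "drug_b": {"asthma", "copd"},
    "drug_c": {"hypertension"},
    "drug_d": {"hypertension", "angina"},
}
disease_symptom = {
    "asthma": {"wheezing", "cough"},
    "copd": {"cough", "breathlessness"},
    "hypertension": {"headache"},
    "angina": {"chest_pain", "breathlessness"},
}


def drug_symptom_matrix(drug_disease, disease_symptom):
    """Link each drug to every symptom of the diseases it treats.

    Assumption: one shared disease is enough for a direct association.
    Returns the sorted symptom list and a binary feature row per drug.
    """
    symptoms = sorted({s for group in disease_symptom.values() for s in group})
    rows = {}
    for drug, diseases in drug_disease.items():
        linked = set()
        for disease in diseases:
            linked |= disease_symptom.get(disease, set())
        rows[drug] = [1.0 if s in linked else 0.0 for s in symptoms]
    return symptoms, rows


def euclidean(u, v):
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    n = len(points)
    scores = []
    for i in range(n):
        # Group distances from point i by the cluster of the other point.
        by_cluster = {}
        for j in range(n):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(
                    euclidean(points[i], points[j])
                )
        own = by_cluster.pop(labels[i], [])
        if not own or not by_cluster:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / n


symptoms, rows = drug_symptom_matrix(drug_disease, disease_symptom)
drugs = sorted(rows)
points = [rows[d] for d in drugs]

# Labels are assigned by hand here; in practice they would come from a
# clustering algorithm such as the four listed in the abstract.
respiratory_vs_cardiac = silhouette(points, [0, 0, 1, 1])
mixed_grouping = silhouette(points, [0, 1, 0, 1])
print(respiratory_vs_cardiac, mixed_grouping)
```

A grouping that keeps drugs with similar symptom profiles together scores higher than one that mixes them, which is the kind of comparison a silhouette-based model selection relies on.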