Evaluating the relevance of health-related topics using three similarity measures

Cited by: 0
Authors
Zhu, Yifan [1]
Zhang, Jin [2]
Affiliations
[1] Hangzhou Normal Univ, Sch Publ Hlth, 2318 Yuhangtang Rd, Hangzhou 311121, Zhejiang, Peoples R China
[2] Univ Wisconsin Milwaukee, Sch Informat Studies, Milwaukee, WI USA
Keywords
Similarity measures; health topics analysis; medical corpus; semantic linkages; MedlinePlus; MENTAL-HEALTH; INFORMATION-RETRIEVAL; SEMANTIC SIMILARITY; OLDER-ADULTS; MODEL; ENVIRONMENT; NAVIGATION; ACCURACY; INTERNET; CHILDREN
DOI
10.1177/02666669251316264
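Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501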
Abstract
This study evaluated the effectiveness of three similarity measures (Cosine similarity, Pearson correlation, and Euclidean distance) in assessing health-related topics on MedlinePlus. The focus was on four health topic subcategories: mental health, children, teenagers, and older adults. Using adjacency matrices of graph theory and the three similarity measures, the study found that both Cosine and Pearson correlation measures were more empirically robust than the Euclidean distance measure. Notably, the alignment in findings from Cosine and Pearson correlation suggests their potential combined use in future research as complementary strategies. Hypothesis testing, conducted to validate the findings, showed that Cosine and Pearson correlation were significantly effective in identifying similar health topics and distinguishing between different semantic subgroups, whereas Euclidean distance showed limitations. These insights guide the application of adjacency matrices and the selection of suitable similarity measures to evaluate semantic linkages in health topics, enhancing relevance recognition and supporting classification in medical domains.
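As a rough illustration of the three measures named in the abstract, the sketch below compares rows of a small adjacency matrix. It is not the authors' implementation: the topic names and the 0/1 link values are hypothetical, and the formulas are the standard textbook definitions, assumed here because the record does not give the paper's exact formulation.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def pearson_correlation(u, v):
    """Pearson correlation, computed as the cosine of the mean-centered vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    centered_u = [a - mean_u for a in u]
    centered_v = [b - mean_v for b in v]
    return cosine_similarity(centered_u, centered_v)

def euclidean_distance(u, v):
    """Straight-line distance between two vectors (smaller means more alike)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical adjacency rows: a 1 means the topic links to that related topic.
adjacency = {
    "topic_a": [1, 1, 0, 0],
    "topic_b": [1, 1, 1, 0],
    "topic_c": [0, 0, 1, 1],
}

for other in ("topic_b", "topic_c"):
    u, v = adjacency["topic_a"], adjacency[other]
    print(
        other,
        round(cosine_similarity(u, v), 3),
        round(pearson_correlation(u, v), 3),
        round(euclidean_distance(u, v), 3),
    )
```

Cosine and Pearson are similarity scores (higher means more alike) and ignore vector length, whereas Euclidean distance is a dissimilarity that grows with the number of differing links, so the measures can rank topic pairs differently.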
Pages: 20