An Approach to Measuring Semantic Relatedness of Geographic Terminologies Using a Thesaurus and Lexical Database Sources

Cited by: 14
Authors
Chen, Zugang [1 ,2 ,3 ]
Song, Jia [1 ,2 ]
Yang, Yaping [1 ,2 ,4 ]
Affiliations
[1] State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China
[2] Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, Beijing 100101, Peoples R China
[3] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[4] Jiangsu Ctr Collaborat Innovat Geog Informat Reso, Nanjing 210023, Jiangsu, Peoples R China
Source
ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION | 2018 / Vol. 7 / Issue 03
Funding
National Natural Science Foundation of China;
Keywords
geographic terminology; semantic relatedness; thesaurus; lexical databases; thesaurus-lexical relatedness measure (TLRM); Geospatial Information Retrieval (GIR); INTERRATER RELIABILITY; SIMILARITY; INFORMATION; CONTEXT; RETRIEVAL; WORDNET; RATINGS;
DOI
10.3390/ijgi7030098
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
In geographic information science, semantic relatedness is important for Geographic Information Retrieval (GIR), Linked Geospatial Data, geoparsing, and geo-semantics. However, computing the semantic similarity or relatedness of geographic terminology remains a pressing problem. The thesaurus is a widely used and mature knowledge representation tool found in many domains. In this article, we combine a generic lexical database (WordNet or HowNet) with the Thesaurus for Geographic Science and propose a thesaurus-lexical relatedness measure (TLRM) to compute the semantic relatedness of geographic terminology. The measure quantifies the relationships between terms, interlinks the otherwise disconnected term trees through the generic lexical database, and thereby makes it possible to compute the semantic relatedness of any two terms in the thesaurus. We evaluated the TLRM on a new relatedness baseline that we built, the Geo-Terminology Relatedness Dataset (GTRD), on which it achieved relatively high cognitive plausibility. Finally, we applied the TLRM to a geospatial data sharing portal to support data retrieval. Results for the portal's 30 most frequently used queries show that the TLRM can improve the recall of geospatial data retrieval in most situations and rank retrieval results by the matching score between the user's query and the geospatial dataset.
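The idea in the abstract of bridging separate thesaurus term trees through a generic lexical database can be illustrated with a small sketch. This is not the TLRM formula from the paper: the term trees, the root-to-root scores (which stand in for a WordNet or HowNet lookup), the path-based scoring, and all function names are assumptions made up for illustration.

```python
# Toy illustration only: the trees, the root-link scores, and the scoring
# formula are assumptions, not the TLRM defined in the paper.

# Hypothetical thesaurus fragments: each term maps to its broader term
# (None marks the root of a term tree).
BROADER = {
    "hydrology": None,
    "river": "hydrology",
    "tributary": "river",
    "lake": "hydrology",
    "climate": None,
    "precipitation": "climate",
    "rainfall": "precipitation",
}

# Hypothetical relatedness between tree roots, standing in for a lookup
# in a generic lexical database such as WordNet or HowNet.
ROOT_LINK = {frozenset(["hydrology", "climate"]): 0.5}


def path_to_root(term):
    """Return the chain of terms from `term` up to its tree root."""
    chain = [term]
    while BROADER[chain[-1]] is not None:
        chain.append(BROADER[chain[-1]])
    return chain


def tree_relatedness(a, b):
    """Path-based score in (0, 1] for two terms of the same tree, else None."""
    chain_a, chain_b = path_to_root(a), path_to_root(b)
    depth_in_b = {t: i for i, t in enumerate(chain_b)}
    for steps_a, t in enumerate(chain_a):
        if t in depth_in_b:
            # Score decays with the number of edges between a and b.
            return 1.0 / (1.0 + steps_a + depth_in_b[t])
    return None


def relatedness(a, b):
    """Score any two terms; different trees are bridged through their roots."""
    same_tree = tree_relatedness(a, b)
    if same_tree is not None:
        return same_tree
    root_a, root_b = path_to_root(a)[-1], path_to_root(b)[-1]
    link = ROOT_LINK.get(frozenset([root_a, root_b]), 0.0)
    return tree_relatedness(a, root_a) * link * tree_relatedness(b, root_b)


def rank(query, dataset_keywords):
    """Sort dataset keywords by descending relatedness to the query term."""
    return sorted(dataset_keywords, key=lambda k: relatedness(query, k), reverse=True)
```

In this sketch, `rank("river", ["rainfall", "lake", "tributary"])` places terms from the query's own tree first and the cross-tree term last, which mirrors, in a very simplified way, how a relatedness score could both widen recall and order retrieval results.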
Pages: 22