Revisiting Keyword Analysis in a Specialized Corpus: Religious Terminology Extraction

Cited by: 3
Authors
Lien, Hsin-Yi [1]
Affiliations
[1] Ming Chuan Univ, Grad Sch Educ, Taoyuan Dist, Taiwan
DOI
10.1080/09296174.2020.1865668
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
This study investigates keyword extraction using a compiled Buddhist corpus. It sets out a basic procedure for generating and refining keywords that combines statistical measures with manual screening against specific criteria. The resulting Buddhist Word List contains 1244 keywords, including 375 Pali words in Buddhist literacy. We compared the results of applying frequency of occurrence, log-likelihood (LL), and odds ratio (OR) in keyword analyses, each of which produced a different keyword ranking. Our results show that statistical measures are useful for identifying keywords in specific fields and that OR is more effective at identifying technical terms. We demonstrate that multilevel keyword analysis is more effective at identifying high-frequency technical words than any of these methods used alone. Multilevel methods are recommended for creating future domain-specific vocabulary lists, to overcome the inherent flaws of individual analytic methods.
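As a rough illustration of the two statistical measures named in the abstract, the sketch below computes a log-likelihood score and an odds ratio for one word from its frequency in a target corpus and a reference corpus. The formulas are the ones commonly used in corpus linguistics and are an assumption here; the abstract does not give the exact formulation the study used. The function names, the 0.5 continuity correction, and the sample counts are hypothetical and are not taken from the study.

```python
import math


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score (assumed standard two-corpus form).

    a: frequency of the word in the target corpus
    b: frequency of the word in the reference corpus
    c: total tokens in the target corpus
    d: total tokens in the reference corpus
    """
    # Expected frequencies if the word were equally likely in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    total = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        # A zero observed count contributes nothing to the sum.
        if observed > 0:
            total += observed * math.log(observed / expected)
    return 2 * total


def odds_ratio(a, b, c, d, correction=0.5):
    """Odds ratio of the word occurring in the target vs. reference corpus.

    The continuity correction (an assumption of this sketch) is added to
    every cell so that a zero count does not cause division by zero.
    """
    k = correction
    return ((a + k) * (d - b + k)) / ((b + k) * (c - a + k))


# Hypothetical counts, invented for illustration only:
# word -> (frequency in target corpus, frequency in reference corpus)
counts = {"word_a": (120, 3), "word_b": (900, 850), "word_c": (15, 0)}
target_size, reference_size = 100_000, 1_000_000

# Rank the same words by each measure to show the rankings can differ.
by_ll = sorted(
    counts,
    key=lambda w: log_likelihood(*counts[w], target_size, reference_size),
    reverse=True,
)
by_or = sorted(
    counts,
    key=lambda w: odds_ratio(*counts[w], target_size, reference_size),
    reverse=True,
)
```

Sorting the same word list by each score separately, as in the last lines, is one simple way to see that different measures can rank the same words differently.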
Pages: 269-282
Page count: 14