Bootstrapping of Semantic Relation Extraction for a Morphologically Rich Language: Semi-Supervised Learning of Semantic Relations

Cited by: 1
Authors
Jagan, Balaji [1]
Parthasarathi, Ranjani [2]
Geetha, T. V. [1]
Affiliations
[1] Anna Univ, Dept Comp Sci & Engn, Chennai, Tamil Nadu, India
[2] Anna Univ, Coll Engn, Dept Informat Sci & Technol, Chennai, Tamil Nadu, India
Keywords
Bootstrapping; Semantic Relations; Semi-supervised Learning; Universal Networking Language
DOI
10.4018/IJSWIS.2019010106
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article focuses on a bootstrapping approach to extracting the semantic relations that hold between two different concepts in Tamil text. The proposed system, the bootstrapping approach to semantic UNL relation extraction (BASURE), extracts generic relations between different components of a sentence by exploiting the morphological richness of Tamil. Tamil is a partially free word order language, which means that the constituents joined by a semantic relation can occur anywhere in the sentence, not necessarily in a fixed order. The authors use Universal Networking Language (UNL), an interlingua framework, to represent word-based features, and aim to define the UNL semantic relations that hold between any two constituents of a sentence. The morphological suffix, lexical category, and UNL semantic constraints associated with a word form the tuples of the pattern used for bootstrapping. Most systems define the initial set of seed patterns manually; this article instead uses a rule-based approach to obtain the word-based features that form the tuples of the patterns. Bootstrapping is then applied to extract all possible instances from the corpus and to generate new patterns. The authors also introduce the use of the UNL Ontology to measure the semantic similarity between the semantic tuples of patterns and hence to learn new patterns from the text corpus iteratively. The use of the UNL Ontology makes the approach general and domain independent. The results are evaluated and compared with existing approaches; they show that the approach is generic, can extract all sentence-based semantic UNL relations, and significantly improves the performance of generic semantic relation extraction.
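The bootstrapping loop described in the abstract can be illustrated with a minimal sketch. This is not the BASURE implementation: the toy ontology, the transliterated words and suffixes, the feature names (`suffix`, `pos`, `sem`), the relation label, and the shared-parent similarity test are all assumptions made for illustration. The sketch only shows the general idea of matching (suffix, lexical category, semantic class) tuples regardless of word order and promoting ontology-similar matches to new patterns.

```python
from itertools import permutations

# Hypothetical toy ontology (child -> parent); a stand-in for the UNL Ontology.
ONTOLOGY = {"city": "place", "village": "place"}

def similar(a, b):
    """Assumed similarity test: same class, or both classes share a parent."""
    return a == b or (a in ONTOLOGY and ONTOLOGY.get(a) == ONTOLOGY.get(b))

def features(word):
    """Pattern tuple for one word: (suffix, lexical category, semantic class)."""
    return (word["suffix"], word["pos"], word["sem"])

def matches(pattern, w1, w2):
    """A pattern is (tuple_for_word1, tuple_for_word2, relation_label)."""
    for (suffix, pos, sem), w in zip(pattern[:2], (w1, w2)):
        if (w["suffix"], w["pos"]) != (suffix, pos) or not similar(w["sem"], sem):
            return False
    return True

def bootstrap(corpus, seeds, max_iters=5):
    patterns, instances = set(seeds), set()
    for _ in range(max_iters):
        found = set()
        for sentence in corpus:
            # Word order is not fixed, so every ordered word pair is tried.
            for w1, w2 in permutations(sentence, 2):
                for pat in patterns:
                    if matches(pat, w1, w2):
                        instances.add((w1["word"], pat[2], w2["word"]))
                        found.add((features(w1), features(w2), pat[2]))
        if found <= patterns:  # no new pattern learned: stop iterating
            break
        patterns |= found
    return patterns, instances

# Illustrative data only; words, suffixes, and the "plt" label are placeholders.
GO = {"word": "po", "suffix": "aan", "pos": "V", "sem": "motion"}
CORPUS = [
    [{"word": "chennai", "suffix": "kku", "pos": "N", "sem": "city"}, GO],
    [GO, {"word": "kiramam", "suffix": "kku", "pos": "N", "sem": "village"}],
]
SEEDS = [(("kku", "N", "city"), ("aan", "V", "motion"), "plt")]

patterns, instances = bootstrap(CORPUS, SEEDS)
```

In this toy run the single seed pattern also matches the second sentence, even though its words appear in the opposite order and the noun's class differs, because the two classes share a parent in the assumed ontology. That match is then added as a new pattern, and the loop stops once an iteration yields nothing new.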
Pages: 119-149
Page count: 31