Improving ontology-based text classification: An occupational health and security application

Cited by: 29
Authors
Sanchez-Pi, Nayat [1]
Marti, Luis [2]
Bicharra Garcia, Ana Cristina [2]
Affiliations
[1] Univ Estado Rio De Janeiro, Inst Math & Stat, Rio De Janeiro, RJ, Brazil
[2] Univ Fed Fluminense, Inst Comp, Niteroi, RJ, Brazil
Keywords
Text classification; Ontology; Oil and gas industry;
DOI
10.1016/j.jal.2015.09.008
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Information retrieval has been widely studied due to the growing amount of textual information available electronically. Organizations and industries now face the challenge of organizing, analyzing and extracting knowledge from masses of unstructured information to support decision making. The development of automatic methods that produce usable structured information from unstructured text sources is extremely valuable to them. In contrast to traditional text classification methods, which need a well-classified training corpus to classify efficiently, an ontology-based classifier benefits from domain knowledge and provides higher accuracy. In previous work we proposed and evaluated an ontology-based heuristic algorithm [28] for the occupational health control process, in particular for the automatic detection of accidents from unstructured texts. Our extended proposal is more domain dependent because it uses technical terms and contrasts the relevance of these technical terms within the text, so the heuristic is more accurate. It divides the problem into subtasks: (i) text analysis, (ii) recognition and (iii) classification of failed occupational health control, resolving accidents. (C) 2015 Elsevier B.V. All rights reserved.
Pages: 48-58
Number of pages: 11
Related papers
34 in total
[1]  
[Anonymous], 1998, MACHINE LEARNING ECM, DOI 10.1007/BFB0026666
[2]  
[Anonymous], 2004, Lucene in Action
[3]  
[Anonymous], 2000, NATURE STAT LEARNING, DOI 10.1007/978-1-4757-3264-1
[4]  
[Anonymous], 1999, WordNet
[5]  
[Anonymous], 2003, P 12 INT C WORLD WID, DOI 10.1145/775152.775226
[6]   Text classification by boosting weak learners based on terms and concepts [J].
Bloehdorn, S ;
Hotho, A .
FOURTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2004, :331-334
[7]  
Bodner Richard C, 1996, Knowledge-based approaches to query expansion in information retrieval
[8]  
Camous F, 2007, LECT NOTES COMPUT SC, V4414, P439
[9]  
De la Prieta F., 2014, 2 INT WORKSH LEARN T, P193
[10]   Biomedic Organizations: An intelligent dynamic architecture for KDD [J].
De Paz, Juan F. ;
Bajo, Javier ;
Lopez, Vivian F. ;
Corchado, Juan M. .
INFORMATION SCIENCES, 2013, 224 :49-61