Semantic Knowledge Extraction from Research Documents

Cited by: 10
Authors
Upadhyay, Rishabh [1]
Fujii, Akihiro [1]
Affiliation
[1] Hosei Univ, Dept Appl Informat, Tokyo, Japan
Source
PROCEEDINGS OF THE 2016 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS (FEDCSIS) | 2016 / Vol. 8
Keywords
Knowledge extraction; Semantics; Ontology; Discourse; Science and technology foresight;
DOI
10.15439/2016F221
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we design a knowledge-support software system that extracts sentences and keywords from a large-scale document database. The system is built on a semantic representation scheme for natural language processing of the document database. Documents, originally in PDF form, are pre-processed and then broken into triple-store data. The semantic representation is a hypergraph consisting of collections of binary relations, or 'triples'. Following rules based on the user's interests, the system identifies sentences and words of interest. The relationships among the extracted sentences are visualized as a network graph. A user can introduce new rules to extract additional knowledge from the database or from an individual paper. As a practical case study, we choose a set of research papers related to IoT. By applying several rules concerning the authors' indicated keywords as well as the system's specified discourse words, significant knowledge is extracted from the papers.
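The pipeline summarized in the abstract (sentences stored as triples, a user rule that selects sentences, and a graph linking the selected sentences) can be illustrated with a minimal sketch. This is not the authors' implementation: the predicate names `hasText` and `hasWord`, the helper functions, the rule itself (a sentence must contain at least one keyword and one discourse word), and the sample sentences are all hypothetical, and a plain Python list stands in for a real triple store.

```python
from itertools import combinations


def sentence_triples(sent_id, text):
    """Turn one sentence into (subject, predicate, object) triples.

    'hasText' and 'hasWord' are made-up predicate names for illustration.
    """
    words = [w.strip(".,;:").lower() for w in text.split()]
    triples = [(sent_id, "hasText", text)]
    triples += [(sent_id, "hasWord", w) for w in words if w]
    return triples


def apply_rule(store, keywords, discourse_words):
    """Assumed rule: keep sentences holding a keyword AND a discourse word.

    Returns a dict mapping each kept sentence id to the keywords it contains.
    """
    words_by_sentence = {}
    for subject, predicate, obj in store:
        if predicate == "hasWord":
            words_by_sentence.setdefault(subject, set()).add(obj)
    return {
        s: ws & keywords
        for s, ws in words_by_sentence.items()
        if ws & keywords and ws & discourse_words
    }


def link_sentences(matches):
    """Build graph edges between kept sentences that share a keyword."""
    return [
        (a, b)
        for a, b in combinations(sorted(matches), 2)
        if matches[a] & matches[b]
    ]


# Hypothetical sample sentences standing in for pre-processed PDF text.
sentences = {
    "s1": "We propose a gateway for IoT sensors.",
    "s2": "However, IoT security remains open.",
    "s3": "The weather was nice.",
}
store = [t for sid, text in sentences.items() for t in sentence_triples(sid, text)]

matches = apply_rule(
    store,
    keywords={"iot", "security"},
    discourse_words={"propose", "however"},
)
edges = link_sentences(matches)
```

Here `matches` keeps only the two IoT sentences and `edges` connects them, which is the kind of sentence network the abstract describes visualizing. Adding a new rule would amount to writing another function with the same shape as `apply_rule`.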
Pages: 439-445
Page count: 7