Extracting Threshold Conceptual Structures from Web Documents

Cited by: 3

Authors:

Ciobanu, Gabriel [1]

Horne, Ross [1]

Vaideanu, Cristian [2]

Affiliations:

[1] Romanian Acad, Inst Comp Sci, Iasi, Romania

[2] AI Cuza Univ Ia, Fac Math, Iasi, Romania

Source:

GRAPH-BASED REPRESENTATION AND REASONING | 2014 / Vol. 8577

Keywords:

DOI:

10.1007/978-3-319-08389-6_12

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper we describe an iterative approach based on formal concept analysis to refine the information retrieval process. Based on weights for ranking documents, we define a weighted formal context. We use a Galois connection to introduce a new type of formal concept that allows us to work with specific thresholds for searching words in Web documents. By increasing the threshold, we obtain smaller lattices with more relevant concepts, thus improving the retrieval of more specific items. We use techniques for processing large data sets in parallel to generate sequences of Galois lattices, overcoming the time complexity of building a lattice for an entire large context.
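
The abstract does not give the formal definitions, so the following is only a minimal sketch of one plausible reading, not the authors' construction: a weighted context is cut at a threshold (a document has a word when its weight reaches the threshold), and the formal concepts of the resulting binary context are enumerated with the usual pair of derivation operators. The weights, document names, words, and function names are all hypothetical, and the parallel generation of lattice sequences mentioned in the abstract is not modelled.

```python
from itertools import combinations

# Hypothetical weighted context: WEIGHTS[doc][word] is a ranking weight in [0, 1].
WEIGHTS = {
    "d1": {"lattice": 0.9, "galois": 0.7, "web": 0.2},
    "d2": {"lattice": 0.6, "galois": 0.1, "web": 0.8},
    "d3": {"lattice": 0.3, "galois": 0.8, "web": 0.5},
}
WORDS = ("lattice", "galois", "web")


def extent(words, threshold):
    """Documents whose weight for every word in `words` reaches the threshold."""
    return frozenset(
        d for d, row in WEIGHTS.items()
        if all(row[w] >= threshold for w in words)
    )


def intent(docs, threshold):
    """Words whose weight reaches the threshold in every document in `docs`."""
    return frozenset(
        w for w in WORDS
        if all(WEIGHTS[d][w] >= threshold for d in docs)
    )


def concepts(threshold):
    """Enumerate (extent, intent) pairs by closing every subset of words.

    Brute force over all word subsets: fine for a toy context only
    (assumption: this is for illustration, not for large contexts).
    """
    found = set()
    for k in range(len(WORDS) + 1):
        for subset in combinations(WORDS, k):
            docs = extent(subset, threshold)
            found.add((docs, intent(docs, threshold)))
    return found


if __name__ == "__main__":
    for t in (0.5, 0.7, 0.9):
        print(t, len(concepts(t)))
```

On this toy data the number of concepts drops from 8 at threshold 0.5 to 5 at 0.7 and 3 at 0.9, which illustrates the abstract's remark that a higher threshold gives smaller lattices; whether the lattice shrinks at every step depends on the data.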

Pages: 130-144

Page count: 15

