Extracting Threshold Conceptual Structures from Web Documents

Cited by: 3
Authors
Ciobanu, Gabriel [1]
Horne, Ross [1]
Vaideanu, Cristian [2]
Affiliations
[1] Romanian Acad, Inst Comp Sci, Iasi, Romania
[2] AI Cuza Univ Ia, Fac Math, Iasi, Romania
Source
GRAPH-BASED REPRESENTATION AND REASONING | 2014 / Vol. 8577
DOI
10.1007/978-3-319-08389-6_12
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we describe an iterative approach, based on formal concept analysis, to refine the information retrieval process. Based on weights for ranking documents, we define a weighted formal context. We use a Galois connection to introduce a new type of formal concept that allows us to work with specific thresholds when searching for words in Web documents. By increasing the threshold, we obtain smaller lattices with more relevant concepts, thus improving the retrieval of more specific items. We use techniques for processing large data sets in parallel to generate sequences of Galois lattices, overcoming the time complexity of building a lattice for an entire large context.
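
The abstract does not spell out the definition of a threshold concept, so the following is only a minimal sketch under a simple assumption: a term counts as belonging to a document when its weight is at least the threshold, and concepts are then the closed (extent, intent) pairs of the resulting pair of derivation operators. The data in `WEIGHTS` and the helper names `extent`, `intent`, and `concepts` are hypothetical and are not taken from the paper; the brute-force enumeration is for illustration only and is not the parallel technique the authors describe.

```python
from itertools import combinations

# Hypothetical toy data: WEIGHTS[doc][term] is a ranking weight in [0, 1].
WEIGHTS = {
    "d1": {"lattice": 0.9, "galois": 0.7, "web": 0.2},
    "d2": {"lattice": 0.6, "galois": 0.1, "web": 0.8},
    "d3": {"lattice": 0.3, "galois": 0.8, "web": 0.5},
}
TERMS = sorted({t for row in WEIGHTS.values() for t in row})


def extent(terms, threshold):
    """Documents whose weight reaches the threshold for every given term."""
    return frozenset(
        d for d, row in WEIGHTS.items()
        if all(row.get(t, 0.0) >= threshold for t in terms)
    )


def intent(docs, threshold):
    """Terms whose weight reaches the threshold in every given document."""
    return frozenset(
        t for t in TERMS
        if all(WEIGHTS[d].get(t, 0.0) >= threshold for d in docs)
    )


def concepts(threshold):
    """Enumerate all closed (extent, intent) pairs by brute force.

    Every concept has the form (extent(B), intent(extent(B))) for some
    term set B, so closing each subset of TERMS finds all of them.
    This is exponential in the number of terms; toy contexts only.
    """
    found = set()
    for size in range(len(TERMS) + 1):
        for subset in combinations(TERMS, size):
            docs = extent(subset, threshold)
            found.add((docs, intent(docs, threshold)))
    return found


if __name__ == "__main__":
    for t in (0.5, 0.8):
        print(t, len(concepts(t)))
```

On this toy context the lattice has 8 concepts at threshold 0.5 and 5 at threshold 0.8, which matches the abstract's observation that raising the threshold yields smaller lattices.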
Pages: 130-144
Page count: 15