A frequency-based approach for mining coverage statistics in data integration

Cited by: 4
Authors
Nie, ZQ [1]
Kambhampati, S [1]
Affiliations
[1] Arizona State Univ, Dept Comp Sci & Engn, Tempe, AZ 85287 USA
DOI
10.1109/ICDE.2004.1320013
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Query optimization in data integration requires source coverage and overlap statistics. Gathering and storing these statistics presents many challenges, not the least of which is controlling the amount of statistics learned. In this paper we introduce StatMiner, a novel statistics mining approach that automatically generates attribute value hierarchies, efficiently discovers frequently accessed query classes based on the learned attribute value hierarchies, and learns statistics only with respect to these classes. We describe the details of our method and present experimental results demonstrating the efficiency and effectiveness of our approach. Our experiments are done in the context of BibFinder, a publicly fielded bibliography mediator.
Pages: 387-398
Page count: 12
Related papers
50 results in total
  • [1] Reductions for Frequency-Based Data Mining Problems
    Neumann, Stefan
    Miettinen, Pauli
    2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2017, : 997 - 1002
  • [2] Effectively mining and using coverage and overlap statistics for data integration
    Nie, ZQ
    Kambhampati, S
    Nambiar, U
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (05) : 638 - 651
  • [3] An Alternative to Disproportionality: A Frequency-Based Method for Pharmacovigilance Data Mining
    Jeremy D. Jokinen
    Fabio Lievano
    Linda Scarazzini
    Melissa Truffa
    Therapeutic Innovation & Regulatory Science, 2018, 52 : 294 - 299
  • [4] Frequency-based Rare Events Mining in Administrative Health Data
    Chen, Jie
    Jin, Huidong
    He, Hongxing
    O'Keefe, Christine M.
    Sparks, Ross
    Williams, Graham
    McAullay, Damien
    Kelman, Chris
    ELECTRONIC JOURNAL OF HEALTH INFORMATICS, 2006, 1 (01):
  • [5] An Alternative to Disproportionality: A Frequency-Based Method for Pharmacovigilance Data Mining
    Jokinen, Jeremy D.
    Lievano, Fabio
    Scarazzini, Linda
    Truffa, Melissa
    THERAPEUTIC INNOVATION & REGULATORY SCIENCE, 2018, 52 (03) : 294 - 299
  • [6] A Frequency-Based Algorithm for Workflow Outlier Mining
    Chuang, Yu-Cheng
    Hsu, PingYu
    Wang, MinTzu
    Chen, Sin-Cheng
    FUTURE GENERATION INFORMATION TECHNOLOGY, 2010, 6485 : 191 - 207
  • [7] Contextual Classification Of Remotely Sensed Data Using Frequency-Based Approach
    Mustapha, M. R.
    Lim, H. S.
    MatJafri, M. Z.
    Syahreza, S.
    4TH ASIAN PHYSICS SYMPOSIUM: AN INTERNATIONAL EVENT, 2010, 1325 : 289 - 292
  • [8] A frequency-based approach to intrusion detection
    Zhou, M
    Lang, SD
    7TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL III, PROCEEDINGS: COMMUNICATION, NETWORK AND CONTROL SYSTEMS, TECHNOLOGIES AND APPLICATIONS, 2003, : 393 - 397
  • [9] CoverSize: A Global Constraint for Frequency-Based Itemset Mining
    Schaus, Pierre
    Aoga, John O. R.
    Guns, Tias
    PRINCIPLES AND PRACTICE OF CONSTRAINT PROGRAMMING (CP 2017), 2017, 10416 : 529 - 546
  • [10] Accurate frequency-based lexicon generation for opinion mining
    Keshavarz, Hamidreza
    Abadeh, Mohammad Saniee
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2017, 33 (04) : 2223 - 2234