Association Rule Mining from large datasets of clinical invoices document

Cited by: 0
Authors
Agapito, Giuseppe [1]
Calabrese, Barbara [1]
Guzzi, Pietro Hiram [1]
Graziano, Sabrina [2]
Cannataro, Mario [1]
Affiliations
[1] Univ Catanzaro, Dept Surg & Med Sci, Data Analyt Res Ctr, Catanzaro, Italy
[2] Open Knowledge Technol, Arcavacata Di Rende, Italy
Source
2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM) | 2019
Keywords
Data Mining; Association Rules; Electronic Invoices; Knowledge Discovery from Databases; DISCOVERY; KNOWLEDGE;
DOI
Not available
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Massive data generation now affects several domains, including marketing (for example, the electronic invoices of large retailers), web access log files, healthcare, and the life sciences. These web activities have introduced a new way to pay, the electronic invoice (eInvoice), which replaces the paper invoice. For this reason, eInvoicing can be thought of as an innovative digital infrastructure for issuing, transmitting, and storing invoices. The availability of large volumes of eInvoices makes it possible to discover new knowledge in these domains through data mining, so users can extract knowledge from large collections of invoice documents. In this paper, we present a software tool for mining association rules from invoices produced in healthcare centers. In particular, the tool adopts a novel preprocessing methodology that merges, cleans, formats, and summarizes eInvoices. The methodology can improve the quality of a large volume of clinical invoices by reducing the quantity of irrelevant data, which makes the remaining data suitable for mining information in the form of association rules. The core of the tool extracts association rules from eInvoices; as a case study, we discuss the mined rules, highlighting the relationships among the purchased goods.
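To make the idea in the abstract concrete, the following is a minimal sketch of cleaning invoice lines and mining pairwise association rules with support and confidence. It is not the authors' tool: the invoice data, the function names `to_transactions` and `mine_pair_rules`, and the thresholds are all assumptions made for illustration, and the cleaning step stands in for a much simpler version of the preprocessing the abstract describes.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical invoice lines (invoice id, item description); not data from the paper.
INVOICE_LINES = [
    ("inv1", "Gloves "), ("inv1", "syringe"), ("inv1", "GAUZE"),
    ("inv2", "gloves"), ("inv2", "Syringe"),
    ("inv3", "gloves"), ("inv3", "gauze"),
    ("inv4", "syringe"), ("inv4", ""),
]


def to_transactions(lines):
    """Normalize item names and group them into one item set per invoice."""
    baskets = defaultdict(set)
    for invoice_id, item in lines:
        name = item.strip().lower()
        if name:  # drop empty descriptions (irrelevant data)
            baskets[invoice_id].add(name)
    return list(baskets.values())


def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return rules (lhs, rhs, support, confidence) over item pairs.

    support(A -> B)    = fraction of invoices containing both A and B
    confidence(A -> B) = count(A and B) / count(A)
    """
    n = len(transactions)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for basket in transactions:
        for item in basket:
            item_count[item] += 1
        for pair in combinations(sorted(basket), 2):
            pair_count[pair] += 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    for lhs, rhs, support, confidence in mine_pair_rules(to_transactions(INVOICE_LINES)):
        print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

The sketch only considers rules between two single items; a full miner in the Apriori or FP-growth style would extend the same counts to larger item sets.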
Pages: 2232-2238
Page count: 7