Data Mining from NoSQL Document-Append Style Storages

Cited by: 1
Authors
Lomotey, Richard K. [1]
Deters, Ralph [1]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK S7N 0W0, Canada
Keywords
Data mining; NoSQL; Bayesian Rule; Unstructured data; Apriori; Big Data;
DOI
10.1109/ICWS.2014.62
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
The modern data economy, which has been described as "Big Data", has changed the status quo of digital content creation and storage. While data storage has followed the schema-dictated approach for decades, today's digital content is largely unstructured, which creates the need for different storage techniques. Thus, NoSQL database systems have been proposed to accommodate most of the content being generated today. One class of NoSQL database that has received significant enterprise adoption is document-append style storage. The emerging challenge, however, is that research and tools that can aid data mining from such NoSQL databases are generally lacking. Even though document-append style storages expose data as Web services and over URLs/URIs, building a corresponding data mining tool requires techniques that deviate from those underlying web crawlers. Also, existing data mining tools designed for schema-based storages (e.g., RDBMS) are misfits. Hence, our goal in this work is to design a data analytics tool that enables knowledge discovery through information retrieval from document-append style storage. The tool is algorithmically built on the inference-based Apriori, which helps us optimize the search duration. Preliminary test results of the proposed tool also show high accuracy in comparison to previously proposed approaches.
Pages: 385-392
Number of pages: 8
Related papers
50 in total
  • [1] Terms Mining in Document-Based NoSQL: Response to Unstructured Data
    Lomotey, Richard K.
    Deters, Ralph
    2014 IEEE INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS), 2014, : 661 - 668
  • [2] Evaluating NoSQL document oriented data model
    Hashem, Hadi
    Ranc, Daniel
    2016 IEEE 4TH INTERNATIONAL CONFERENCE ON FUTURE INTERNET OF THINGS AND CLOUD WORKSHOPS (FICLOUDW), 2016, : 51 - 56
  • [3] Data Ingestion from a Data Lake: The Case of Document-oriented NoSQL Databases
    Abdelhedi, Fatma
    Jemmali, Rym
    Zurfluh, Gilles
    ICEIS: PROCEEDINGS OF THE 24TH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS - VOL 1, 2022, : 226 - 233
  • [4] Document-oriented Models for Data Warehouses NoSQL Document-oriented for Data Warehouses
    Chevalier, Max
    El Malki, Mohammed
    Kopliku, Arlind
    Teste, Olivier
    Tournier, Ronan
    PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS, VOL 1 (ICEIS), 2016, : 142 - 149
  • [5] From Document Warehouse to Column-Oriented NoSQL Document Warehouse
    Ben Messaoud, Ines
    Ben Ali, Refka
    Feki, Jamel
    ICSOFT: PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON SOFTWARE TECHNOLOGIES, 2017, : 85 - 94
  • [6] MDA Process to Extract the Data Model from Document-oriented NoSQL Database
    Brahim, Amal Ait
    Ferhat, Rabah Tighilt
    Zurfluh, Gilles
    PROCEEDINGS OF THE 21ST INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS (ICEIS), VOL 1, 2019, : 141 - 148
  • [7] A data replication strategy for document-oriented NoSQL systems
    Tabet, Khaoula
    Mokadem, Riad
    Laouar, Mohamed Ridda
    INTERNATIONAL JOURNAL OF GRID AND UTILITY COMPUTING, 2019, 10 (01) : 53 - 62
  • [8] Data Modeling Guidelines for NoSQL Document-Store Databases
    Imam, Abdullahi Abubakar
    Basri, Shuib
    Ahmad, Rohiza
    Watada, Junzo
    González-Aparicio, Maria T.
    Almomani, Malek Ahmad
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2018, 9 (10) : 544 - 555
  • [9] NoSQL document data migration strategy in the context of schema evolution
    Fedushko, Solomiia
    Malyi, Roman
    Syerov, Yuriy
    Serdyuk, Pavlo
    DATA & KNOWLEDGE ENGINEERING, 2024, 154
  • [10] NoSQL document store translation to data vault based EDW
    Cernjeka, Katerina
    Jaksic, Danijela
    Jovanovic, Vladan
    2018 41ST INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2018, : 1197 - 1202