Terms Mining in Document-Based NoSQL: Response to Unstructured Data

Cited by: 3
Authors
Lomotey, Richard K. [1]
Deters, Ralph [1]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK S7N 0W0, Canada
Source
2014 IEEE INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS) | 2014
Keywords
Unstructured Data Mining; Big Data; Viterbi algorithm; Terms; NoSQL; Association Rules; classification; clustering
DOI
10.1109/BigData.Congress.2014.99
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Unstructured data mining has become topical recently due to the availability of high-dimensional and voluminous digital content (known as "Big Data") across the enterprise spectrum. Relational Database Management Systems (RDBMS) have been employed over the past decades for content storage and management, but the ever-growing heterogeneity in today's data calls for a new storage approach. Thus, NoSQL databases have emerged as the preferred storage facility because they support unstructured data storage. This creates the need to explore efficient techniques for mining data from such NoSQL systems, since the available tools and frameworks, which are designed for RDBMS, are often not directly applicable. In this paper, we focus on topics and terms mining, based on clustering, in document-based NoSQL. This is achieved by adapting the architectural design of an analytics-as-a-service framework and by proposing the Viterbi algorithm to enhance the accuracy of terms classification in the system. The results from the pilot testing of our work show higher accuracy in comparison to some previously proposed techniques such as parallel search.
Pages: 661-668
Page count: 8
Related papers
50 in total
  • [21] A Study of Genomic Data Provenance in NoSQL Document-Oriented Database Systems
    Guimaraes, Valeria
    Hondo, Fernanda
    Almeida, Rodrigo
    Vera, Harley
    Holanda, Maristela
    Araujo, Aleteia
    Walter, Maria Emilia
    Lifschitz, Sergio
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2015: 1525 - 1531
  • [22] UML4NOSQL: A NOVEL APPROACH FOR MODELING NOSQL DOCUMENT-ORIENTED DATABASES BASED ON UML
    Maicha, Mohammed ElHabib
    Ouinten, Youcef
    Ziani, Benameur
    COMPUTING AND INFORMATICS, 2022, 41 (03) : 813 - 833
  • [23] Ontology Based Data Integration of NoSQL Datastores
    Kiran, V. K.
    Vijayakumar, R.
    2014 9TH INTERNATIONAL CONFERENCE ON INDUSTRIAL AND INFORMATION SYSTEMS (ICIIS), 2014: 423 - 428
  • [24] Cloud-based NoSQL Data Migration
    Bansel, Aryan
    Gonzalez-Velez, Horacio
    Chis, Adriana E.
    2016 24TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP), 2016: 224 - 231
  • [25] Data Warehouse Based on NoSQL: a literature mapping
    de Oliveira, Beatriz Fragnan P.
    Victorino, Marcio de Carvalho
    Holanda, Maristela
    PROCEEDINGS OF 2021 16TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI'2021), 2021
  • [26] Enhanced Elearning Application for Data Mining in a NoSQL Distributed Database Management System
    Valentin, Pupezescu
    Mailena-Catalina, Dragomir
    PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON VIRTUAL LEARNING, ICVL 2019, 2019: 476 - 482
  • [27] MDA Process to Extract the Data Model from Document-oriented NoSQL Database
    Brahim, Amal Ait
    Ferhat, Rabah Tighilt
    Zurfluh, Gilles
    PROCEEDINGS OF THE 21ST INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS (ICEIS), VOL 1, 2019: 141 - 148
  • [28] Real-Time Effective Framework for Unstructured Data Mining
    Lomotey, Richard K.
    Deters, Ralph
    2013 12TH IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2013), 2013: 1081 - 1088
  • [29] RSenter: Tool for Topics and Terms Extraction from Unstructured Data Debris
    Lomotey, Richard K.
    Deters, Ralph
    2013 IEEE INTERNATIONAL CONGRESS ON BIG DATA, 2013: 395 - 402
  • [30] Web Service Clustering Approach Based on Network and Fused Document-Based and Tag-Based Topics Similarity
    Ping, Deng Li
    Bing, Guo
    Wen, Zheng
    INTERNATIONAL JOURNAL OF WEB SERVICES RESEARCH, 2021, 18 (03) : 63 - 81