Terms Mining in Document-Based NoSQL: Response to Unstructured Data

Cited by: 3
Authors
Lomotey, Richard K. [1]
Deters, Ralph [1]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK S7N 0W0, Canada
Source
2014 IEEE INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS) | 2014
Keywords
Unstructured Data Mining; Big Data; Viterbi algorithm; Terms; NoSQL; Association Rules; classification; clustering
DOI
10.1109/BigData.Congress.2014.99
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Unstructured data mining has become topical recently due to the availability of high-dimensional and voluminous digital content (known as "Big Data") across the enterprise spectrum. Relational Database Management Systems (RDBMS) have been employed over the past decades for content storage and management, but the ever-growing heterogeneity of today's data calls for a new storage approach. Thus, the NoSQL database has emerged as the preferred storage facility, since it supports unstructured data storage. This creates the need to explore efficient techniques for mining data from such NoSQL systems, since the available tools and frameworks, which are designed for RDBMS, are often not directly applicable. In this paper, we focus on topics and terms mining, based on clustering, in document-based NoSQL. This is achieved by adapting the architectural design of an analytics-as-a-service framework and by proposing the Viterbi algorithm to enhance the accuracy of terms classification in the system. The results from pilot testing of our work show higher accuracy in comparison to some previously proposed techniques such as parallel search.
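The abstract names the Viterbi algorithm but does not describe the underlying model. As a rough illustration only, the following is a generic Viterbi decoder in Python, assuming a hidden Markov model in which the hidden states are hypothetical term categories ("keyword", "filler") and the observations are tokens. The states, tokens, probabilities, and the smoothing floor are all invented for this sketch and are not taken from the paper.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely hidden-state sequence for the observations.

    Works in log space; probabilities below `floor` (including unseen
    emissions) are clamped to `floor`, which is an assumption of this sketch.
    """
    if not observations:
        return []

    def lg(p):
        return math.log(max(p, floor))

    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: lg(start_p[s]) + lg(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + lg(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + lg(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

# Hypothetical two-category model (all numbers are illustrative).
STATES = ("keyword", "filler")
START_P = {"keyword": 0.6, "filler": 0.4}
TRANS_P = {
    "keyword": {"keyword": 0.7, "filler": 0.3},
    "filler": {"keyword": 0.4, "filler": 0.6},
}
EMIT_P = {
    "keyword": {"nosql": 0.5, "data": 0.4, "the": 0.1},
    "filler": {"nosql": 0.1, "data": 0.3, "the": 0.6},
}

decoded = viterbi(["nosql", "data", "the"], STATES, START_P, TRANS_P, EMIT_P)
# decoded == ["keyword", "keyword", "filler"]
```

Log probabilities are used so that long token sequences do not underflow; in a real system the transition and emission tables would be estimated from labeled documents rather than written by hand.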
Pages: 661-668
Number of pages: 8