Unstructured Data Extraction in Distributed NoSQL

Authors
Lomotey, Richard K. [1]
Deters, Ralph [1]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK S7N 0W0, Canada
Keywords
Unstructured data; big data; Hidden Markov Model (HMM); terms extraction; NoSQL; Re-usable dictionary; Association rules
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
While "Big data" has brought good tidings in the form of easy access to voluminous data, it has also brought challenges. The existing Knowledge Discovery in Database (KDD) processes, which were proposed for schema-oriented data sources, are no longer applicable since today's data is unstructured. Previously, we deployed a tool called TouchR, which relies on the Hidden Markov Model (HMM) to extract terms from unstructured data sources (specifically, NoSQL databases). This paper advances the initially deployed version by introducing a re-usable dictionary and association rules to improve the quality of the extracted terms. In addition, the tool in its present stage adapts better to user searches by taking the most frequently searched terms into account.
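The abstract does not describe how TouchR applies association rules, so the following is only a minimal sketch of the general technique: it counts how often extracted terms co-occur across documents and keeps the pairwise rules that clear a support and a confidence threshold. The function name `pairwise_rules`, the thresholds, and the sample terms are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def pairwise_rules(documents, min_support=0.5, min_confidence=0.6):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    `documents` is a list of term lists, one per document. This is a generic
    illustration of association rules, not the TouchR implementation.
    """
    n = len(documents)
    term_counts = Counter()
    pair_counts = Counter()
    for terms in documents:
        unique = sorted(set(terms))  # count each term once per document
        term_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of documents containing both terms
        if support < min_support:
            continue
        # Evaluate the rule in both directions: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / term_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


# Hypothetical terms, as if already extracted from four documents.
docs = [
    ["nosql", "document", "json"],
    ["nosql", "document"],
    ["nosql", "hmm"],
    ["document", "json"],
]
rules = pairwise_rules(docs)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

In a setting like the one the abstract outlines, rules of this kind could be used to suggest related terms alongside a user's search term, although the paper's actual rule-mining procedure may differ.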
Pages: 160-165
Page count: 6