Data Quality in Big Data Processing: Issues, Solutions and Open Problems

Cited by: 0
Authors
Zhang, Pengcheng [1]
Xiong, Fang [1]
Gao, Jerry [2, 3]
Wang, Jimin [1]
Affiliations
[1] Hohai Univ, Coll Comp & Informat, Nanjing, Jiangsu, Peoples R China
[2] San Jose State Univ, San Jose, CA 95192 USA
[3] Taiyuan Univ Technol, Taiyuan, Shanxi, Peoples R China
Source
2017 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTED, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI) | 2017
Funding
National Natural Science Foundation of China
Keywords
Big Data; Big data processing; Data Quality; Recommendation system; Prediction system; FUZZY C-MEANS;
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
With the rapid development of social networks, the Internet of Things, cloud computing, and other technologies, the era of big data is arriving. The growing volume of data has brought great value to the public and to enterprises, and how to manage and use big data better has become a focus across all sectors. The 4V characteristics of big data raise many issues for big data processing. Solving the data quality issue is key to big data processing, and ensuring data quality is a prerequisite for successfully applying big data techniques. In this paper, we take recommendation systems and prediction systems as typical big data applications and identify the data quality issues that arise during the data collection, data preprocessing, data storage, and data analysis stages of big data processing. Based on an analysis of these issues, we propose corresponding solutions. Finally, we raise some open problems to be solved in the future.
Pages: 7