A survey of big data architectures and machine learning algorithms in healthcare

Cited by: 71
Authors
Manogaran G. [1]
Lopez D. [1]
Affiliations
[1] School of Information Technology and Engineering, Vellore Institute of Technology University, Vellore
Keywords
big data; big data applications; big data architectures; big data opportunities and challenges; healthcare; machine learning algorithms
DOI
10.1504/IJBET.2017.087722
Abstract
Big data has gained much attention from researchers in healthcare, bioinformatics, and information sciences. As a result, data production at this stage will be 44 times greater than that in 2009. The volume, velocity, and variety of data are therefore increasing rapidly, and it is difficult to store, process, and visualise such huge data using traditional technologies. Many organisations, such as Twitter, LinkedIn, and Facebook, use big data for different use cases in the social networking domain, and implementations of the architectures for these use cases have been published worldwide. However, conceptual architectures for specific big data applications remain limited. This paper presents an application-oriented architecture for big data systems, based on a study of published big data architectures for specific use cases. It also provides an overview of state-of-the-art machine learning algorithms for processing big data in healthcare and other applications. © 2017 Inderscience Enterprises Ltd.
Pages: 182-211
Number of pages: 29