Deep learning applications and challenges in big data analytics

Cited by: 216
Authors
Najafabadi M.M. [1]
Villanustre F. [2]
Khoshgoftaar T.M. [1]
Seliya N. [1]
Wald R. [1]
Muharemagic E. [3]
Affiliations
[1] Florida Atlantic University, 777 Glades Road, Boca Raton, FL
[2] LexisNexis Business Information Solutions, 245 Peachtree Center Avenue, Atlanta, GA
[3] LexisNexis Business Information Solutions, 6601 Park of Commerce Blvd, Boca Raton, FL
Keywords
Big data; Deep learning
DOI
10.1186/s40537-014-0007-7
Abstract
Big Data Analytics and Deep Learning are two high-focus areas of data science. Big Data has become important as many organizations, both public and private, have been collecting massive amounts of domain-specific information, which can contain useful information about problems such as national intelligence, cyber security, fraud detection, marketing, and medical informatics. Companies such as Google and Microsoft are analyzing large volumes of data for business analysis and decisions, impacting existing and future technology. Deep Learning algorithms extract high-level, complex abstractions as data representations through a hierarchical learning process. Complex abstractions are learnt at a given level based on relatively simpler abstractions formulated in the preceding level in the hierarchy. A key benefit of Deep Learning is the analysis and learning of massive amounts of unsupervised data, making it a valuable tool for Big Data Analytics, where raw data is largely unlabeled and uncategorized. In the present study, we explore how Deep Learning can be utilized for addressing some important problems in Big Data Analytics, including extracting complex patterns from massive volumes of data, semantic indexing, data tagging, fast information retrieval, and simplifying discriminative tasks. We also investigate some aspects of Deep Learning research that need further exploration to incorporate specific challenges introduced by Big Data Analytics, including streaming data, high-dimensional data, scalability of models, and distributed computing. We conclude by presenting insights into relevant future work by posing some questions, including defining data sampling criteria, domain adaptation modeling, defining criteria for obtaining useful data abstractions, improving semantic indexing, semi-supervised learning, and active learning. © 2015, Najafabadi et al.; licensee Springer.
Related papers
58 entries in total
  • [1] Domingos P., A few useful things to know about machine learning, (2012)
  • [2] Dalal N., Triggs B., Histograms of oriented gradients for human detection, pp. 886-893, (2005)
  • [3] Lowe D.G., Object recognition from local scale-invariant features, pp. 1150-1157, (1999)
  • [4] Bengio Y., LeCun Y., Scaling learning algorithms towards AI, Large Scale Kernel Machines, pp. 321-360, (2007)
  • [5] Bengio Y., Courville A., Vincent P., Representation learning: A review and new perspectives, Pattern Analysis and Machine Intelligence, IEEE Transactions on, 35, 8, pp. 1798-1828, (2013)
  • [6] Arel I., Rose D.C., Karnowski T.P., Deep machine learning - a new frontier in artificial intelligence research [research frontier], IEEE Comput Intell, 5, pp. 13-18, (2010)
  • [7] Hinton G.E., Osindero S., Teh Y.-W., A fast learning algorithm for deep belief nets, Neural Comput, 18, 7, pp. 1527-1554, (2006)
  • [8] Larochelle H., Bengio Y., Louradour J., Lamblin P., Exploring strategies for training deep neural networks, J Mach Learn Res, 10, pp. 1-40, (2009)
  • [9] Salakhutdinov R., Hinton G.E., Deep Boltzmann machines, pp. 448-455, (2009)
  • [10] Goodfellow I., Lee H., Le Q.V., Saxe A., Ng A.Y., Measuring invariances in deep networks, pp. 646-654, (2009)