Towards a real-time processing framework based on improved distributed recurrent neural network variants with fastText for social big data analytics

Cited by: 60
Authors
Hammou, Badr Ait [1]
Lahcen, Ayoub Ait [1, 2]
Mouline, Salma [1]
Affiliations
[1] Mohammed V Univ, Rabat IT Ctr, Associated Unit CNRST URAC 29, Fac Sci,LRIT, Rabat, Morocco
[2] Ibn Tofail Univ, Natl Sch Appl Sci ENSA, LGS, Kenitra, Morocco
Keywords
Big data; FastText; Recurrent neural networks; LSTM; BiLSTM; GRU; Natural language processing; Sentiment analysis; Social big data analytics; Bidirectional LSTM; Classification
DOI
10.1016/j.ipm.2019.102122
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Big data generated by social media is a valuable source of information and offers an excellent opportunity to mine useful insights. In particular, user-generated content such as reviews, recommendations, and users' behavior data is useful for supporting the marketing activities of many companies. Knowing what users say in social media reviews about the products they bought or the services they used is a key factor in decision making. Sentiment analysis is one of the fundamental tasks in Natural Language Processing. Although deep learning for sentiment analysis has achieved great success and has allowed several firms to analyze and extract relevant information from their textual data, a model that runs in a traditional environment cannot remain effective as the volume of data grows, which underlines the importance of efficient distributed deep learning models for social big data analytics. Moreover, social media analysis is a complex process that involves a set of complex tasks. Therefore, it is important to address the challenges and issues of social big data analytics and to improve the classification accuracy of deep learning techniques in order to support better decisions. In this paper, we propose an approach for sentiment analysis that combines fastText with recurrent neural network variants to represent textual data efficiently. It then uses the new representations to perform the classification task. Its main objective is to enhance the performance of well-known Recurrent Neural Network (RNN) variants in terms of classification accuracy and to handle large-scale data. In addition, we propose a distributed intelligent system for real-time social big data analytics. It is designed to ingest, store, process, index, and visualize huge amounts of information in real time.
The proposed system adopts distributed machine learning with our proposed method to enhance decision-making processes. Extensive experiments conducted on two benchmark data sets demonstrate that our proposal for sentiment analysis outperforms well-known distributed recurrent neural network variants (i.e., Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), and Gated Recurrent Unit (GRU)). Specifically, we tested the efficiency of our approach using the three deep learning models. The results show that our proposed approach enhances the performance of all three models. The current work can benefit researchers and practitioners who want to collect, handle, analyze, and visualize several sources of information in real time. It can also contribute to a better understanding of public opinion and user behavior through our proposed system, which uses improved variants of the most powerful distributed deep learning and machine learning algorithms. Furthermore, it is able to increase the classification accuracy of several existing works based on RNN models for sentiment analysis.
Pages: 15