Semantic Query-Featured Ensemble Learning Model for SQL-Injection Attack Detection in IoT-Ecosystems

Cited by: 31
Authors
Gowtham, M. [1, 2]
Pramod, H. B. [3]
Affiliations
[1] Visvesvaraya Technol Univ, Belagavi 590018, India
[2] Rajeev Inst Technol, Dept Comp Sci & Engn, Hassan 573201, India
[3] Rajeev Inst Technol, Dept Comp Sci & Engn, Hassan 573201, India
Keywords
Feature extraction; Databases; Semantics; Real-time systems; Predictive models; Computational modeling; Data models; Ensemble learning; Internet-of-Things (IoT) database security; natural language processing; SQL-injection attack (SQLIA) detection; WEB; VULNERABILITIES
DOI
10.1109/TR.2021.3124331
CLC number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Structured query language (SQL) has emerged as one of the most widely used database query languages, serving an array of Internet-of-Things (IoT)-enabled services including web transactions, grid networks, industrial activity logs, proactive decision systems, smart homes, financial transactions, and business communication. With the rapid increase in SQL-driven IoT applications, the threat of SQL-injection attacks (SQLIAs) at the middleware layer has increased significantly. To address this threat, machine learning-based SQLIA-prediction systems have been proposed; however, the majority of existing methods are limited in intrusion detection accuracy because they rely entirely on structural features and on inferior learning models. Moreover, intruders now craft attacks that mimic normal queries and hence confuse most classical learning-based methods. To alleviate such problems, this article exploits semantic features together with a state-of-the-art, highly robust computing environment. We propose a robust semantic query-featured ensemble learning model for SQLIA prediction. Unlike classical query template-matching or term-assessment-based methods, the proposed SQLIA-prediction model exploits latent semantic features from large SQL queries to train an ensemble learning model that classifies each query as either a normal query or an SQLIA query. Functionally, it preprocesses a large set of SQL queries using a count vectorizer and stop-word removal. Subsequently, it applies the Word2Vec feature extraction method to each query using the continuous bag of words (CBOW) and N-skip-gram (SKG) algorithms, which yields CBOW and SKG semantic features for each SQL query. The extracted features were then resampled to alleviate the problems of class imbalance and skewness.
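The preprocessing and context-window steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer regex, the stop-word list, the window size, and all function names are assumptions made for illustration, and the sketch only builds the (context, target) training pairs that CBOW and skip-gram models learn from, without training any embeddings.

```python
import re
from collections import Counter

# Hypothetical stop-word list: the abstract does not say which tokens were dropped.
STOP_WORDS = {"the", "a", "an", "of"}


def tokenize(query):
    """Lowercase a query, split it into word, number, and symbol tokens, drop stop words."""
    tokens = re.findall(r"[a-z_][a-z0-9_]*|\d+|[^\sa-z0-9_]", query.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def count_vector(tokens, vocabulary):
    """Count how often each vocabulary term occurs in one tokenized query."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def cbow_pairs(tokens, window=2):
    """CBOW view: predict the centre token from its surrounding context tokens."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens, window=2):
    """Skip-gram view: predict each context token from the centre token."""
    return [(target, c)
            for context, target in cbow_pairs(tokens, window)
            for c in context]


# Illustrative queries (assumed, not taken from the paper's dataset).
queries = [
    "SELECT name FROM users WHERE id = 1",
    "SELECT name FROM users WHERE id = 1 OR 1 = 1",
]
token_lists = [tokenize(q) for q in queries]
vocabulary = sorted({t for tokens in token_lists for t in tokens})
vectors = [count_vector(tokens, vocabulary) for tokens in token_lists]
```

In this toy example the injected tautology `OR 1 = 1` shows up as extra counts for `1`, `=`, and `or`; a trained Word2Vec model would instead map each token to a dense vector learned from pairs like those returned by `cbow_pairs` and `skipgram_pairs`.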
To reduce redundant computation, two feature selection algorithms, the Mann-Whitney significance predictor test and principal component analysis, were applied to the resampled features. Moreover, to mitigate over-fitting and convergence problems, Min-Max normalization was performed on the selected features, which were then used to train a state-of-the-art robust heterogeneous ensemble learning model. Unlike standalone classifier-based SQLIA detection, the proposed learning model employed a set of nine base classifiers designed to serve maximum-voting ensemble-based prediction. The proposed ensemble learning method classified each SQL query as either a normal query or an SQLIA query. Simulation results affirmed the superiority of the proposed SQLIA prediction model in terms of accuracy (98%), F-score (0.989), and AUC (0.999), signifying its efficacy for real-world SQL-driven IoT ecosystems.
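A rough sketch of three of the steps named above (a Mann-Whitney test on one feature, Min-Max normalization, and voting across base classifiers) may help make the pipeline concrete. It rests on several assumptions that the abstract does not confirm: the p-value uses a normal approximation without tie or continuity correction, "maximum voting" is read as plain majority (hard) voting, labels are encoded as 1 for SQLIA and 0 for normal, and every function name is hypothetical.

```python
import math
from collections import Counter


def mann_whitney_u(sample_a, sample_b):
    """Return (U, two-sided p-value) for one feature split by class.

    Assumption: normal approximation, no tie or continuity correction.
    """
    u = 0.0
    for x in sample_a:
        for y in sample_b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    n1, n2 = len(sample_a), len(sample_b)
    mean = n1 * n2 / 2.0
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / std
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value


def min_max_normalize(values):
    """Rescale one feature column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def majority_vote(predictions):
    """Return the label predicted by the most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Illustrative values (assumed): one feature measured on normal vs. SQLIA queries.
u_stat, p_value = mann_whitney_u([1, 2, 3], [4, 5, 6])
scaled = min_max_normalize([2, 4, 6])
# Nine hypothetical base-classifier outputs for one query (1 = SQLIA, 0 = normal).
label = majority_vote([1, 1, 0, 1, 0, 0, 1, 1, 0])
```

A feature would be kept when its p-value falls below a chosen significance threshold; with an odd number of base classifiers and two classes, the vote cannot tie.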
Pages: 1057-1074
Page count: 18