Enhancing the security of patients' portals and websites by detecting malicious web crawlers using machine learning techniques

Cited by: 15
Authors
Hosseini, Nafiseh [1]
Fakhar, Fatemeh [2]
Kiani, Behzad [1]
Eslami, Saeid [1, 3, 4]
Affiliations
[1] Mashhad Univ Med Sci, Fac Med, Dept Med Informat, Mashhad, Razavi Khorasan, Iran
[2] Payame Noor Univ, Fac Engn, Dept Comp Sci, Tehran, Iran
[3] Mashhad Univ Med Sci, Pharmaceut Res Ctr, Mashhad, Razavi Khorasan, Iran
[4] Univ Amsterdam, Acad Med Ctr, Dept Med Informat, Amsterdam, Netherlands
Keywords
Security of patient portal; Malicious crawlers; Support vector machines; Feature extraction; ROBOT DETECTION; ACCESS; ENGAGEMENT; CLASSIFICATION; COMMUNICATION; PROVIDER; OUTCOMES
DOI
10.1016/j.ijmedinf.2019.103976
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Introduction: Demand for access to medical information via patients' portals is increasing. However, one of the challenges to widespread utilisation of such services is maintaining the security of those portals. Recent reports show an alarming increase in cyber-attacks using crawlers. These software programs crawl web pages and can execute various commands, such as attacking web servers, cracking passwords, harvesting users' personal information, and testing servers for vulnerabilities. The aim of this research is to develop a new, effective model for detecting malicious crawlers based on their navigational behavior using machine-learning techniques.
Method: Different methods of crawler detection were investigated. Log files from a sample of compromised web sites were analysed, and the best features for detecting crawlers were extracted. After testing and comparing several machine learning algorithms, including Support Vector Machine (SVM), Bayesian Network and Decision Tree, the best model was developed using the most appropriate features, and its accuracy was evaluated.
Results: Our analysis showed that SVM-based models can yield higher accuracy (f-measure = 0.97) than Bayesian Network (f-measure = 0.88), Decision Tree (f-measure = 0.95) and artificial neural network (ANN) (f-measure = 0.87) models for detecting malicious crawlers. In addition, extracting proper features can increase the performance of the SVM (f-measure = 0.98), the Bayesian Network (f-measure = 0.94), the Decision Tree (f-measure = 0.96) and the ANN (f-measure = 0.92).
Conclusion: Security concerns are among the potential barriers to widespread utilisation of patient portals. Machine learning algorithms can accurately detect malicious crawlers and enhance the security of sensitive patient information. Selecting appropriate features for the development of these algorithms can markedly increase their accuracy.
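As a rough illustration of the kind of pipeline the abstract describes, the sketch below derives a few per-client navigational features from web server log lines and computes an f-measure from predicted and true labels. It is not the authors' method: the Common Log Format pattern, the feature names (`error_ratio`, `head_ratio`, `image_ratio`, `robots_txt`) and the sample log lines are all assumptions chosen for illustration, and the abstract does not say which features or log format the study used.

```python
import re
from collections import defaultdict

# Assumed input: Common Log Format lines. The study's actual log format is not given.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
IMAGE_EXT = (".png", ".jpg", ".jpeg", ".gif")


def client_features(lines):
    """Group requests by client IP and compute simple, hypothetical features."""
    by_ip = defaultdict(list)
    for line in lines:
        m = LOG_RE.match(line)
        if m:  # silently skip lines that do not parse
            by_ip[m["ip"]].append(m)
    feats = {}
    for ip, reqs in by_ip.items():
        n = len(reqs)
        feats[ip] = {
            "requests": n,
            "error_ratio": sum(r["status"].startswith("4") for r in reqs) / n,
            "head_ratio": sum(r["method"] == "HEAD" for r in reqs) / n,
            "image_ratio": sum(r["path"].lower().endswith(IMAGE_EXT) for r in reqs) / n,
            "robots_txt": any(r["path"] == "/robots.txt" for r in reqs),
        }
    return feats


def f_measure(y_true, y_pred):
    """F1 score: harmonic mean of precision and recall for the positive class."""
    pairs = [(bool(t), bool(p)) for t, p in zip(y_true, y_pred)]
    tp = sum(t and p for t, p in pairs)
    fp = sum((not t) and p for t, p in pairs)
    fn = sum(t and (not p) for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up sample lines, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [01/Jan/2019:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Jan/2019:10:00:01 +0000] "GET /logo.png HTTP/1.1" 200 2048',
    '10.0.0.2 - - [01/Jan/2019:10:00:02 +0000] "HEAD /admin HTTP/1.1" 404 0',
    '10.0.0.2 - - [01/Jan/2019:10:00:02 +0000] "GET /login.php HTTP/1.1" 403 0',
]

if __name__ == "__main__":
    for ip, f in client_features(SAMPLE).items():
        print(ip, f)
    print(f_measure([1, 1, 0, 0], [1, 0, 1, 0]))
```

Feature dictionaries like these could then be turned into numeric vectors and passed to any classifier, such as the SVM, Bayesian Network, Decision Tree or ANN models compared in the abstract; the classifier itself is left out here to keep the sketch free of third-party dependencies.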
Pages: 8