ChatPhishDetector: Detecting Phishing Sites Using Large Language Models

Cited by: 2

Authors:

Koide, Takashi [1]

Nakano, Hiroki [2]

Chiba, Daiki

Affiliations:

[1] NTT Secur Holdings Corp, Tokyo 1010021, Japan

[2] NTT Corp, Tokyo 1010021, Japan

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Phishing; Uniform resource locators; Large language models; Crawlers; Codes; Web pages; Security; Accuracy; Visualization; Cognition; phishing sites; social engineering

DOI:

10.1109/ACCESS.2024.3483905

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Large Language Models (LLMs), such as ChatGPT, are significantly impacting various fields. While LLMs have been extensively studied for code generation and text synthesis, their application in detecting malicious web content, particularly phishing sites, remains largely unexplored. To counter the increasing cyber-attacks that leverage LLMs for creating more sophisticated and convincing phishing content, it is crucial to automate detection by harnessing LLMs' advanced capabilities. This paper introduces ChatPhishDetector, a novel system that employs LLMs to identify phishing sites. Our approach involves using a web crawler to collect website information, generating prompts for LLMs based on the gathered data, and extracting detection results from LLM responses. This system enables accurate detection of multilingual phishing sites by identifying impersonated brands and social engineering techniques within the entire website context, without requiring machine learning model training. We evaluated our system's performance using our own dataset and compared it with baseline systems and several LLMs. Experiments using GPT-4V showed exceptional results, achieving 98.7% precision and 99.6% recall, surpassing the detection performance of other LLMs and existing systems. These findings highlight the potential of LLMs for protecting users from online fraudulent activities and provide crucial insights for strengthening defenses against phishing attacks.
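The three steps named in the abstract (collect website information, generate a prompt, extract the detection result from the response) can be illustrated with a minimal sketch. This is not the authors' implementation: the `CrawlResult` and `Verdict` types, the prompt wording, the JSON answer format, and the `max_html_chars` limit are all assumptions made for illustration, and the call to an actual LLM is deliberately left out.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawlResult:
    """Hypothetical container for what a crawler might collect from a site."""
    url: str
    html: str


@dataclass
class Verdict:
    """Hypothetical parsed detection result."""
    phishing: bool
    brand: Optional[str]


def build_prompt(page: CrawlResult, max_html_chars: int = 2000) -> str:
    """Build a prompt from crawled data (wording is an assumption)."""
    # Truncate the HTML so the prompt stays within a model's context limit.
    html = page.html[:max_html_chars]
    return (
        "Decide whether the following website is a phishing site.\n"
        "Consider impersonated brands and social engineering techniques.\n"
        'Answer with JSON only, e.g. {"phishing": true, "brand": "name or null"}.\n'
        "URL: " + page.url + "\n"
        "HTML:\n" + html + "\n"
    )


def parse_verdict(response: str) -> Verdict:
    """Extract the JSON object from a free-text LLM response."""
    # Models often wrap the JSON in extra prose, so search for the object.
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in response")
    data = json.loads(match.group(0))
    return Verdict(phishing=bool(data["phishing"]), brand=data.get("brand"))
```

In use, the string returned by `build_prompt` would be sent to whichever LLM is being evaluated, and the raw text of the reply passed to `parse_verdict`. Asking for a fixed JSON shape is one common way to make responses machine-readable; the paper may structure its prompts and outputs differently.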

Citation:

Pages: 154381-154400

Page count: 20
