E-Mail Spam Detection Based on Part of Speech Tagging

Cited by: 0
Authors
Parsaei, Mohammad Reza [1 ]
Salehi, Mohammad [1 ]
Affiliation
[1] Shiraz Univ Technol, Sch Comp Sci & IT, Shiraz, Iran
Source
2015 2ND INTERNATIONAL CONFERENCE ON KNOWLEDGE-BASED ENGINEERING AND INNOVATION (KBEI) | 2015
Keywords
K-Mean algorithm; Spam e-mail; data mining; POS tagging; vector model;
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Ever since e-mail became a widely used communication tool, spam has been a problem associated with it. One of the most important approaches to filtering such junk e-mail is to identify it with data mining techniques. This paper proposes a new approach based on word frequency: the key words in a message are found from the number of times they are repeated. The key sentences of each incoming e-mail, meaning those that contain the key words, are tagged, and the grammatical role of every word in those sentences is determined. The results are then combined into a vector that is used to measure the similarity between received e-mails. The paper uses the K-means algorithm to classify the received e-mails. K-means follows simple, understandable rules and is easy to work with, which the authors regard as an advantage of the approach. The precision of the algorithm in identifying the e-mails is 83 percent.
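The pipeline outlined in the abstract (frequent key words, key sentences, part-of-speech tags, a vector per e-mail, K-means grouping) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the record does not give the tagger, the vector layout, or the number of clusters, so `TOY_LEXICON`, `TAGS`, the tag-proportion vector, the `len(w) > 3` key-word filter, and `k=2` are all assumptions made for the example. A real system would replace the toy lexicon with a trained POS tagger.

```python
import re
from collections import Counter

# Hypothetical toy lexicon standing in for a real POS tagger (assumption).
TOY_LEXICON = {
    "win": "VERB", "claim": "VERB", "click": "VERB", "send": "VERB",
    "review": "VERB", "free": "ADJ", "cheap": "ADJ", "weekly": "ADJ",
    "prize": "NOUN", "money": "NOUN", "offer": "NOUN", "report": "NOUN",
    "meeting": "NOUN", "team": "NOUN", "now": "ADV", "today": "ADV",
}
TAGS = ["NOUN", "VERB", "ADJ", "ADV", "OTHER"]  # assumed vector layout


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def key_words(text, top_n=3):
    """Pick the most frequently repeated words (short words are skipped)."""
    counts = Counter(w for w in tokenize(text) if len(w) > 3)
    return {w for w, _ in counts.most_common(top_n)}


def key_sentences(text, keys):
    """Keep only the sentences that contain at least one key word."""
    sentences = re.split(r"[.!?]+", text)
    return [s for s in sentences if keys & set(tokenize(s))]


def pos_vector(text, top_n=3):
    """Represent an e-mail as the proportion of each tag in its key sentences."""
    keys = key_words(text, top_n)
    tags = Counter()
    for sentence in key_sentences(text, keys):
        for word in tokenize(sentence):
            tags[TOY_LEXICON.get(word, "OTHER")] += 1
    total = sum(tags.values()) or 1  # avoid division by zero on empty input
    return [tags[t] / total for t in TAGS]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k=2, iterations=10):
    """Plain K-means; the first k vectors serve as the initial centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

To group a batch of messages, one would compute `pos_vector(text)` for each e-mail and pass the resulting list to `k_means`. Note that K-means only produces unlabeled clusters; deciding which cluster corresponds to spam requires some labeled examples or manual inspection, a step the abstract does not describe.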
Pages: 1010-1013
Page count: 4