E-Mail Classification Using Natural Language Processing

Cited by: 1
Authors
Sel, Ilhami [1]
Hanbay, Davut [1]
Affiliations
[1] Inonu Univ, Bilgisayar Muhendisligi Bolumu, Malatya, Turkey
Source
2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU) | 2019
Keywords
Natural Language Processing; Text Classification; Word2Vec; Skip Gram; K-means
DOI
10.1109/siu.2019.8806593
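Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract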
With rapid advances in technology and electronic communication, e-mail has become an essential communication tool. In many applications, such as business correspondence, reminders, academic notices, and web page memberships, e-mail is the primary means of communication. Even when spam is set aside, hundreds of e-mails are still received every day. To determine the importance of a received e-mail, its subject or content must be checked. In this study, we propose an unsupervised system for classifying received e-mails. Each e-mail is mapped to coordinates in a vector space using Word2Vec, a natural language processing method. The resulting vectors are then grouped by similarity with the k-means algorithm, without labeled training data. A total of 10,517 e-mails were used for training, and the system was evaluated on a test set of 200 e-mails. In the test phase, the M3 model (window size 3, minimum word frequency 10, skip-gram) achieved the highest success rate (91%). The results are evaluated in Section VI.
Pages: 4
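
The abstract names two ingredients: skip-gram training with a window size and a minimum word frequency, and k-means grouping of the resulting vectors. The following is a minimal sketch of both ideas, not the authors' implementation. It does not train Word2Vec embeddings; it only shows how skip-gram (center, context) pairs are formed once rare words are dropped, and how plain k-means groups vectors. The function names `skipgram_pairs` and `kmeans`, the seeding of centroids from the first k points, and the iteration count are all assumptions made for illustration.

```python
from collections import Counter
import math


def skipgram_pairs(docs, window=3, min_count=10):
    """Return (center, context) pairs from tokenized documents.

    Words seen fewer than min_count times in the whole corpus are
    dropped before the window is applied. Hypothetical helper.
    """
    counts = Counter(tok for doc in docs for tok in doc)
    pairs = []
    for doc in docs:
        kept = [t for t in doc if counts[t] >= min_count]
        for i, center in enumerate(kept):
            lo = max(0, i - window)
            hi = min(len(kept), i + window + 1)
            # Every neighbour inside the window is a context word.
            pairs.extend((center, kept[j]) for j in range(lo, hi) if j != i)
    return pairs


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means over equal-length numeric vectors.

    Assumption: the first k points seed the centroids, which keeps
    the sketch deterministic but is not a recommended initialization.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels, centroids
```

In a pipeline like the one the abstract describes, the points passed to `kmeans` would be per-e-mail vectors derived from trained word embeddings (for example, an average of the word vectors in each message, which is itself an assumption here), and `window=3, min_count=10` would correspond to the M3 setting.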