Mining Business Process Activities from Email Logs

Cited by: 11
Authors
Jlailaty, Diana [1]
Grigori, Daniela [1]
Belhajjame, Khalid [1]
Affiliations
[1] Paris Dauphine Univ, Paris, France
Source
2017 IEEE 1st International Conference on Cognitive Computing (ICCC 2017) | 2017
Keywords
Email analysis; Word2vec; LSA; process mining; process modeling; classification
DOI
10.1109/IEEE.ICCC.2017.28
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Because it is widely used in personal and, more importantly, professional contexts, email is a valuable source of information that can be harvested to understand, reengineer, and repurpose the undocumented business processes of companies and institutions. Few researchers have investigated the problem of extracting and analyzing the process-oriented information contained in emails. In this paper, we take a step in this direction by proposing a new method to discover business process activities from email logs. To this end, emails are first grouped according to the process model they belong to. The emails of each process model are then sub-grouped and labeled by business activity type. Both tasks are carried out with an unsupervised mining technique combined with semantic similarity measurement methods. Two representative similarity measurement methods are examined: Latent Semantic Analysis (LSA) and Word2vec. The two methods are compared to show that Word2vec performs better than LSA both in grouping emails by the process model they relate to and in discovering emails that belong to the same activity type. Experimental results are detailed to illustrate and support the contributions of our approach.
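The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: plain bag-of-words cosine similarity stands in for the LSA or Word2vec representation, and a greedy single-pass grouping stands in for the unsupervised mining technique. The function names, the sample emails, and the 0.3 threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_emails(emails, threshold=0.3):
    """Greedy single-pass grouping (an illustrative stand-in for clustering).

    Each email joins the first group whose first member is at least
    `threshold` similar to it; otherwise it starts a new group.
    Returns a list of groups, each a list of email indices.
    """
    vectors = [Counter(tokenize(e)) for e in emails]
    groups = []
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


# Hypothetical emails from two processes: reimbursement and paper review.
emails = [
    "Please submit the travel reimbursement form with your receipts",
    "Your travel reimbursement form and receipts were approved",
    "Reminder: review the submitted conference paper by Friday",
    "The conference paper review is due Friday",
]
print(group_emails(emails))
```

Replacing the term-count vectors with LSA components or averaged Word2vec embeddings would leave the grouping logic unchanged; only the vectors passed to the similarity function would differ.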
Pages: 112-119
Number of pages: 8