EMFORE: Online Learning of Email Folder Classification Rules

Cited by: 1
Authors
Singh, Mukul [1]
Cambronero, Jose [2]
Gulwani, Sumit [2]
Le, Vu [2]
Verbruggen, Gust [3]
Affiliations
[1] Microsoft, Delhi, India
[2] Microsoft, Redmond, WA USA
[3] Microsoft, Keerbergen, Belgium
Source
PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023 | 2023
Keywords
Email Classification; Online Learning; Learning by Examples
DOI
10.1145/3583780.3614863
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Modern email clients support predicate-based folder assignment rules that can automatically organize emails. Unfortunately, users still need to write these rules manually. Prior machine learning approaches have framed automatically assigning email to folders as a classification task and do not produce symbolic rules. Prior inductive logic programming (ILP) approaches, which generate symbolic rules, fail to learn efficiently in the online environment needed for email management. To close this gap, we present EMFORE, an online system that learns symbolic rules for email classification from observations. Our key insights to do this successfully are: (1) learning rules over a folder abstraction that supports quickly determining candidate predicates to add or replace terms in a rule, (2) ensuring that rules remain consistent with historical assignments, (3) ranking rule updates based on existing predicate and folder name similarity, and (4) building a rule suppression model to avoid surfacing low-confidence folder predictions while keeping the rule for future use. We evaluate on two popular public email corpora and compare to 13 baselines, including state-of-the-art folder assignment systems, incremental machine learning, ILP and transformer-based approaches. We find that EMFORE performs significantly better, updates four orders of magnitude faster, and is more robust than existing methods and baselines.
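The abstract's ideas of predicate-based rules, consistency with historical assignments, and rule suppression can be illustrated with a toy example. The following is a minimal sketch, not the EMFORE implementation: every name in it (`Email`, `Predicate`, `candidates`, `OnlineRuleLearner`, `min_precision`) is hypothetical, predicates are assumed to be simple substring tests, and a fixed candidate order (sender domain first, then subject words) stands in for the ranking of rule updates described in the abstract.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Email:
    sender: str
    subject: str


@dataclass(frozen=True)
class Predicate:
    attr: str    # "sender" or "subject"
    needle: str  # substring that must appear (case-insensitive)

    def matches(self, email: Email) -> bool:
        return self.needle.lower() in getattr(email, self.attr).lower()


def candidates(email: Email):
    """Hypothetical candidate generator: sender domain, then subject words."""
    yield Predicate("sender", email.sender.split("@")[-1])
    for word in email.subject.split():
        if len(word) > 3:
            yield Predicate("subject", word)


@dataclass
class OnlineRuleLearner:
    min_precision: float = 0.6                    # assumed suppression threshold
    history: list = field(default_factory=list)   # observed (email, folder) pairs
    rules: dict = field(default_factory=dict)     # folder -> set of predicates (OR)
    stats: dict = field(default_factory=dict)     # folder -> [correct, fired]

    def _fires(self, email):
        # Return the first folder whose rule matches, ignoring suppression.
        for folder, preds in self.rules.items():
            if any(p.matches(email) for p in preds):
                return folder
        return None

    def predict(self, email):
        folder = self._fires(email)
        if folder is None:
            return None
        correct, fired = self.stats.get(folder, [0, 0])
        if fired and correct / fired < self.min_precision:
            return None  # suppressed: the rule is kept but not surfaced
        return folder

    def observe(self, email, folder):
        # Score whichever rule fired before learning from this example.
        fired = self._fires(email)
        if fired is not None:
            s = self.stats.setdefault(fired, [0, 0])
            s[0] += fired == folder
            s[1] += 1
        self.history.append((email, folder))
        # Drop predicates of other folders that now contradict history.
        for other, preds in self.rules.items():
            if other != folder:
                preds -= {p for p in preds if p.matches(email)}
        # If the folder's rule does not cover the email, add the first
        # candidate that matches only emails filed in this folder.
        if not any(p.matches(email) for p in self.rules.get(folder, ())):
            for cand in candidates(email):
                if all(f == folder for e, f in self.history if cand.matches(e)):
                    self.rules.setdefault(folder, set()).add(cand)
                    break
```

In this sketch, a rule that has fired incorrectly stays in `rules` and keeps accumulating statistics, but `predict` returns `None` for it until its running precision reaches `min_precision` again.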
Pages: 2280-2290
Page count: 11