PICCIL: Interactive learning to support log file categorization

Times cited: 0
Authors
Loewenstern, D [1]
Ma, S [1]
Salahshour, A [1]
Affiliations
[1] IBM Corp, TJ Watson Res Ctr, Hawthorne, NY 10532 USA
Source
ICAC 2005: Second International Conference on Autonomic Computing, Proceedings | 2005
Keywords
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Motivated by the real-world application of categorizing system log messages into defined situation categories, this paper describes an interactive text categorization method, PICCIL, that leverages supervised machine learning to reduce the burden of assigning categories to documents in large finite data sets. By coupling human expertise to the machine learning, it does so without sacrificing accuracy. PICCIL uses keywords and keyword rules both to preclassify documents and to assist in the manual process of grouping and reviewing documents. The reviewed documents, in turn, are used to refine the keyword rules iteratively, improving subsequent grouping and document review. We apply PICCIL to the problem of assigning semantic situation labels to the entries of a catalog of log events to support on-line labeling of log events.
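The loop described in the abstract (preclassify with keyword rules, review, refine the rules, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation: the rule representation (one keyword set per category), the scoring by keyword overlap, the refinement heuristic, and the category names `START`, `STOP`, and `CONNECT` are all assumptions made for illustration, and the human review step is simulated by a hand-written dictionary.

```python
from collections import defaultdict


def tokenize(message):
    """Lowercase a log message and split it into a set of tokens."""
    return set(message.lower().split())


def preclassify(messages, rules):
    """Assign each message the category whose keyword set overlaps it most.

    rules: dict mapping category name -> set of keywords (assumed format).
    Messages that match no keyword are labeled None and left for manual review.
    """
    labels = {}
    for msg in messages:
        tokens = tokenize(msg)
        best, best_hits = None, 0
        for category, keywords in rules.items():
            hits = len(tokens & keywords)
            if hits > best_hits:
                best, best_hits = category, hits
        labels[msg] = best
    return labels


def refine_rules(rules, reviewed):
    """Return new rules extended with tokens seen in exactly one reviewed category.

    reviewed: dict mapping message -> category confirmed by a human reviewer.
    Tokens that appear under several categories are treated as ambiguous and skipped.
    """
    seen_in = defaultdict(set)
    for msg, category in reviewed.items():
        for token in tokenize(msg):
            seen_in[token].add(category)
    refined = {category: set(keywords) for category, keywords in rules.items()}
    for token, categories in seen_in.items():
        if len(categories) == 1:
            refined.setdefault(next(iter(categories)), set()).add(token)
    return refined


# Hypothetical log messages and seed rules, for illustration only.
messages = [
    "Server started on port 8080",
    "Connection refused by host",
    "Service stopped by admin",
    "Connection timeout while connecting",
]
rules = {"START": {"started"}, "STOP": {"stopped"}}

first_pass = preclassify(messages, rules)

# Simulated human review of the first three messages.
reviewed = {
    messages[0]: "START",
    messages[1]: "CONNECT",
    messages[2]: "STOP",
}
rules = refine_rules(rules, reviewed)
second_pass = preclassify(messages, rules)
```

In this sketch the fourth message is unlabeled after the first pass and picks up the `CONNECT` label after one round of review, because the reviewed message contributes the keyword `connection` to the refined rules.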
Pages: 311-312
Number of pages: 2