Fast On-line Learning for Multilingual Categorization

Cited by: 0
Authors
Kovesi, Michelle [1]
Goutte, Cyril [1]
Amini, Massih-Reza [2]
Affiliations
[1] CNR, Interact Language Tech, 283 Alexandre Tache, Gatineau, PQ, Canada
[2] Univ Paris 06, Lab Informat Paris 6, F-75252 Paris, France
Source
SIGIR 2012: Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval | 2012
Keywords
Multilingual text categorisation; on-line learning
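DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract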
Multiview learning has been shown to be a natural and efficient framework for supervised or semi-supervised learning of multilingual document categorizers. The state-of-the-art co-regularization approach relies on alternating minimization of a combination of language-specific categorization errors and a disagreement between the outputs of the monolingual text categorizers. This is typically solved by repeatedly training categorizers on each language with the appropriate regularizer. We extend and improve this approach by introducing an on-line learning scheme, where language-specific updates are interleaved in order to iteratively optimize the global cost in one pass. Our experimental results show that this produces performance similar to that of the batch approach, at a fraction of the computational cost.
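The interleaved update described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the record gives no loss function, disagreement measure, or update rule, so the code below assumes two language views, linear classifiers trained with logistic loss, a squared difference between the two scores as the disagreement term, and plain stochastic gradient steps. All names (`online_coreg`, `make_toy_pairs`, `accuracy`, `lr`, `lam`) and the toy data are hypothetical.

```python
import math


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def online_coreg(pairs, dim1, dim2, lr=0.1, lam=0.5):
    """Single pass over (x1, x2, y) triples with y in {0, 1}.

    Assumed per-document cost (an illustration, not taken from the paper):
        logloss(w1.x1, y) + logloss(w2.x2, y) + 0.5 * lam * (w1.x1 - w2.x2)^2
    """
    w1 = [0.0] * dim1
    w2 = [0.0] * dim2
    for x1, x2, y in pairs:
        # Step for the view-1 categorizer, holding view 2 fixed.
        s1, s2 = dot(w1, x1), dot(w2, x2)
        g1 = (sigmoid(s1) - y) + lam * (s1 - s2)
        w1 = [w - lr * g1 * x for w, x in zip(w1, x1)]
        # Interleaved step for view 2, using the freshly updated view 1.
        s1, s2 = dot(w1, x1), dot(w2, x2)
        g2 = (sigmoid(s2) - y) + lam * (s2 - s1)
        w2 = [w - lr * g2 * x for w, x in zip(w2, x2)]
    return w1, w2


def make_toy_pairs(n):
    """Deterministic toy data: a bias feature plus one informative feature per view."""
    pairs = []
    for i in range(n):
        y = i % 2
        sign = 2 * y - 1
        offset = ((i * 7) % 11) / 11.0 - 0.5
        x1 = [1.0, sign + offset]
        x2 = [1.0, 0.8 * sign - 0.6 * offset]
        pairs.append((x1, x2, y))
    return pairs


def accuracy(w, examples):
    """Fraction of (x, y) examples whose score sign matches the label."""
    hits = sum((dot(w, x) > 0) == (y == 1) for x, y in examples)
    return hits / len(examples)
```

Calling `online_coreg(make_toy_pairs(200), 2, 2)` returns one weight vector per view after a single pass over the data. A batch co-regularization scheme would instead retrain each view's categorizer to convergence in turn, which is the repeated training the abstract contrasts against.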
Pages: 1071-1072
Page count: 2
Related papers
6 in total
  • [1] Amini MR, 2010, SIGIR 2010: Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, p. 475
  • [2] A co-classification approach to learning from multilingual corpora
    Amini, Massih-Reza
    Goutte, Cyril
    Machine Learning, 2010, 79(1-2): 105-121
  • [3] Bottou U, 2004, ADV NEUR IN, V16, P217
  • [4] Eisele A, 2010, LREC 2010: Seventh International Conference on Language Resources and Evaluation, p. 2868
  • [5] Lewis DD, 2004, J Mach Learn Res, vol. 5, p. 361
  • [6] Pouliquen B, 2006, CoRR abs/cs/0609059