Bibliographic component extraction using support vector machines and hidden Markov models

Cited by: 0
Authors
Okada, T
Takasu, A
Adachi, J
Affiliations
[1] Univ Tokyo, Tokyo, Japan
[2] Natl Inst Informat, Chiyoda Ku, Tokyo, Japan
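Source
RESEARCH AND ADVANCED TECHNOLOGY FOR DIGITAL LIBRARIES | 2004 / Vol. 3232
Keywords
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812 ;
Abstract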
Article citations are composed of subfields such as 'author', 'title', 'journal', and 'year'. It is useful to automatically identify attributes of these subfields, since they are used for linking a citation with the actual cited article. In this article, we employ a Support Vector Machine (SVM), a method of machine learning, to automatically identify subfields. We then employ a Hidden Markov Model (HMM) to improve the identification accuracy. Information from the subfields identified by the SVM, and syntactic information analyzed by the HMM, are integrated to make an accurate identification.
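The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the per-token scores below are hypothetical stand-ins for classifier output (the paper uses an SVM), the transition weights are invented toy values, and the names `LABELS`, `make_transitions`, and `viterbi` are assumptions introduced only for illustration. The sketch shows how sequence information of the kind an HMM provides can overrule a per-token decision that breaks the usual field order.

```python
import math

# Subfield labels named in the abstract.
LABELS = ["author", "title", "journal", "year"]


def make_transitions():
    """Toy transition weights (assumed, not from the paper): a field tends to
    continue or to be followed by the next field in the usual citation order."""
    table = {}
    for i, prev in enumerate(LABELS):
        row = {}
        for j, nxt in enumerate(LABELS):
            if i == j:
                row[nxt] = 0.6   # stay in the same subfield
            elif j == i + 1:
                row[nxt] = 0.3   # move on to the next subfield
            else:
                row[nxt] = 0.05  # unusual jump
        total = sum(row.values())
        table[prev] = {k: v / total for k, v in row.items()}
    return table


def viterbi(token_scores, transitions, start):
    """Return the most probable label sequence given per-token scores
    (stand-ins for classifier output) and HMM-style transition weights."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    best = [{lab: log(start[lab]) + log(token_scores[0][lab]) for lab in LABELS}]
    back = []
    for scores in token_scores[1:]:
        col, ptr = {}, {}
        for lab in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p] + log(transitions[p][lab]))
            col[lab] = best[-1][prev] + log(transitions[prev][lab]) + log(scores[lab])
            ptr[lab] = prev
        best.append(col)
        back.append(ptr)

    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Hypothetical per-token scores for a five-token citation. The third token is
# narrowly scored as "author", which is out of order after a "title" token.
token_scores = [
    {"author": 0.80, "title": 0.10, "journal": 0.05, "year": 0.05},
    {"author": 0.10, "title": 0.70, "journal": 0.15, "year": 0.05},
    {"author": 0.45, "title": 0.40, "journal": 0.10, "year": 0.05},
    {"author": 0.05, "title": 0.20, "journal": 0.70, "year": 0.05},
    {"author": 0.02, "title": 0.03, "journal": 0.05, "year": 0.90},
]
start = {"author": 0.7, "title": 0.1, "journal": 0.1, "year": 0.1}

per_token = [max(s, key=s.get) for s in token_scores]
decoded = viterbi(token_scores, make_transitions(), start)
print(per_token)
print(decoded)
```

In this toy input, labelling each token independently gives an "author" token in the middle of the title, while the sequence-aware decoding relabels it as "title". How the paper actually integrates SVM output with the HMM is not specified in this record, so the combination rule here is only one plausible choice.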
Pages: 501-512
Page count: 12