Using machine learning to generate headlines

Cited by: 0
Authors
Wang, RC [1]
Stokes, N [1]
Doran, W [1]
Dunnion, J [1]
Carthy, J [1]
Affiliation
[1] Univ Coll Dublin, Dept Comparat Biosci, Intelligent Informat Retrieval Grp, Dublin 2, Ireland
Source
MLMTA '05: Proceedings of the International Conference on Machine Learning Models Technologies and Applications | 2005
Keywords
machine learning; headline generation; statistical techniques; DUC;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present a system which generates headlines for a text. Our system, the HybridTrim system, uses linguistic, statistical and positional information in combination with a machine learning technique to identify topic labels for headlines in a text. We compare our system with the Topiary system which, in contrast, uses a statistical learning approach to find topic descriptors for headlines. Both systems combine these topic descriptors with a compressed version of the lead sentence. The performance of both systems is evaluated using the ROUGE evaluation suite on the DUC 2004 news stories collection.
Pages: 167-172
Page count: 6