Can Back-of-the-Book Indexes be Automatically Created?

Cited by: 13

Authors:

Wu, Zhaohui ^{[1]}

Li, Zhenhui ^{[2]}

Mitra, Prasenjit ^{[1,2]}

Giles, C. Lee ^{[1,2]}

Affiliations:

[1] Penn State Univ, Comp Sci & Engn, University Pk, PA 16802 USA

[2] Penn State Univ, Informat Sci & Technol, University Pk, PA 16802 USA

Source:

PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13) | 2013

Keywords:

Back-of-the-Book Index; Book Index; Term Informativeness;

DOI:

10.1145/2505515.2505627

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Creating back-of-the-book indexes remains one of the few manual tasks related to publishing. Inspired by how human indexers create back-of-the-book indexes, we present a new domain-independent, corpus-free and training-free automation approach. Given a book, index terms are selected sequentially according to an indexability score encoded by the structural information residing in the book as well as a novel context-aware term informativeness measure that draws on a web knowledge base such as Wikipedia. Through extensive experiments on books from various domains, we show our approach to be more effective and practical than previous approaches based on keyword extraction and supervised learning.

Pages: 1745-1750

Page count: 6