Automated storytelling evaluation and story chain generation

Cited by: 0
Authors
Rigsby, J. T. [1]
Barbara, Daniel [2]
Affiliations
[1] Naval Surface Warfare Ctr, 18372 Frontage Rd,Suite 318, Dahlgren, VA 22448 USA
[2] George Mason Univ, CS Dept, Mail Stop 4A5, Fairfax, VA 22030 USA
Source
2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2017) | 2017
Keywords
automated storytelling; storytelling evaluation; quantitative evaluation; automated evaluation; literature-based discovery;
DOI
10.1109/ICDMW.2017.15
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Given a beginning document and an ending document, automated storytelling attempts to fill in intermediary documents to form a coherent story. This is a common problem for analysts, who often have two snippets of information and want to find the other pieces that relate them. Evaluating the quality of the created stories is difficult and has routinely involved human judgment. This work extends the state of the art by providing quantitative methods of story quality evaluation, which are shown to have good agreement with human judgment. Two methods of automated storytelling evaluation, dispersion and coherence, are developed. Dispersion, a measure of story flow, ascertains how well the generated story flows away from the beginning document and towards the ending document. Coherence measures how well the articles in the middle of the story provide information about the relationship between the beginning and ending documents. Kullback-Leibler divergence (KLD) is used to measure how well the vocabulary of the beginning and ending documents can be encoded using the set of middle documents in the story. The dispersion and coherence methodologies developed here have the added benefit that they require no parametrization or user input and are easily automated. Automated storytelling is then formulated as a multi-criteria optimization problem, and an algorithm is proposed that maximizes dispersion and coherence simultaneously. The developed storytelling methodologies will allow for the automated identification of information that associates disparate documents, in support of literature-based discovery and link analysis tasking. In addition, the methods provide quantitative measures of the strength of these associations.
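The abstract does not give the exact formulation of the KLD-based coherence measure, so the following is only a minimal Python sketch of the general idea: build a unigram distribution P from the beginning and ending documents, build a distribution Q from the middle documents, and compute KL(P || Q), which is low when the middle documents encode the endpoint vocabulary well. The function names, whitespace tokenization, direction of the divergence, and the additive smoothing constant `alpha` are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def unigram_model(texts, vocab, alpha=0.01):
    """Additively smoothed unigram distribution over vocab.

    alpha is an assumed smoothing constant, not a value from the paper.
    """
    counts = Counter(
        tok for text in texts for tok in text.lower().split() if tok in vocab
    )
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) in bits; assumes q[w] > 0 wherever p[w] > 0."""
    return sum(pw * math.log2(pw / q[w]) for w, pw in p.items() if pw > 0)


def endpoint_kld(begin, end, middle, alpha=0.01):
    """Hypothetical coherence proxy: divergence of the endpoint vocabulary
    from the vocabulary of the middle documents (lower means better encoded).
    """
    vocab = {tok for text in [begin, end, *middle] for tok in text.lower().split()}
    p = unigram_model([begin, end], vocab, alpha)  # endpoint documents
    q = unigram_model(middle, vocab, alpha)        # middle documents
    return kl_divergence(p, q)
```

Under these assumptions, a chain whose middle documents share vocabulary with both endpoints yields a smaller `endpoint_kld` than a chain of unrelated documents, so candidate stories could be ranked by this value. Smoothing keeps Q strictly positive, so the divergence stays finite even when an endpoint word never appears in the middle documents.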
Pages: 61-68
Page count: 8