A data structure for representing multi-version texts online

Cited by: 32
Authors
Schmidt, Desmond [1]
Colomb, Robert [1]
Affiliations
[1] Univ Queensland, Sch ITEE, Brisbane, Qld, Australia
Keywords
Textual variation; Overlapping hierarchies; Markup; Electronic editions; Cultural heritage; SEQUENCE ALIGNMENT; LITERARY-TEXTS; MODELS; MARKUP
DOI
10.1016/j.ijhcs.2009.02.001
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The digitisation of cultural heritage and linguistics texts has long been troubled by the problem of how to represent overlapping structures arising from different markup perspectives ('overlapping hierarchies') or from different versions of the same work ('textual variation'). These two problems can be reduced to one by observing that every case of overlapping hierarchies is also a case of textual variation. Overlapping textual structures can be accurately modelled either as a minimally redundant directed graph, or, more practically, as an ordered list of pairs, each containing a set of versions and a fragment of text or data. This 'pair-list' representation is provably equivalent to the graph representation. It can record texts consisting of thousands of versions or perspectives without becoming overloaded with data, and the most common operations on variant text, e.g. comparison between two versions, can be performed in linear time. This representation also separates variation or other overlapping structures from the document content, leading to a simplification of markup suitable for wiki-like web applications. (c) 2009 Elsevier Ltd. All rights reserved.
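The 'pair-list' idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Pair`, `read_version`, and `compare`, the use of integer version identifiers, and the sample text are all assumptions made for illustration. It shows only the core idea that each entry pairs a set of versions with a text fragment, so that reading one version or comparing two versions takes a single pass over the list.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Pair:
    """One pair-list entry (hypothetical layout): the versions that
    share a fragment, and the fragment itself."""
    versions: FrozenSet[int]
    text: str


def read_version(pairs: List[Pair], v: int) -> str:
    """Reconstruct version v by concatenating, in order, every
    fragment whose version set contains v."""
    return "".join(p.text for p in pairs if v in p.versions)


def compare(pairs: List[Pair], a: int, b: int) -> List[Tuple[str, str]]:
    """Classify each fragment as shared by a and b, or unique to one
    of them, in a single pass over the list."""
    result = []
    for p in pairs:
        in_a, in_b = a in p.versions, b in p.versions
        if in_a and in_b:
            result.append(("both", p.text))
        elif in_a:
            result.append(("only_a", p.text))
        elif in_b:
            result.append(("only_b", p.text))
        # Fragments belonging to neither version are skipped.
    return result


# Three invented versions of one sentence, stored once as a pair-list.
pairs = [
    Pair(frozenset({1, 2, 3}), "The quick "),
    Pair(frozenset({1}), "brown"),
    Pair(frozenset({2, 3}), "red"),
    Pair(frozenset({1, 2, 3}), " fox"),
    Pair(frozenset({3}), " jumps"),
]

print(read_version(pairs, 1))
print(compare(pairs, 1, 2))
```

In this sketch shared text is stored once however many versions contain it, and both operations visit each pair exactly once, which is consistent with the linear-time claim in the abstract under the assumption that set membership tests are constant time.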
Pages: 497-514
Page count: 18