The most probable string: an algorithmic study

被引:7
作者
De la Higuera, Colin [1 ]
Oncina, Jose [2 ]
机构
[1] Univ Nantes, CNRS, LINA, UMR6241, F-44000 Nantes, France
[2] Univ Alicante, Dept Lenguajes & Sistemas Informat, E-03080 Alicante, Spain
关键词
Probabilistic automata; parsing; FINITE-STATE TRANSDUCERS; HIDDEN MARKOV-MODELS; PROBABILISTIC-AUTOMATA; COMPLEXITY;
D O I
10.1093/logcom/exs049
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
The problem of finding the consensus (most probable string) for a distribution generated by a weighted finite automaton or a probabilistic grammar is related to a number of important questions: computing the distance between two distributions or finding the best translation (the most probable one) given a probabilistic finite state transducer. The problem is undecidable with general weights and is NP-hard if the automaton is probabilistic. We give a pseudo-polynomial algorithm that solves a decision problem directly associated with the consensus string and answers if there is a (reasonably short) string whose probability is larger than a given bound in time polynomial in the the size of this bound, both for probabilistic finite automata and probabilistic context-free grammars. We also study a randomized algorithm solving the same problem. Finally, we report links between the length of the consensus string and the probability of this string.
引用
收藏
页码:311 / 330
页数:20
相关论文
共 24 条
[1]  
[Anonymous], 1979, Computers and Intractablity: A Guide to the Theory of NP-Completeness
[2]   ON THE THEORY OF AVERAGE CASE COMPLEXITY [J].
BENDAVID, S ;
CHOR, B ;
GOLDREICH, O ;
LUBY, M .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1992, 44 (02) :193-219
[3]   Undecidable problems for probabilistic automata of fixed dimension [J].
Blondel, VD ;
Canterini, V .
THEORY OF COMPUTING SYSTEMS, 2003, 36 (03) :231-245
[4]   Optimal linguistic decoding is a difficult computational problem [J].
Casacuberta, F ;
de la Higuera, C .
PATTERN RECOGNITION LETTERS, 1999, 20 (08) :813-821
[5]  
Casacuberta F, 2000, LECT NOTES ARTIF INT, V1891, P15
[6]  
Casacuberta F., 1995, 6 SPAN S PATT REC IM, P201
[7]   Lp distance and equivalence of probabilistic automata [J].
Cortes, Corinna ;
Mohri, Mehryar ;
Rastogi, Ashish .
INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE, 2007, 18 (04) :761-779
[8]  
Cortes C, 2006, LECT NOTES COMPUT SC, V4094, P137
[9]  
de la Higuera C., 2010, GRAMMATICAL INFERENC
[10]  
de la Higuera C., 2011, P 12 INT C PARS TECH, P26