Optimal Mean-Based Algorithms for Trace Reconstruction

被引:28
作者
De, Anindya [1 ]
O'Donnell, Ryan [2 ]
Servedio, Rocco A. [3 ]
机构
[1] Northwestern Univ, Evanston, IL 60208 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Columbia Univ, New York, NY 10027 USA
来源
STOC'17: PROCEEDINGS OF THE 49TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING | 2017年
基金
美国国家科学基金会;
关键词
Trace Reconstruction; deletion channel; Littlewood polynomials; LITTLEWOOD-TYPE PROBLEMS; EFFICIENT RECONSTRUCTION; SEQUENCES; SUBSEQUENCES;
D O I
10.1145/3055399.3055450
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
In the (deletion-channel) trace reconstruction problem, there is an unknown n-bit source string x. An algorithm is given access to independent traces of x, where a trace is formed by deleting each bit of x independently with probability delta. The goal of the algorithm is to recover x exactly (with high probability), while minimizing samples (number of traces) and running time. Previously, the best known algorithm for the trace reconstruction problem was due to wHolenstein et al. [SODA 2008]; it uses exp((O) over tilde (n(1/2))) samples and running time for any fixed 0 < delta < 1. It is also what we call a "mean-based algorithm", meaning that it only uses the empirical means of the individual bits of the traces. Holenstein et al. also gave a lower bound, showing that any mean-based algorithm must use at least n((Omega) over bar (log n)) samples. In this paper we improve both of these results, obtaining matching upper and lower bounds for mean-based trace reconstruction. For any constant deletion rate 0 < delta < 1, we give a mean-based algorithm that uses exp(O(n(1/3))) time and traces; we also prove that any mean-based algorithm must use at least exp(Omega(n(1/3))) traces. In fact, we obtain matching upper and lower bounds even for delta subconstant and rho := 1 - delta subconstant: when (log(3) n)/n << delta <= 1/2 the bound is exp((-Theta(delta n)(1/3)), and when 1/root n << rho <= 1/2 the bound is exp((-Theta(n/rho)(1/3)). Our proofs involve estimates for the maxima of Littlewood polynomials on complex disks. We show that these techniques can also be used to perform trace reconstruction with random insertions and bit-flips in addition to deletions. We also find a surprising result: for deletion probabilities delta > 1/2, the presence of insertions can actually help with trace reconstruction.
引用
收藏
页码:1047 / 1056
页数:10
相关论文
共 22 条
[1]   Global alignment of molecular sequences via ancestral state reconstruction [J].
Andoni, Alexandr ;
Daskalakis, Constantinos ;
Hassidim, Avinatan ;
Roch, Sebastien .
STOCHASTIC PROCESSES AND THEIR APPLICATIONS, 2012, 122 (12) :3852-3874
[2]  
[Anonymous], 2008, London Mathematical Society Lecture Note Series, V352, P322
[3]  
Batu Tugkan, 2004, P IEEE 36 ANN FDN CO, P910
[4]   Littlewood-type problems on [0,1] [J].
Borwein, P ;
Erdély, T ;
Kós, G .
PROCEEDINGS OF THE LONDON MATHEMATICAL SOCIETY, 1999, 79 :22-46
[5]  
Borwein P, 1997, INDIANA U MATH J, V46, P1323
[6]  
De Anindya, 2016, TECHNICAL REPORT
[7]   Reconstruction from subsequences [J].
Dudík, M ;
Schulman, LJ .
JOURNAL OF COMBINATORIAL THEORY SERIES A, 2003, 103 (02) :337-348
[8]  
Holenstein T, 2008, PROCEEDINGS OF THE NINETEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P389
[9]  
Janson S., 2014, TAIL BOUNDS SUMS GEO
[10]  
Kalashnik V.V., 1973, Computational Mathematics and Computer Science (Vychislitel'naya matematika i vychislitel'naya tekhnika), Kharkov, V4, P56