Deterministic Document Exchange Protocols, and Almost Optimal Binary Codes for Edit Errors

被引:49
作者
Cheng, Kuan [1 ]
Jin, Zhengzhong [1 ]
Li, Xin [1 ]
Wu, Ke [1 ]
机构
[1] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
来源
2018 IEEE 59TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS) | 2018年
关键词
Edit errors; Document exchange; Error correcting codes; INSERTIONS; DELETIONS;
D O I
10.1109/FOCS.2018.00028
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
We study two basic problems regarding edit errors (insertions and deletions). The first one is document exchange, where two parties Alice and Bob hold two strings x and y with a bounded edit distance k. The goal is to have Alice send a short sketch to Bob, so that Bob can recover x based on y and the sketch. The second one is the fundamental problem of designing error correcting codes for edit errors, where the goal is to construct an explicit code to transmit a message x through a channel that can add at most k worst case insertions and deletions, so that the original message x can be successfully recovered at the other end of the channel. Both problems have been extensively studied for decades, and in this paper we focus on deterministic document exchange protocols and binary codes for insertions and deletions (insdel codes). If the length of x is n, then it is known that for small k (e.g., k <= n/4), in both problems the optimal sketch size or the optimal number of redundant bits is theta(k log n/k). In particular, this implies the existence of binary codes that can correct epsilon fraction of insertions and deletions with rate 1 - theta(epsilon log(1/epsilon)). However, known constructions are far from achieving these bounds. In this paper we significantly improve previous results on both problems. For document exchange, we give an efficient deterministic protocol with sketch size O(k log(2) n/k). This significantly improves the previous best known deterministic protocol, which has sketch size O(k(2) + k log(2) n) [2]. For binary insdel codes, we obtain the following results: An explicit binary insdel code which encodes an n-bit message x against k errors with redundancy O(k log(2) n/k). In particular this implies an explicit family of binary insdel codes that can correct epsilon fraction of insertions and deletions with rate 1 - O(epsilon log(2) (1/epsilon)) = 1 - O(epsilon). This ignificantly improves the previous best known result which only achieves rate 1 - O(root epsilon) [11], [10], and is optimal up to a log(1/epsilon) factor. An explicit binary insdel code which encodes an n-bit message x against k errors with redundancy O(k log n). This significantly improves the previous best known result of [4], which only works for constant k and has redundancy O(k(2) log k log n); and that of [2], which has redundancy O(k(2) k log(2) n). Our code has optimal redundancy for k <= n(1-alpha), any constant 0 < alpha < 1. This is the first explicit construction of binary insdel codes that has optimal redundancy for a wide range of error parameters k, and this brings our understanding of binary insdel codes much closer to that of standard binary error correcting codes. In obtaining our results we introduce several new techniques. Most notably, we introduce the notion of epsilon-self matching hash functions and epsilon-synchronization hash functions. We believe our techniques can have further applications in the literature.
引用
收藏
页码:200 / 211
页数:12
相关论文
共 22 条
[1]  
Alon Noga, 1992, RANDOM STRUCTURES AL
[2]  
[Anonymous], 1966, Soviet Physics Doklady
[3]  
[Anonymous], STOC
[4]  
Belazzougui Djamal, 2015, 151109229 CORR
[5]  
Belazzougui Djamal, 2016, FOCS
[6]  
Brakensiek J., 2017, IEEE T INFORM THEORY, P1
[7]  
Bukh Boris, 2016, SODA
[8]  
Cheng K., 2018, ARXIV E PRINTS
[9]  
Cormode Graham, 2000, SODA
[10]  
De A, 2010, LECT NOTES COMPUT SC, V6302, P504, DOI 10.1007/978-3-642-15369-3_38