conLSH: Context based Locality Sensitive Hashing for mapping of noisy SMRT reads

Cited by: 8
Authors
Chakraborty, Angana [1]
Bandyopadhyay, Sanghamitra [1]
Affiliations
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata, India
Keywords
Locality Sensitive Hashing; Sequence analysis; Single Molecule Real-Time (SMRT) sequencing; Sequence alignment; PacBio dataset; Algorithm; NEAREST-NEIGHBOR; ALIGNMENT; ALGORITHMS
DOI
10.1016/j.compbiolchem.2020.107206
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Single Molecule Real-Time (SMRT) sequencing is a recent advancement in next-generation sequencing technology developed by Pacific Biosciences (PacBio). It produces an explosion of long and noisy reads, demanding cutting-edge research to get the most out of it. To deal with the high error probability of SMRT data, this article proposes a novel contextual Locality Sensitive Hashing (conLSH) based algorithm that can effectively align noisy SMRT reads to the reference genome. Here, sequences are hashed together based not only on their closeness but also on the similarity of their context. The algorithm has an $O(n^{\rho+1})$ space requirement, where $n$ is the number of sequences in the corpus and $\rho$ is a constant. The indexing time and querying time are bounded by $O\left(\frac{n^{\rho+1} \ln n}{\ln(1/P_2)}\right)$ and $O(n^{\rho})$ respectively, where $P_2 > 0$ is a probability value. This algorithm is particularly useful for retrieving similar sequences, a widely used task in biology. The proposed conLSH based aligner is compared with rHAT, a popular aligner for SMRT reads, and is found to outperform it in both speed and memory requirements. In particular, it takes approximately 24.2% less processing time while saving about 70.3% in peak memory requirement for the H. sapiens PacBio dataset.
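The idea of hashing sequences by closeness and by context can be illustrated with a small Python sketch. This is a simplified toy under stated assumptions, not the authors' conLSH implementation: the function names (`make_context_hash`, `build_index`, `query`), the parameters (`num_positions`, `context`) and the example sequences are all hypothetical. Each hash function samples a few anchor positions and reads every anchor together with its neighbouring symbols, so two sequences share a bucket only if they agree on the sampled symbols and on their surrounding context.

```python
import random
from collections import defaultdict


def make_context_hash(seq_len, num_positions, context, seed=0):
    """Build one hash function (hypothetical helper, not from the paper).

    It samples `num_positions` anchor positions and reads each anchor
    together with `context` neighbouring symbols on either side.
    """
    rng = random.Random(seed)
    candidates = range(context, seq_len - context)
    anchors = sorted(rng.sample(candidates, num_positions))

    def h(seq):
        # Concatenate the context windows around every sampled anchor.
        return "|".join(seq[a - context:a + context + 1] for a in anchors)

    return h


def build_index(sequences, hash_funcs):
    """One hash table per hash function; buckets hold sequence ids."""
    tables = [defaultdict(list) for _ in hash_funcs]
    for idx, seq in enumerate(sequences):
        for table, h in zip(tables, hash_funcs):
            table[h(seq)].append(idx)
    return tables


def query(seq, tables, hash_funcs):
    """Return ids of all sequences sharing at least one bucket with `seq`."""
    hits = set()
    for table, h in zip(tables, hash_funcs):
        hits.update(table.get(h(seq), []))
    return hits


# Toy corpus of equal-length sequences (assumed data, for illustration only).
corpus = ["ACGTACGTACGT", "ACGTACGAACGT", "TTTTGGGGCCCC"]
funcs = [make_context_hash(12, num_positions=2, context=1, seed=s) for s in range(4)]
tables = build_index(corpus, funcs)
candidates = query("ACGTACGTACGT", tables, funcs)
```

In this sketch the exact query always retrieves its own entry, and the unrelated third sequence can never collide with it because the two share no identical window at any position. Whether the noisy second sequence is retrieved depends on which anchors the random sampling picks, which is the probabilistic behaviour an LSH scheme trades for speed.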
Pages: 8