conLSH: Context based Locality Sensitive Hashing for mapping of noisy SMRT reads

Cited by: 8
Authors
Chakraborty, Angana [1]
Bandyopadhyay, Sanghamitra [1]
Affiliation
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata, India
Keywords
Locality Sensitive Hashing; Sequence analysis; Single Molecule Real-Time (SMRT) sequencing; Sequence alignment; PacBio dataset; Algorithm; NEAREST-NEIGHBOR; ALIGNMENT; ALGORITHMS;
DOI
10.1016/j.compbiolchem.2020.107206
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Single Molecule Real-Time (SMRT) sequencing is a recent advance in next-generation sequencing technology developed by Pacific Biosciences (PacBio). It produces an explosion of long and noisy reads, demanding cutting-edge research to get the most out of them. To deal with the high error probability of SMRT data, a novel contextual Locality Sensitive Hashing (conLSH) based algorithm is proposed in this article, which can effectively align the noisy SMRT reads to the reference genome. Here, sequences are hashed together based not only on their closeness, but also on similarity of context. The algorithm has an $O(n^{\rho+1})$ space requirement, where $n$ is the number of sequences in the corpus and $\rho$ is a constant. The indexing time and querying time are bounded by $O\!\left(n^{\rho+1} \cdot \frac{\ln n}{\ln (1/P_2)}\right)$ and $O(n^{\rho})$ respectively, where $P_2 > 0$ is a probability value. This algorithm is particularly useful for retrieving similar sequences, a widely used task in biology. The proposed conLSH-based aligner is compared with rHAT, a popular aligner for SMRT reads, and is found to outperform it in both speed and memory requirements. In particular, it takes approximately 24.2% less processing time, while saving about 70.3% in peak memory requirement for the H. sapiens PacBio dataset.
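The bounds quoted in the abstract have the same shape as the textbook LSH analysis, where an index holds roughly n^rho hash tables and each table key concatenates about ln n / ln(1/P_2) elementary hashes. The Python sketch below illustrates that relationship and a toy context-aware hash over fixed-length strings. It is not the authors' implementation: the collision probability `p1`, the function names, the `context` window, and the rule of sampling random start positions are all assumptions made for illustration, and rho is taken to be ln(1/P_1)/ln(1/P_2) as in the standard analysis.

```python
import math
import random
from collections import defaultdict


def lsh_parameters(n, p1, p2):
    """Textbook LSH parameter choice (assumed here, not taken from the paper).

    p1: assumed collision probability for near sequences.
    p2: collision probability for far sequences (P_2 in the abstract).
    """
    rho = math.log(1 / p1) / math.log(1 / p2)
    k = math.ceil(math.log(n) / math.log(1 / p2))  # hashes concatenated per table
    num_tables = math.ceil(n ** rho)               # number of hash tables
    return rho, k, num_tables


def make_context_hash(seq_len, k, context, rng):
    """Hypothetical context-aware hash: sample k start positions and read
    `context` consecutive symbols at each, so neighbours of a sampled
    position also have to agree for two sequences to collide."""
    starts = [rng.randrange(seq_len - context + 1) for _ in range(k)]

    def h(seq):
        return tuple(seq[s:s + context] for s in starts)

    return h


def build_index(seqs, k, num_tables, context, seed=0):
    """Hash every sequence into each table: n * num_tables * k hash steps."""
    rng = random.Random(seed)
    seq_len = len(seqs[0])  # toy assumption: all sequences share one length
    tables = []
    for _ in range(num_tables):
        h = make_context_hash(seq_len, k, context, rng)
        buckets = defaultdict(list)
        for i, s in enumerate(seqs):
            buckets[h(s)].append(i)
        tables.append((h, buckets))
    return tables


def query(tables, q):
    """Collect indices of all sequences sharing a bucket with q in any table."""
    hits = set()
    for h, buckets in tables:
        hits.update(buckets.get(h(q), []))
    return hits


if __name__ == "__main__":
    rho, k, num_tables = lsh_parameters(n=1000, p1=0.9, p2=0.5)
    print(rho, k, num_tables)
    seqs = ["ACGTACGTAC", "ACGTACGTAA", "TTTTGGGGCC"]
    tables = build_index(seqs, k=2, num_tables=4, context=3)
    print(query(tables, "ACGTACGTAC"))
```

Under these assumptions, building the index touches each of the n sequences once per table and evaluates k elementary hashes each time, which is where a bound of the form n^(rho+1) * ln n / ln(1/P_2) comes from, while a query only visits the n^rho tables.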
Pages: 8