ClaPIM: Scalable Sequence Classification Using Processing-in-Memory

Cited by: 2
Authors
Khalifa, Marcel [1]
Hoffer, Barak [1]
Leitersdorf, Orian [1]
Hanhan, Robert [1]
Perach, Ben [1]
Yavits, Leonid [2]
Kvatinsky, Shahar [1]
Affiliations
[1] Technion - Israel Institute of Technology, Andrew and Erna Viterbi Faculty of Electrical and Computer Engineering, IL-3200003 Haifa, Israel
[2] Bar-Ilan University, Alexander Kofkin Faculty of Engineering, IL-5290002 Ramat Gan, Israel
Funding
European Research Council
Keywords
Accelerator; approximate string matching; bioinformatics; deoxyribonucleic acid (DNA) classification; processing-in-memory (PIM)
DOI
10.1109/TVLSI.2023.3293038
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deoxyribonucleic acid (DNA) sequence classification is a fundamental task in computational biology with vast implications for applications such as disease prevention and drug design. Therefore, fast high-quality sequence classifiers are significantly important. This article introduces ClaPIM, a scalable DNA sequence classification architecture based on the emerging concept of hybrid in-crossbar and near-crossbar memristive processing-in-memory (PIM). We enable efficient and high-quality classification by uniting the filter and search stages within a single algorithm. Specifically, we propose a custom filtering technique that drastically narrows the search space and a search approach that facilitates approximate string matching through a distance function. ClaPIM is the first PIM architecture for scalable approximate string matching that benefits from the high density of memristive crossbar arrays and the massive computational parallelism of PIM. Compared with Kraken2, a state-of-the-art software classifier, ClaPIM provides significantly higher classification quality (up to 20x improvement in F1 score) and also demonstrates a 1.8x throughput improvement. Compared with edit distance tolerant approximate matching (EDAM), a recently proposed static random-access memory (SRAM)-based accelerator that is restricted to small datasets, we observe both a 30.4x improvement in normalized throughput per area and a 7% increase in classification precision.
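The filter-then-search idea described in the abstract can be illustrated in software. The following is only a minimal sketch, not the ClaPIM algorithm or its hardware mapping: the abstract does not specify the filter or the distance function, so the prefix-bucket filter, the Hamming distance, the majority vote, and all names and parameters (build_index, classify, k, prefix_len, max_dist) are assumptions made for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def hamming(a, b):
    """Count mismatching positions (assumed stand-in for the paper's distance function)."""
    return sum(x != y for x, y in zip(a, b))


def build_index(references, k, prefix_len):
    """Bucket reference k-mers by their first prefix_len bases (hypothetical filter)."""
    index = defaultdict(list)
    for label, seq in references.items():
        for km in kmers(seq, k):
            index[km[:prefix_len]].append((km, label))
    return index


def classify(read, index, k, prefix_len, max_dist):
    """Filter by prefix bucket, then search the bucket with a distance threshold."""
    votes = defaultdict(int)
    for km in kmers(read, k):
        # Filter stage: only k-mers sharing the prefix are compared.
        for ref_km, label in index.get(km[:prefix_len], []):
            # Search stage: approximate match within max_dist mismatches.
            if hamming(km, ref_km) <= max_dist:
                votes[label] += 1
    if not votes:
        return None  # read left unclassified
    return max(votes, key=votes.get)


if __name__ == "__main__":
    refs = {"A": "ACGTACGTAC", "B": "TTGGCCAATT"}  # toy reference set
    idx = build_index(refs, k=6, prefix_len=3)
    print(classify("ACGTACGAAC", idx, k=6, prefix_len=3, max_dist=1))
```

In this sketch the filter discards any candidate whose mismatch falls inside the prefix, so it trades some sensitivity for a much smaller search space; a real design would have to tune that trade-off.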
Pages: 1347-1357
Page count: 11
Related papers
38 in total
  • [21] BEACON: Scalable Near-Data-Processing Accelerators for Genome Analysis near Memory Pool with the CXL Support
    Huangfu, Wenqin
    Malladi, Krishna T.
    Chang, Andrew
    Xie, Yuan
    2022 55TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2022, : 727 - 743
  • [22] A scalable multiple pairwise protein sequence alignment acceleration using hybrid CPU–GPU approach
    Luay Alawneh
    Mohammed A. Shehab
    Mahmoud Al-Ayyoub
    Yaser Jararweh
    Ziad A. Al-Sharif
    Cluster Computing, 2020, 23 : 2677 - 2688
  • [23] Functional Annotation of Proteins using Domain Embedding based Sequence Classification
    Sarker, Bishnu
    Ritchie, David
    Aridhi, Sabeur
    KDIR: PROCEEDINGS OF THE 11TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT - VOL 1: KDIR, 2019, : 163 - 170
  • [24] Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega
    Sievers, Fabian
    Wilm, Andreas
    Dineen, David
    Gibson, Toby J.
    Karplus, Kevin
    Li, Weizhong
    Lopez, Rodrigo
    McWilliam, Hamish
    Remmert, Michael
    Soeding, Johannes
    Thompson, Julie D.
    Higgins, Desmond G.
    MOLECULAR SYSTEMS BIOLOGY, 2011, 7
  • [25] A scalable multiple pairwise protein sequence alignment acceleration using hybrid CPU-GPU approach
    Alawneh, Luay
    Shehab, Mohammed A.
    Al-Ayyoub, Mahmoud
    Jararweh, Yaser
    Al-Sharif, Ziad A.
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2020, 23 (04): : 2677 - 2688
  • [26] DNA Pre-Alignment Filter Using Processing Near Racetrack Memory
    Hameed, Fazal
    Khan, Asif Ali
    Ollivier, Sebastien
    Jones, Alex K.
    Castrillon, Jeronimo
    IEEE COMPUTER ARCHITECTURE LETTERS, 2022, 21 (02) : 53 - 56
  • [27] BASTA - Taxonomic classification of sequences and sequence bins using last common ancestor estimations
    Kahlke, Tim
    Ralph, Peter J.
    METHODS IN ECOLOGY AND EVOLUTION, 2019, 10 (01): : 100 - 103
  • [28] A Generic and Scalable Architecture for a Large Acoustic Model and Large Vocabulary Speech Recognition Accelerator Using Logic on Memory
    Bapat, Ojas A.
    Franzon, Paul D.
    Fastow, Richard M.
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2014, 22 (12) : 2701 - 2712
  • [29] Identification and classification of promoters using the attention mechanism based on long short-term memory
    Qingwen Li
    Lichao Zhang
    Lei Xu
    Quan Zou
    Jin Wu
    Qingyuan Li
    Frontiers of Computer Science, 2022, 16
  • [30] Identification and classification of promoters using the attention mechanism based on long short-term memory
    Qingwen LI
    Lichao ZHANG
    Lei XU
    Quan ZOU
    Jin WU
    Qingyuan LI
    Frontiers of Computer Science, 2022, 16 (04) : 105 - 111