ClaPIM: Scalable Sequence Classification Using Processing-in-Memory

Cited by: 2

Authors:

Khalifa, Marcel [1]

Hoffer, Barak [1]

Leitersdorf, Orian [1]

Hanhan, Robert [1]

Perach, Ben [1]

Yavits, Leonid [2]

Kvatinsky, Shahar [1]

Affiliations:

[1] Technion Israel Inst Technol, Andrew & Erna Viterbi Fac Elect & Comp Engn, IL-3200003 Haifa, Israel

[2] Bar Ilan Univ, Alexander Kofkin Fac Engn, IL-5290002 Ramat Gan, Israel

Source:

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS | 2023, Vol. 31, No. 9

Funding:

European Research Council;

Keywords:

Accelerator; approximate string matching; bioinformatics; deoxyribonucleic acid (DNA) classification; processing-in-memory (PIM);

DOI:

10.1109/TVLSI.2023.3293038

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Deoxyribonucleic acid (DNA) sequence classification is a fundamental task in computational biology with vast implications for applications such as disease prevention and drug design. Therefore, fast high-quality sequence classifiers are significantly important. This article introduces ClaPIM, a scalable DNA sequence classification architecture based on the emerging concept of hybrid in-crossbar and near-crossbar memristive processing-in-memory (PIM). We enable efficient and high-quality classification by uniting the filter and search stages within a single algorithm. Specifically, we propose a custom filtering technique that drastically narrows the search space and a search approach that facilitates approximate string matching through a distance function. ClaPIM is the first PIM architecture for scalable approximate string matching that benefits from the high density of memristive crossbar arrays and the massive computational parallelism of PIM. Compared with Kraken2, a state-of-the-art software classifier, ClaPIM provides significantly higher classification quality (up to 20x improvement in F1 score) and also demonstrates a 1.8x throughput improvement. Compared with edit distance tolerant approximate matching (EDAM), a recently proposed static random-access memory (SRAM)-based accelerator that is restricted to small datasets, we observe both a 30.4x improvement in normalized throughput per area and a 7% increase in classification precision.

Pages: 1347-1357

Page count: 11
