Transformer-based embedding applied to classify bacterial species using sequencing reads

Authors
Gwak, Ho-Jin [1]
Rho, Mina [1]
Affiliations
[1] Hanyang Univ, Dept Comp Sci, Seoul, South Korea
Source
2022 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (IEEE BIGCOMP 2022) | 2022
Keywords
embedding; transformer; deep learning; classification; Staphylococcus species; SOFTWARE;
DOI
10.1109/BigComp54360.2022.00084
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the emergence of next-generation sequencing and metagenomic approaches, the need for read-level taxonomy classifiers has increased. Although the 16S rRNA gene sequence has been widely employed as a taxonomic marker, recent studies have revealed that 16S rRNA is not sufficient to assign species. Therefore, an accurate classifier is required to classify whole-genome sequencing reads into species. With the advancement of deep learning methods and natural language processing technologies, several studies have attempted to apply these methods to genomic data and have achieved state-of-the-art performance. In this study, we applied transformer-based embedding to bacterial genomes to accurately classify species using sequencing reads. As a case study, we classified Staphylococcus species using sequencing reads. Our model achieved ROC-AUC values of over 0.98 and 0.99 for 151 bp and 251 bp paired-end reads, respectively. Compared with the cutting-edge method Kraken2, our model classified significantly more S. aureus reads while maintaining comparable precision.
Pages: 374-377
Page count: 4