Transformer-based embedding applied to classify bacterial species using sequencing reads

Cited by: 0
Authors
Gwak, Ho-Jin [1]
Rho, Mina [1]
Affiliations
[1] Hanyang Univ, Dept Comp Sci, Seoul, South Korea
Source
2022 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (IEEE BIGCOMP 2022) | 2022
Keywords
embedding; transformer; deep learning; classification; Staphylococcus species; SOFTWARE;
DOI
10.1109/BigComp54360.2022.00084
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the emergence of next-generation sequencing and metagenomic approaches, the need for read-level taxonomic classifiers has grown. Although the 16S rRNA gene sequence has been widely employed as a taxonomic marker, recent studies have revealed that 16S rRNA is not sufficient for species-level assignment. Therefore, an accurate classifier is required to classify whole-genome sequencing reads into species. With the advancement of deep learning methods and natural language processing technologies, several studies have applied these methods to genomic data and achieved state-of-the-art performance. In this study, we applied transformer-based embedding to bacterial genomes to accurately classify species using sequencing reads. As a case study, we classified Staphylococcus species using sequencing reads. Our model achieved ROC-AUC values of over 0.98 and 0.99 for 151 bp and 251 bp paired-end reads, respectively. Compared with Kraken2, a cutting-edge method, our model classified significantly more S. aureus reads while maintaining comparable precision.
Pages: 374-377
Number of pages: 4