Transformer-based embedding applied to classify bacterial species using sequencing reads

Authors
Gwak, Ho-Jin [1]
Rho, Mina [1]
Affiliations
[1] Hanyang Univ, Dept Comp Sci, Seoul, South Korea
Source
2022 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (IEEE BIGCOMP 2022) | 2022
Keywords
embedding; transformer; deep learning; classification; Staphylococcus species; SOFTWARE;
DOI
10.1109/BigComp54360.2022.00084
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the emergence of next-generation sequencing and metagenomic approaches, the need for read-level taxonomic classifiers has increased. Although the 16S rRNA gene sequence has been widely used as a taxonomic marker, recent studies have shown that 16S rRNA alone is not sufficient to assign species. An accurate classifier is therefore required to assign whole-genome sequencing reads to species. With advances in deep learning and natural language processing, several studies have applied these methods to genomic data and achieved state-of-the-art performance. In this study, we applied transformer-based embedding to bacterial genomes to accurately classify species using sequencing reads. As a case study, we classified Staphylococcus species using sequencing reads. Our model achieved ROC-AUC values of over 0.98 and 0.99 for 151 bp and 251 bp paired-end reads, respectively. Compared with Kraken2, a cutting-edge method, our model classified significantly more S. aureus reads while maintaining comparable precision.
Pages: 374-377
Number of pages: 4