Leveraging Clustering Techniques to Facilitate Metagenomic Analysis

Times cited: 1
Authors
Ennis, Damien [1]
Dascalu, Sergiu [1]
Harris, Frederick C., Jr. [1]
Affiliations
[1] Univ Nevada, Dept Comp Sci & Engn, Reno, NV 89557 USA
Funding
National Science Foundation (USA);
Keywords
Metagenomics; Clustering; K-means; Machine learning; Self-organizing map; SEARCH;
DOI
10.1080/10798587.2015.1073887
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Machine learning clustering algorithms provide excellent methods for conducting metagenomic analysis with efficiency. This study uses two machine learning algorithms, the self-organizing map and the K-means algorithms, to cluster data from an environmental sample collected from a hot springs habitat and to provide a visual analysis of that data. A data processing pipeline is described that uses the clustering algorithms to identify which reference genomes should be included for further analysis in determining possible organisms that are present in a metagenomic sample. The clustering revealed probable candidates for additional analysis, including a thermophilic, anaerobic bacterium, which is likely to be found in a hot springs environment and serves to validate the functionality of these tools. The machine learning techniques discussed here can serve as a launching point for elucidating protein sequences that could serve as possible reference comparisons to a specific metagenomic sample and lead to further study.
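As a rough illustration of the K-means step mentioned in the abstract, the sketch below clusters a few toy DNA reads by their dinucleotide frequencies. It is not the authors' pipeline: the k-mer frequency features, the toy reads, and the helper names `kmer_profile`, `squared_distance`, and `kmeans` are all assumptions made here for illustration, and the abstract does not say which features or implementation the study used.

```python
import itertools
import random


def kmer_profile(seq, k=2):
    """Return the normalized k-mer frequency vector of seq over ACGT.

    Assumption: k-mer frequencies are used as features only for this
    illustration; the source abstract does not specify its features.
    """
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases such as N
            counts[word] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[w] / total for w in kmers]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's algorithm: assign to nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point takes the label of its nearest centroid.
        labels = [
            min(range(n_clusters), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy reads: two GC-rich and two AT-rich sequences.
reads = ["GCGCGGCCGCGC", "GGCCGCGCGGCC", "ATATTAATATAT", "TTAATATATTAA"]
profiles = [kmer_profile(r) for r in reads]
labels, centroids = kmeans(profiles, n_clusters=2)
print(labels)
```

In this toy case the two GC-rich reads share one label and the two AT-rich reads share the other; which group receives label 0 depends on the random initialization. Real metagenomic data would need longer k-mers, many more reads, and a tested library implementation.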
Pages: 153-165
Page count: 13
Related papers
50 in total
  • [21] MetaDecoder: a novel method for clustering metagenomic contigs
    Cong-Cong Liu
    Shan-Shan Dong
    Jia-Bin Chen
    Wang, Chen
    Ning, Pan
    Guo, Yan
    Tie-Lin Yang
    MICROBIOME, 2022, 10 (01)
  • [22] Metagenomic read clustering based on overlap graphs
    Balvert, Marleen
    Schoenhuth, Alexander
    Dutilh, Bas
    2018 IEEE 8TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL ADVANCES IN BIO AND MEDICAL SCIENCES (ICCABS), 2018,
  • [23] The Metagenomic Binning Problem: Clustering Markov Sequences
    Greenberg, Grant
    Shomorony, Ilan
    IEEE TRANSACTIONS ON MOLECULAR BIOLOGICAL AND MULTI-SCALE COMMUNICATIONS, 2024, 10 (01): : 32 - 42
  • [24] Targeted demand response for flexible energy communities using clustering techniques
    Pelekis, Sotiris
    Pipergias, Angelos
    Karakolis, Evangelos
    Mouzakitis, Spiros
    Santori, Francesca
    Ghoreishi, Mohammad
    Askounis, Dimitris
    SUSTAINABLE ENERGY GRIDS & NETWORKS, 2023, 36
  • [25] Customer segmentation issues and strategies for an automobile dealership with two clustering techniques
    Tsai, Chih-Fong
    Hu, Ya-Han
    Lu, Yu-Hsin
    EXPERT SYSTEMS, 2015, 32 (01) : 65 - 76
  • [26] High Performance Clustering Techniques: A Survey
    Savvas, Ilias K.
    Michos, Christos
    Chernov, Andrey
    Butakova, Maria
    PROCEEDINGS OF THE FOURTH INTERNATIONAL SCIENTIFIC CONFERENCE INTELLIGENT INFORMATION TECHNOLOGIES FOR INDUSTRY (IITI'19), 2020, 1156 : 252 - 259
  • [27] Estimating of Software Quality with Clustering Techniques
    Gupta, Deepak
    Goyal, Vinay Kr
    Mittal, Harish
    2013 THIRD INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING & COMMUNICATION TECHNOLOGIES (ACCT 2013), 2013, : 20 - 27
  • [28] Electrical Load Profile Analysis Using Clustering Techniques
    Damayanti, R.
    Abdullah, A. G.
    Purnama, W.
    Nandiyanto, A. B. D.
    1ST ANNUAL APPLIED SCIENCE AND ENGINEERING CONFERENCE (AASEC), IN CONJUCTION WITH THE INTERNATIONAL CONFERENCE ON SPORT SCIENCE, HEALTH, AND PHYSICAL EDUCATION (ICSSHPE), 2017, 180
  • [29] Analysis of meteorological conditions in Spain by means of clustering techniques
    Arroyo, Angel
    Herrero, Alvaro
    Tricio, Veronica
    Corchado, Emilio
    JOURNAL OF APPLIED LOGIC, 2017, 24 : 76 - 89
  • [30] Geometric Gait Clustering for Unobtrusive Analysis
    Ellison, Grant
    Markovic, Milla Penelope
    Yazdansepas, Delaram
    2023 IEEE 19TH INTERNATIONAL CONFERENCE ON BODY SENSOR NETWORKS, BSN, 2023,