Leveraging Clustering Techniques to Facilitate Metagenomic Analysis

Cited by: 1
Authors
Ennis, Damien [1]
Dascalu, Sergiu [1]
Harris, Frederick C., Jr. [1]
Affiliations
[1] Univ Nevada, Dept Comp Sci & Engn, Reno, NV 89557 USA
Funding
U.S. National Science Foundation
Keywords
Metagenomics; Clustering; K-means; Machine learning; Self-organizing map; SEARCH;
DOI
10.1080/10798587.2015.1073887
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Machine learning clustering algorithms provide excellent methods for conducting metagenomic analysis with efficiency. This study uses two machine learning algorithms, the self-organizing map and the K-means algorithms, to cluster data from an environmental sample collected from a hot springs habitat and to provide a visual analysis of that data. A data processing pipeline is described that uses the clustering algorithms to identify which reference genomes should be included for further analysis in determining possible organisms that are present in a metagenomic sample. The clustering revealed probable candidates for additional analysis, including a thermophilic, anaerobic bacterium, which is likely to be found in a hot springs environment and serves to validate the functionality of these tools. The machine learning techniques discussed here can serve as a launching point for elucidating protein sequences that could serve as possible reference comparisons to a specific metagenomic sample and lead to further study.
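As a rough illustration of the kind of K-means clustering the abstract mentions, the sketch below groups short DNA reads by their dinucleotide composition. It is not the authors' pipeline: the abstract does not say which features were clustered, so the k-mer frequency profile, the toy reads, and the helper names `kmer_profile`, `sq_dist`, and `kmeans` are all assumptions made for this example.

```python
from itertools import product
import random


def kmer_profile(seq, kmer_len=2):
    """Normalized k-mer frequency vector over the alphabet ACGT (assumed feature)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=kmer_len)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - kmer_len + 1):
        kmer = seq[i:i + kmer_len]
        if kmer in counts:  # skip windows with ambiguous bases such as N
            counts[kmer] += 1
    total = sum(counts.values()) or 1
    return [counts[m] / total for m in kmers]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style K-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy reads (made up for illustration): two AT-rich, two GC-rich.
reads = [
    "ATATATATTAATATAT",
    "TTAATATATATAATTA",
    "GCGCGGCCGCGCGGCC",
    "CCGGCGCGCCGGCGCG",
]
profiles = [kmer_profile(r) for r in reads]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

In this toy case the two AT-rich reads should land in one cluster and the two GC-rich reads in the other. Real metagenomic data would need longer k-mers, many more reads, and some care in choosing the number of clusters.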
Pages: 153-165
Number of pages: 13