An Automated Infrastructure to Support High-Throughput Bioinformatics

Cited by: 0
Authors
Cuccuru, Gianmauro [1]
Leo, Simone [1]
Lianas, Luca [1]
Muggiri, Michele [1]
Pinna, Andrea [1]
Pireddu, Luca [1]
Uva, Paolo [1]
Angius, Andrea [1]
Fotia, Giorgio [1]
Zanetti, Gianluigi [1]
Affiliation
[1] CRS4, Pula, CA, Italy
Keywords
Bioinformatics; NGS; MapReduce; Data management; Variants; Galaxy
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The number of domains affected by the big data phenomenon is constantly increasing, both in science and industry, with high-throughput DNA sequencers being among the most massive data producers. Building analysis frameworks that can keep up with such a high production rate, however, is only part of the problem: current challenges include dealing with articulated data repositories where objects are connected by multiple relationships; managing complex processing pipelines where each step depends on a large number of configuration parameters; and ensuring reproducibility, error control, and usability by nontechnical staff. Here we describe an automated infrastructure built to address these issues in the context of analyzing the data produced by the CRS4 next-generation sequencing facility. The system integrates open source tools, either written by us or publicly available, into a framework that can handle the whole data transformation process, from raw sequencer output to primary analysis results.
Pages: 600-607
Page count: 8
Related papers
50 in total
  • [21] High-Throughput and Automated Anion Transport Assays
    Yang, Kylie
    Lee, Lana C.
    Kotak, Hiral A.
    Morton, Evelyn R.
    Chee, Soo Mei
    Nguyen, Duy P. M.
    Keskkula, Alvaro
    Haynes, Cally J. E.
    CHEMISTRY-METHODS, 2025,
  • [22] Automated high-throughput DNA synthesis and assembly
    Ma, Yuxin
    Zhang, Zhaoyang
    Jia, Bin
    Yuan, Yingjin
    HELIYON, 2024, 10 (06)
  • [23] Automated, high-throughput serum glycoprofiling platform
    Stoeckmann, H.
    O'Flaherty, R.
    Adamczyk, B.
    Saldova, R.
    Rudd, P. M.
    INTEGRATIVE BIOLOGY, 2015, 7 (09) : 1026 - 1032
  • [24] A high-throughput adaptive computing infrastructure for bioinformatics research
    Pineo, S
    Wang, ZY
    18th International Conference on Systems Engineering, Proceedings, 2005, : 292 - 300
  • [25] A high-throughput overlay multicast infrastructure with network coding
    Wang, M
    Li, ZP
    Li, BC
    QUALITY OF SERVICE - IWQOS 2005, PROCEEDINGS, 2005, 3552 : 37 - 53
  • [26] High-throughput neuroimaging-genetics computational infrastructure
    Dinov, Ivo D.
    Petrosyan, Petros
    Liu, Zhizhong
    Eggert, Paul
    Hobel, Sam
    Vespa, Paul
    Moon, Seok Woo
    Van Horn, John D.
    Franco, Joseph
    Toga, Arthur W.
    FRONTIERS IN NEUROINFORMATICS, 2014, 8
  • [27] An infrastructure for high-throughput microscopy: Instrumentation, informatics, and integration
    Vaisberg, Eugeni A.
    Lenzi, David
    Hansen, Richard L.
    Keon, Brigitte H.
    Finer, Jeffrey T.
    MEASURING BIOLOGICAL RESPONSES WITH AUTOMATED MICROSCOPY, 2006, 414 : 484 - 512
  • [28] A high-throughput infrastructure for density functional theory calculations
    Jain, Anubhav
    Hautier, Geoffroy
    Moore, Charles J.
    Ong, Shyue Ping
    Fischer, Christopher C.
    Mueller, Tim
    Persson, Kristin A.
    Ceder, Gerbrand
    COMPUTATIONAL MATERIALS SCIENCE, 2011, 50 (08) : 2295 - 2310
  • [29] The challenges of delivering bioinformatics training in the analysis of high-throughput data
    Carvalho, Benilton S.
    Rustici, Gabriella
    BRIEFINGS IN BIOINFORMATICS, 2013, 14 (05) : 538 - 547
  • [30] High-throughput protein analysis integrating bioinformatics and experimental assays
    del Val, C
    Mehrle, A
    Falkenhahn, M
    Seiler, M
    Glatting, KH
    Poustka, A
    Suhai, S
    Wiemann, S
    NUCLEIC ACIDS RESEARCH, 2004, 32 (02) : 742 - 748