Technology Enablers for Big Data, Multi-Stage Analysis in Medical Image Processing

Cited by: 0
Authors
Bao, Shunxing [1 ]
Parvathaneni, Prasanna [1 ]
Huo, Yuankai [1 ]
Barve, Yogesh [1 ]
Plassard, Andrew J. [1 ]
Yao, Yuang [1 ]
Sun, Hongyang [1 ]
Lyu, Ilwoo [1 ]
Zald, David H. [2 ]
Landman, Bennett A. [1 ]
Gokhale, Aniruddha [1 ]
Affiliations
[1] Vanderbilt Univ, Dept Elect Engn & Comp Sci, Nashville, TN 37235 USA
[2] Vanderbilt Univ, Dept Psychiat & Psychol, Nashville, TN 37235 USA
Source
2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2018
Keywords
Hadoop; Medical image processing; Big data multi-stage analysis; Simulator; REGISTRATION ALGORITHMS; BRAIN; MAPREDUCE;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Big data medical image processing applications involving multi-stage analysis often exhibit significant variability in processing times, ranging from a few seconds to several days. Moreover, because traditional software technologies and platforms enforce sequential execution of the analysis stages, errors in the pipeline are detected only at the later stages, even though they originate predominantly in the highly compute-intensive first stage. This wastes precious computing resources and incurs prohibitively high costs for re-executing the application. The medical image processing community to date remains largely unaware of these issues and continues to use traditional high-performance computing clusters, which incur a high operating cost due to the use of dedicated resources and expensive centralized file systems. To overcome these challenges, this paper proposes an alternative approach for multi-stage analysis in medical image processing by using the Apache Hadoop ecosystem and offering it as a service in the cloud. We make the following contributions. First, we propose a concurrent pipeline execution framework and an associated semi-automatic, real-time monitoring and checkpointing framework that can detect outliers and achieve quality assurance without having to completely execute the expensive first stage of processing, thereby expediting the entire multi-stage analysis. Second, we present a simulator to rapidly estimate the execution time for a given multi-stage analysis, which can help users decide on the appropriate approach for their use cases. We conduct an empirical evaluation of our framework and show that it requires 76.75% less wall time and 29.22% less resource time than the traditional approach, which lacks such a quality assurance mechanism.
Pages: 1337-1346
Page count: 10
Related papers
50 in total
  • [41] A COMPARATIVE ANALYSIS OF CONVENTIONAL HADOOP WITH PROPOSED CLOUD ENABLED HADOOP FRAMEWORK FOR SPATIAL BIG DATA PROCESSING
    Tripathi, A. K.
    Agrawal, S.
    Gupta, R. D.
    ISPRS TC V MID-TERM SYMPOSIUM GEOSPATIAL TECHNOLOGY - PIXEL TO PEOPLE, 2018, 4-5 : 425 - 430
  • [42] Edge Detection of Medical Image Processing using Vector Field Analysis
    Chucherd, Sirikan
    2014 11TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE), 2014, : 58 - 63
  • [43] Productivity frameworks in big data image processing computations - creating photographic mosaics with Hadoop and Scalding
    Szul, Piotr
    Bednarz, Tomasz
    2014 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, 2014, 29 : 2306 - 2314
  • [44] Analysis of Multi-diseases using Big Data for improvement in Healthcare
    Adil, Asif
    Kar, Hushmat Amin
    Jangir, Rajendra
    Sofi, Shabir Ahmad
    2015 IEEE UP SECTION CONFERENCE ON ELECTRICAL COMPUTER AND ELECTRONICS (UPCON), 2015,
  • [45] A Dietary Nutrition Analysis Method Leveraging Big Data Processing and Fuzzy Clustering
    Lei, Lihui
    Cai, Yuan
    HEALTH INFORMATION SCIENCE, HIS 2016, 2016, 10038 : 129 - 135
  • [46] Scaling up MapReduce-based Big Data Processing on Multi-GPU systems
    Jiang, Hai
    Chen, Yi
    Qiao, Zhi
    Weng, Tien-Hsiung
    Li, Kuan-Ching
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2015, 18 (01): : 369 - 383
  • [47] Scaling up MapReduce-based Big Data Processing on Multi-GPU systems
    Hai Jiang
    Yi Chen
    Zhi Qiao
    Tien-Hsiung Weng
    Kuan-Ching Li
    Cluster Computing, 2015, 18 : 369 - 383
  • [48] Social Media Data Processing Infrastructure by Using Apache Spark Big Data Platform: Twitter Data Analysis
    Podhoranyi, Michal
    Vojacek, Lukas
    2019 4TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTERNET OF THINGS (CCIOT 2019), 2019, : 1 - 6
  • [49] BioM2: biologically informed multi-stage machine learning for phenotype prediction using omics data
    Zhang, Shunjie
    Li, Pan
    Wang, Shenghan
    Zhu, Jijun
    Huang, Zhongting
    Cai, Fuqiang
    Freidel, Sebastian
    Ling, Fei
    Schwarz, Emanuel
    Chen, Junfang
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (05)
  • [50] A Novel Big Data Processing Approach to Feature Extraction for Electrical Discharge Machining based on Container Technology
    Alimadji, Denata Rizky
    Hung, Min-Hsiung
    Lin, Yu-Chuan
    Suryajaya, Benny
    Chen, Chao-Chun
    22ND IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD 2021-FALL), 2021, : 142 - 147