Massively Parallel Expectation Maximization Using Graphics Processing Units

Cited by: 0
Authors
Altinigneli, Muzaffer Can [1]
Plant, Claudia [2]
Boehm, Christian [1]
Affiliations
[1] Univ Munich, Munich, Germany
[2] Tech Univ Munich, Helmholtz Zentrum Munchen, Munich, Germany
Source
19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13) | 2013
Keywords
Expectation Maximization; Graphics Processing Unit; CUDA; Fermi; ALGORITHMS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Composed of several hundred processors, the Graphics Processing Unit (GPU) has become a very interesting platform for computationally demanding tasks on massive data. A special hierarchy of processors and fast memory units allows very powerful and efficient parallelization but also demands novel parallel algorithms. Expectation Maximization (EM) is a widely used technique for maximum likelihood estimation. In this paper, we propose an innovative EM clustering algorithm particularly suited for the GPU platform on NVIDIA's Fermi architecture. The central idea of our algorithm is to allow the parallel threads to exchange their local information asynchronously and thus update their cluster representatives on demand, using a technique called Asynchronous Model Updates (Async-EM). Async-EM enables our algorithm not only to accelerate convergence but also to reduce the overhead induced by memory bandwidth limitations and synchronization requirements. We demonstrate (1) how to reformulate the EM algorithm so that it can exchange information using Async-EM and (2) how to exploit the special memory and processor architecture of a modern GPU in order to share this information among threads in an optimal way. As an outlook, Async-EM is not limited to EM but can be applied to a variety of algorithms.
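
The abstract does not spell out the update rule, so the following is only a rough CPU-side illustration of the general idea of refreshing cluster parameters before a full pass over the data has finished. It uses block-wise incremental EM on a 1-D Gaussian mixture, which is a swapped-in, sequential stand-in and not the authors' Async-EM or a GPU implementation. The function `blockwise_em`, its parameters `block_size`, `sweeps`, and `min_var`, and all helper names are assumptions made for this sketch.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """E-step for one point: posterior probability of each cluster."""
    joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    if total == 0.0:  # every density underflowed; fall back to uniform
        return [1.0 / len(joint)] * len(joint)
    return [j / total for j in joint]


def blockwise_em(data, init_means, block_size=64, sweeps=20, min_var=1e-6):
    """Incremental EM for a 1-D Gaussian mixture (hypothetical helper).

    The model is refreshed after every block of points instead of once
    per full pass. This emulates "update on demand" sequentially; it is
    an assumption-laden sketch, not the method from the paper.
    """
    k, n = len(init_means), len(data)
    weights = [1.0 / k] * k
    means = list(init_means)
    variances = [1.0] * k

    # Initial E-step so the running statistics match the starting model.
    resp = [responsibilities(x, weights, means, variances) for x in data]
    # Global sufficient statistics per cluster: sum r, sum r*x, sum r*x^2.
    s0 = [sum(r[j] for r in resp) for j in range(k)]
    s1 = [sum(r[j] * x for r, x in zip(resp, data)) for j in range(k)]
    s2 = [sum(r[j] * x * x for r, x in zip(resp, data)) for j in range(k)]

    for _ in range(sweeps):
        for start in range(0, n, block_size):
            for i in range(start, min(start + block_size, n)):
                x, old = data[i], resp[i]
                new = responsibilities(x, weights, means, variances)
                # Swap this point's old contribution for its new one.
                for j in range(k):
                    delta = new[j] - old[j]
                    s0[j] += delta
                    s1[j] += delta * x
                    s2[j] += delta * x * x
                resp[i] = new
            # M-step right after the block, not after the whole data set.
            for j in range(k):
                if s0[j] <= 1e-12:
                    continue  # cluster is effectively empty; keep old values
                weights[j] = s0[j] / n
                means[j] = s1[j] / s0[j]
                variances[j] = max(s2[j] / s0[j] - means[j] ** 2, min_var)
    return weights, means, variances
```

Calling `blockwise_em(data, [-1.0, 1.0])` on a list of floats returns the mixture weights, means, and variances. On a GPU, each block of points would presumably map to a group of threads that share their statistics through fast on-chip memory, but that mapping is not shown here and is not taken from the paper.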
Pages: 838-846
Page count: 9
Related papers
50 in total
  • [41] Parallel terrain visibility calculation on the graphics processing unit
    Strnad, Damjan
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2011, 23 (18) : 2452 - 2462
  • [42] LU Decomposition Method implementation using Graphics Processing Units - GPU
    Gomez, Yensy
    Osorio, John
    Perez, Lina
    2014 9TH COMPUTING COLOMBIAN CONFERENCE (9CCC), 2014, : 184 - 189
  • [43] Accelerating molecular dynamics simulations using Graphics Processing Units with CUDA
    Liu, Weiguo
    Schmidt, Bertil
    Voss, Gerrit
    Mueller-Wittig, Wolfgang
    COMPUTER PHYSICS COMMUNICATIONS, 2008, 179 (09) : 634 - 641
  • [44] HIGH RESOLUTION DISASTER DATA CLUSTERING USING GRAPHICS PROCESSING UNITS
    Kurte, Kudeep R.
    Durbha, Surya S.
    2013 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2013, : 1696 - 1699
  • [45] Training Optimum-Path Forest on Graphics Processing Units
    Iwashita, Adriana S.
    Romero, Marcos V. T.
    Baldassin, Alexandro
    Costa, Kelton A. P.
    Papa, Joao P.
    PROCEEDINGS OF THE 2014 9TH INTERNATIONAL CONFERENCE ON COMPUTER VISION, THEORY AND APPLICATIONS (VISAPP 2014), VOL 2, 2014, : 581 - 588
  • [46] AN APPROACH TO EFFICIENT FEM SIMULATIONS ON GRAPHICS PROCESSING UNITS USING CUDA
    Nutti, Bjorn
    Marinkovic, Dragan
    FACTA UNIVERSITATIS-SERIES MECHANICAL ENGINEERING, 2014, 12 (01) : 15 - 25
  • [47] Pipelined Iterative Solvers with Kernel Fusion for Graphics Processing Units
    Rupp, Karl
    Weinbub, Josef
    Juengel, Ansgar
    Grasser, Tibor
    ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2016, 43 (02):
  • [48] Optimizing the computation of a parallel 3D finite difference algorithm for graphics processing units
    Porter-Sobieraj, J.
    Cygert, S.
    Kikola, D.
    Sikorski, J.
    Slodkowski, M.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2015, 27 (06) : 1591 - 1602
  • [49] Parallel option pricing with Fourier space time-stepping method on graphics processing units
    Surkov, Vladimir
    PARALLEL COMPUTING, 2010, 36 (07) : 372 - 380
  • [50] Efficient parallel algorithm for multiple sequence alignments with regular expression constraints on graphics processing units
    Lin, Chun Yuan
    Lin, Yu Shiang
    INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2014, 9 (1-2) : 11 - 20