A Decomposition of the Tikhonov Regularization Functional Oriented to Exploit Hybrid Multilevel Parallelism

Cited by: 21
Authors
Arcucci, Rossella [1, 2]
D'Amore, Luisa [1, 2]
Carracciuolo, Luisa [3]
Scotti, Giuseppe [1]
Laccetti, Giuliano [1]
Affiliations
[1] Univ Naples Federico II, Naples, Italy
[2] Euro Mediterranean Ctr Climate Changes CMCC, Lecce, Italy
[3] CNR, IPCB, Naples, Italy
Keywords
Tikhonov regularization; Large scale inverse problems; Parallel algorithm; Data assimilation; Laplace transform inversion; Performance; Algorithm; Software
DOI
10.1007/s10766-016-0460-3
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We introduce a decomposition of the Tikhonov Regularization (TR) functional that splits it into several TR functionals, suitably modified to enforce the matching of their solutions. Consequently, instead of solving one problem, we can solve several smaller problems that reproduce the initial one. This approach reduces the time complexity of the resulting algorithm. Since the subproblems are solved in parallel, the decomposition also reduces the overall execution time. The main outcome of the decomposition is that the parallel algorithm is designed to exploit the highest performance of parallel architectures where concurrency is implemented at both the coarsest and finest levels of granularity. Performance is analyzed in terms of algorithm and software scalability. Validation is performed on a reference parallel architecture consisting of a distributed-memory multiprocessor and a Graphics Processing Unit. Results are presented for the Data Assimilation problem in oceanographic models.
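The idea of splitting one TR functional into smaller TR functionals with an added matching term can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it assumes a diagonal operator (so every local solve has a closed form), two overlapping index ranges, and a hypothetical penalty weight `mu` that pulls each local solution toward its neighbour's values on the overlap. All function and parameter names are illustrative.

```python
def tikhonov_diag(a, b, lam):
    """Global minimizer of sum_i (a_i*x_i - b_i)^2 + lam*x_i^2 (diagonal operator)."""
    return [ai * bi / (ai * ai + lam) for ai, bi in zip(a, b)]


def local_solve(a, b, lam, mu, neighbour):
    """Local TR solve with an extra term mu*(x_i - y_i)^2 wherever the
    neighbouring subdomain supplies a value y_i (None means no overlap there)."""
    x = []
    for ai, bi, yi in zip(a, b, neighbour):
        if yi is None:
            x.append(ai * bi / (ai * ai + lam))
        else:
            # Setting the derivative to zero gives x*(a^2 + lam + mu) = a*b + mu*y.
            x.append((ai * bi + mu * yi) / (ai * ai + lam + mu))
    return x


def decomposed_tikhonov(a, b, lam, lo2, hi1, mu=1.0, sweeps=100):
    """Subdomain 1 covers [0, hi1), subdomain 2 covers [lo2, n); overlap is [lo2, hi1).
    Both local solves use the previous iterate, so they could run concurrently."""
    n = len(a)
    x1 = [0.0] * hi1
    x2 = [0.0] * (n - lo2)
    for _ in range(sweeps):
        nb1 = [x2[i - lo2] if i >= lo2 else None for i in range(hi1)]
        nb2 = [x1[i] if i < hi1 else None for i in range(lo2, n)]
        x1 = local_solve(a[:hi1], b[:hi1], lam, mu, nb1)
        x2 = local_solve(a[lo2:], b[lo2:], lam, mu, nb2)
    merged = []
    for i in range(n):
        if i < lo2:
            merged.append(x1[i])
        elif i < hi1:
            merged.append(0.5 * (x1[i] + x2[i - lo2]))  # average on the overlap
        else:
            merged.append(x2[i - lo2])
    return merged
```

In this toy setting the local iterates agree on the overlap in the limit, and the merged vector coincides with the global minimizer. For a general (non-diagonal) operator each local solve would be a linear system rather than a closed-form expression, and the convergence behaviour would depend on the operator and on the chosen overlap.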
Pages: 1214-1235
Number of pages: 22