This paper presents a parallel boundary element method (BEM) procedure for elastostatic problems. Starting from a reference collocation integration algorithm, a preliminary study, devoted mainly to analysing the performance of the method, was first carried out. This study identified three main areas offering potential improvements. Several enhanced versions of the algorithm, incorporating the suggested modifications, were then tested and numerically characterized on test cases with varying boundary conditions and mesh densities. Finally, a parallel, high-efficiency, GPU-based version, able to process large domains together with their internal solutions, was developed and is discussed in terms of cost/benefit ratio and overall performance. (C) 2014 Elsevier Ltd. All rights reserved.