We present an algorithm to solve the subset-sum problem (SSP) of capacity c and n items with weights w(i),1 <= i <= n, spending O(n(m - w(min))/p) time and O(n + m - w(min)) space in the Concurrent Read/Concurrent Write (CRCW) PRAM model with 1 <= p <= m - w(min) processors, where w(min) is the lowest weight and m=min {c,Sigma(n)(i) (=)1 w(i)-c}, improving both upper-bounds. Thus, when n <= c, it is possible to solve the SSP in O(n) time in parallel environments with low memory. We also show OpenMP and CUDA implementations of this algorithm and, on Graphics Processing Unit, we obtained better performance than the best sequential and parallel algorithms known so far. Copyright (c) 2015 John Wiley & Sons, Ltd.