Blob analysis is widely used in target detection, object recognition, and moving-target tracking, among other applications. Because it is computationally expensive, blob analysis has become a bottleneck in real-time applications. To address this problem, this paper proposes a parallel algorithm for blob analysis and presents a hardware-efficient architecture for it. First, a novel parallel blob analysis algorithm based on image data partitioning and multiple processing units is proposed to handle objects of different types and sizes. Second, a dynamic convex hull calculation method is designed that is highly efficient for parallel processing and for sub-block merging in connected component labeling. Third, a parallel hardware structure for the proposed algorithm is designed and implemented on an FPGA. To evaluate performance, blobs of different types and sizes are located with the proposed algorithm in both software and hardware. The experimental results demonstrate that the proposed algorithm locates the blobs effectively and correctly, and that the proposed hardware architecture is more efficient than state-of-the-art methods.
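
To make the idea of partitioned labeling with sub-block merging concrete, the following is a minimal Python sketch and not the algorithm or hardware design of this paper. It rests on several assumptions: the image is binary, it is split into horizontal strips, pixels are 4-connected, and each blob is summarized by an axis-aligned bounding box as a simpler stand-in for the convex hull. The names `UnionFind`, `label_strip`, and `find_blobs` are hypothetical, and the strips are processed sequentially here with a shared label table, whereas a parallel implementation would assign each strip to its own processing unit.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (min_row, min_col, max_row, max_col)


class UnionFind:
    """Label equivalence table (hypothetical helper for this sketch)."""

    def __init__(self) -> None:
        self.parent: List[int] = []

    def make(self) -> int:
        # Create a fresh provisional label that is its own root.
        self.parent.append(len(self.parent))
        return len(self.parent) - 1

    def find(self, x: int) -> int:
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a: int, b: int) -> None:
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[max(ra, rb)] = min(ra, rb)


def label_strip(image: Image, top: int, bottom: int,
                uf: UnionFind, labels: List[List[int]]) -> None:
    """Label rows [top, bottom) without looking at any other strip."""
    for r in range(top, bottom):
        for c in range(len(image[r])):
            if not image[r][c]:
                continue
            # The row above is ignored on the first row of the strip.
            up = labels[r - 1][c] if r > top else -1
            left = labels[r][c - 1] if c > 0 else -1
            if up < 0 and left < 0:
                labels[r][c] = uf.make()
            else:
                labels[r][c] = up if up >= 0 else left
                if up >= 0 and left >= 0:
                    uf.union(up, left)


def find_blobs(image: Image, n_strips: int) -> List[Box]:
    """Return sorted bounding boxes of the 4-connected foreground blobs."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[-1] * cols for _ in range(rows)]
    uf = UnionFind()
    bounds = [i * rows // n_strips for i in range(n_strips + 1)]

    # Step 1: label every strip independently (parallelizable in principle).
    for top, bottom in zip(bounds, bounds[1:]):
        label_strip(image, top, bottom, uf, labels)

    # Step 2: merge labels that touch across each strip boundary.
    for b in bounds[1:-1]:
        if 0 < b < rows:
            for c in range(cols):
                if labels[b - 1][c] >= 0 and labels[b][c] >= 0:
                    uf.union(labels[b - 1][c], labels[b][c])

    # Step 3: accumulate one bounding box per merged label.
    boxes: Dict[int, Box] = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] < 0:
                continue
            root = uf.find(labels[r][c])
            if root in boxes:
                r0, c0, r1, c1 = boxes[root]
                boxes[root] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
            else:
                boxes[root] = (r, c, r, c)
    return sorted(boxes.values())
```

In this sketch the number of strips affects only how the work is divided, not the result: merging along the strip boundaries restores the components that the partition cut apart, so calling `find_blobs` with one strip or with several yields the same boxes.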