A hardware algorithm for sorting is presented. The algorithm is based on a highly pipelined bit-serial architecture, and the processing time of the sorter is linearly proportional to the number of data items. The sorting cells are much smaller and simpler than previously reported sorter cells. A single chip that sorts 512 16-b keys has been designed in a 2-µm process and simulated at 240 MHz. The area-time performance of this chip is more than 60 times better than that of previously reported sorting engines.