Cone-beam image reconstruction, such as the reconstruction of CT projection values, is computationally very demanding. The most time-consuming step is the backprojection, which is often limited by memory bandwidth. Recently, a novel general-purpose architecture optimized for distributed computing became available: the Cell Broadband Engine (CBE). Its eight synergistic processing elements (SPEs) currently allow for a theoretical peak performance of 192 GFlops (3 GHz, 8 units, 4 floats per vector, 2 instructions, multiply and add, per clock). Our aim is to maximize the image reconstruction speed for flat-panel-based cone-beam CT such as micro-CT or C-arm CT. Therefore, we implemented a highly optimized perspective cone-beam backprojection algorithm on the Cell processor. Data mining techniques and double buffering of source data were used extensively to make optimal use of both the memory bandwidth and the available local store of each SPE. The voxel-driven backprojection code uses 32-bit floating-point arithmetic and bilinear interpolation between neighboring detector channels. The interpolation is performed in two stages: the detector is first upsampled by bilinear interpolation to double the number of detector pixels, and a nearest-neighbor lookup is then used during backprojection. Performance was measured by backprojecting simulated data with 512 cone-beam projections per full rotation and 1024 by 1024 detector elements. The data were backprojected into a volume of 512³ voxels fully contained in the field of measurement, using both an optimized PC-based (CPU-based) approach and the new Cell-based (CBE-based) algorithm. Both the PC and the CBE were clocked at 3 GHz. The PC-based backprojection takes 3.2 min, whereas the CBE version finishes within 13.6 s. Using both CBEs of our dual Cell-based blade (Mercury Computer Systems), the cone-beam backprojection can be completed in 6.8 s.
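The two-stage interpolation and the quoted performance figures can be illustrated with a minimal pure-Python sketch. It is not the SPE implementation: the function names (`bilinear`, `upsample2`, `nearest`), the half-pixel upsampling grid, and the list-of-rows detector layout are assumptions made for illustration only. The arithmetic at the end uses only the numbers stated above.

```python
import math

def bilinear(img, u, v):
    """Bilinear interpolation of img (list of rows) at row u, column v.
    Assumes 0 <= u <= rows-1, 0 <= v <= cols-1 and at least 2x2 pixels."""
    rows, cols = len(img), len(img[0])
    i0 = min(int(math.floor(u)), rows - 2)
    j0 = min(int(math.floor(v)), cols - 2)
    du, dv = u - i0, v - j0
    return ((1 - du) * (1 - dv) * img[i0][j0]
            + (1 - du) * dv * img[i0][j0 + 1]
            + du * (1 - dv) * img[i0 + 1][j0]
            + du * dv * img[i0 + 1][j0 + 1])

def upsample2(img):
    """Stage 1 (assumed layout): resample the detector on a half-pixel grid,
    roughly doubling the pixel count in each direction."""
    rows, cols = len(img), len(img[0])
    return [[bilinear(img, i / 2, j / 2) for j in range(2 * cols - 1)]
            for i in range(2 * rows - 1)]

def nearest(up, u, v):
    """Stage 2: nearest-neighbor lookup in the upsampled detector, with
    (u, v) given in the coordinates of the original detector."""
    i = int(math.floor(2 * u + 0.5))
    j = int(math.floor(2 * v + 0.5))
    return up[i][j]

# Figures taken from the text above.
peak_gflops = 3.0 * 8 * 4 * 2        # GHz * SPEs * floats per vector * flops per clock
pc_seconds = 3.2 * 60                # PC-based backprojection time in seconds
speedup_one = pc_seconds / 13.6      # one CBE versus the PC
speedup_two = pc_seconds / 6.8       # both CBEs of the blade versus the PC
voxel_updates = 512 ** 3 * 512       # voxels times projections
updates_per_second = voxel_updates / 13.6  # single-CBE throughput

if __name__ == "__main__":
    detector = [[0.0, 1.0], [2.0, 3.0]]
    up = upsample2(detector)
    # At half-pixel positions the two-stage lookup matches direct bilinear
    # interpolation; in between it is only an approximation.
    print(nearest(up, 0.5, 0.5), bilinear(detector, 0.5, 0.5))
    print(peak_gflops, speedup_one, speedup_two, updates_per_second)
```

With these numbers, a single CBE is roughly 14 times faster than the PC and the dual-CBE blade roughly 28 times faster. The two-stage scheme moves the interpolation arithmetic out of the inner backprojection loop at the cost of a detector image that is about four times larger.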