Current multicore graphics processing unit (GPU) architectures, designed for parallel data processing, have become applicable to general-purpose computation. One example in image content processing is the automated iris recognition system, whose stages are highly computation-intensive algorithms. These stages rely on the extraction of texture features, which are required to analyze iris content. The localization and feature extraction processes in particular can benefit from the parallel computing power of GPUs. We present a scalable parallelization of GPU-based localization and feature extraction, with demonstrated speedups of 9.6 and 14.8 times, respectively, and of 12.4 times when these two stages are combined with the GPU-based iris matching stage from our previous work, compared with the CPU-based version of the whole system. Specifically, we implemented an iris recognition system based on Daugman's system for training and classification in C#, and executed the CUDA-C code on an NVIDIA GTX 460 (Fermi) card with 336 cores.