Lung cancer is among the most lethal diseases, which makes early detection essential for timely therapeutic intervention. Deep learning has substantially improved lung cancer prediction by learning discriminative patterns from large healthcare datasets. This paper proposes a framework that combines deep learning with integrated reinforcement learning to improve the accuracy of lung cancer diagnosis from CT scans. The dataset used in this study consists of CT scans from healthy individuals and from patients at various stages of lung cancer. We address class imbalance through elastic transformation and apply data augmentation to improve model generalization. For multi-class classification of lung tumors, five pre-trained convolutional neural network architectures (DenseNet201, EfficientNetB7, VGG16, MobileNet, and VGG19) are fine-tuned via transfer learning. To further improve performance, we introduce a weighted average ensemble model, "DEV-MV", coupled with grid search hyperparameter optimization, which achieves a diagnostic accuracy of 99.40%. The integration of ensemble reinforcement learning further improves the robustness and reliability of the predictions. These results suggest that the approach offers an accurate and scalable solution for automated early detection of lung cancer.
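
The weighted average ensemble with grid search mentioned above can be illustrated with a minimal sketch. This is not the DEV-MV implementation: the function names `weighted_average_ensemble` and `grid_search_weights`, the grid step, and the use of validation accuracy as the selection criterion are all assumptions made for illustration. The sketch assumes each base model has already produced class probabilities for a held-out validation set.

```python
import itertools

import numpy as np


def weighted_average_ensemble(probs, weights):
    """Combine per-model class probabilities with a weighted average.

    probs:   array of shape (n_models, n_samples, n_classes)
    weights: array of shape (n_models,), non-negative, not all zero
    Returns an array of shape (n_samples, n_classes).
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    # Contract the model axis: sum_m weights[m] * probs[m]
    return np.tensordot(weights, probs, axes=1)


def grid_search_weights(probs, labels, step=0.25):
    """Exhaustively search a coarse weight grid (hypothetical helper).

    Returns the normalized weights with the highest validation accuracy
    and that accuracy. The step size is an illustrative assumption.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n_models = probs.shape[0]
    grid = np.arange(0.0, 1.0 + 1e-9, step)

    best_weights, best_acc = None, -1.0
    for combo in itertools.product(grid, repeat=n_models):
        total = sum(combo)
        if total == 0:  # skip the all-zero weight vector
            continue
        preds = weighted_average_ensemble(probs, combo).argmax(axis=1)
        acc = float((preds == labels).mean())
        if acc > best_acc:
            best_weights, best_acc = np.array(combo) / total, acc
    return best_weights, best_acc
```

Because the grid contains the one-hot weight vectors, the selected ensemble is never worse on the validation set than the best single model. The search cost grows exponentially with the number of models, so a coarse step is the practical choice for five base networks.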