Webshell detection is highly important for network security protection. Conventional methods are based on keywords matching, which heavily relies on experiences of domain experts when facing emerging malicious webshells of various kinds. Recently, machine learning, especially supervised learning, is introduced for webshell detection and has proved to be a great success. As one of state-of-the-art work, neural network (NN) is designed to input a large number of features and enable deep learning. Thus, how to properly combine the advantages of automatic feature selection and the advantages of expert knowledge-based way has become a key issue. Considering that special features to indicate unexpected webshell behaviors for a target business system are usually simple but effective, in this work, we propose a novel approach for improving webshell detection based on convolutional neural network (CNN) through reinforcement learning. We utilize the reinforcement learning of asynchronous advantage actor-critic (A3C) for automatic feature selection, aiming to maximize the expected accuracy of the CNN classifier on a validation dataset by sequentially interacting with the feature space. Moreover, considering the sparseness of feature values, we build the CNN classifier with two convolutional layers and a global pooling. Extensive experiments and analysis have been conducted to demonstrate the effectiveness of our proposed method.