Automatic modulation recognition (AMR) is the task of identifying the modulation scheme of electromagnetic signals in a noncooperative manner. Deep learning-based methods have become a major research focus in the AMR field, but such models are typically trained on standardized data and rely on substantial computational and storage resources. In real-world applications, however, the limited resources of edge devices restrict the deployment of large-scale models, and conventional networks cannot handle real-world signals of varying length or with locally missing data. We therefore propose a network structure based on a convolutional Transformer with a dual-attention mechanism. The proposed structure exploits the inductive bias of lightweight convolution together with the global modeling capability of the Transformer, fusing local and global features to achieve high recognition accuracy. Moreover, the model adapts to the length of the input signal and remains robust to incomplete signals. Experimental results on the open-source datasets RML2016.10a, RML2016.10b, and RML2018.01a show that the proposed structure achieves 95.05%, 94.79%, and 98.14% accuracy, respectively, with enhancement training, and maintains greater than 90% accuracy when the signals are incomplete. In addition, the proposed structure has fewer parameters and lower computational cost than benchmark methods.
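To make the described fusion of local and global features concrete, the following is a minimal PyTorch sketch of a convolutional Transformer block: a lightweight depthwise convolution captures local structure, multi-head self-attention captures global dependencies, and the two are fused before a feed-forward layer. The layer sizes, the depthwise convolution, the additive fusion, and all module names here are illustrative assumptions, not the paper's exact dual-attention design.

```python
# Minimal sketch of a convolutional Transformer block fusing local (convolutional)
# and global (self-attention) features. Dimensions and fusion scheme are assumptions.
import torch
import torch.nn as nn


class ConvTransformerBlock(nn.Module):
    def __init__(self, dim: int = 64, num_heads: int = 4, kernel_size: int = 3):
        super().__init__()
        # Local branch: lightweight depthwise convolution along the time axis.
        self.local = nn.Conv1d(dim, dim, kernel_size, padding=kernel_size // 2, groups=dim)
        # Global branch: multi-head self-attention over all time steps.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim); any sequence length is accepted.
        local = self.local(x.transpose(1, 2)).transpose(1, 2)            # local features
        h = self.norm1(x)
        global_, _ = self.attn(h, h, h)                                   # global features
        x = x + local + global_                                           # additive fusion (assumed)
        return x + self.ffn(self.norm2(x))


if __name__ == "__main__":
    # Embedded I/Q samples with `dim` channels; the block handles varying lengths.
    block = ConvTransformerBlock(dim=64)
    for length in (128, 1024):
        out = block(torch.randn(2, length, 64))
        print(out.shape)  # (2, length, 64)
```

Because neither the convolution nor the attention assumes a fixed sequence length, a stack of such blocks can process signals of different lengths without retraining, which is the property the abstract highlights for real-world, possibly incomplete signals.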