The increasingly frequent network intrusions have brought serious impacts to the production and life, thus malicious network traffic detection has received more and more attention in recent years. However, the traditional rule matching-based and machine learning-based malicious network traffic detection methods have the problems of relying on human experience as well as low detection efficiency. The continuous development of deep learning technology provides new ideas to solve malicious network traffic detection, and the deep learning models are also widely used in the field of malicious network traffic detection. Compared with other deep learning models, bidirectional temporal convolutional network (BiTCN) has achieved better detection results due to its ability to obtain bidirectional semantic features of network traffic, but it does not consider the different meanings as well as different importance of different subsequence segments in network traffic sequences; In addition, the loss function used in BiTCN is the negative log likelihood function, which may lead to overfitting problems when facing multi-classification problems and data imbalance problems. To solve these problems, this paper proposes a malicious network traffic detection model based on BiTCN and multi-head self-attention (MHSA) mechanism, namely BiTCN_MHSA, it innovatively uses the MHSA mechanism to assign different weights to different subsequences of network traffic, thus making the model more focused on the characteristics of malicious network traffic as well as improving the efficiency of processing global network traffic; Moreover, it also changes its loss function to a cross-entropy loss function to penalize misclassification more severely, thereby speeding up the convergence. Finally, extensive experiments are conduced to evaluate the efficiency of proposed BiTCN_MHSA model on two public network traffic, the experimental results verify that the proposed BiTCN_MHSA model outperforms six state-of-the-arts in precision, recall, F1-measure and accuracy.