Under strong background noise, single-scale features are insufficient to comprehensively characterize complex fault patterns. Although multiscale frameworks have proven effective at extracting rich features, existing approaches are confined to single-domain feature extraction and often overlook the bidirectional flow and interaction of information during multiscale feature fusion. To overcome these limitations, this article proposes a multiscale time-frequency feature fusion method with bidirectional information interaction for bearing fault diagnosis. First, the data are preprocessed by combining the fast Fourier transform (FFT) and variational mode decomposition (VMD) to construct multiscale time-frequency features. Then, an adaptive multiscale convolutional neural network (AMCNN) is proposed to extract local features at different scales, while the self-attention mechanism of the Transformer architecture is used to capture and encode global features. Next, a bidirectional cross-attention module is introduced to enable bidirectional interaction and fusion between global and local features, yielding more discriminative representations. Finally, the effectiveness and superiority of the proposed method are validated on three bearing datasets. Experimental results demonstrate that the proposed framework achieves higher diagnostic accuracy and stability than state-of-the-art methods.
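The bidirectional cross-attention idea above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes the AMCNN produces a sequence of local feature vectors and the Transformer branch produces a sequence of global tokens in a shared feature dimension, and it fuses them by letting each stream attend to the other. All names, shapes, and the mean-pool-and-concatenate fusion step are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_seq, context_seq, Wq, Wk, Wv):
    # Scaled dot-product attention where queries come from one stream
    # and keys/values come from the other stream.
    Q = query_seq @ Wq
    K = context_seq @ Wk
    V = context_seq @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
d = 16                                  # shared feature dimension (assumed)
local = rng.standard_normal((32, d))    # stand-in for AMCNN local features
glob = rng.standard_normal((8, d))      # stand-in for Transformer global tokens

# One projection set per direction (random here; learned in practice).
Wl = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3)]
Wg = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3)]

# Bidirectional exchange: local features query the global tokens,
# and global tokens query the local features.
local_enriched = cross_attention(local, glob, *Wl)    # shape (32, d)
global_enriched = cross_attention(glob, local, *Wg)   # shape (8, d)

# Fuse the two enriched streams by mean-pooling and concatenating.
fused = np.concatenate([local_enriched.mean(axis=0),
                        global_enriched.mean(axis=0)])  # shape (2*d,)
```

In a trained model the projection matrices would be learned parameters and the fused vector would feed a classification head; the sketch only shows how information flows in both directions before fusion.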