Code vulnerability detection is a crucial approach to ensuring software security; it aims to automatically identify potential security vulnerabilities in source code. However, existing static vulnerability detection methods often suffer from feature loss and insufficient expressive power when extracting program features. To address these issues, we propose a source code vulnerability detection method based on a joint graph and multimodal feature fusion. We construct a joint graph that integrates multiple program dependencies and semantic edges to achieve more comprehensive feature extraction. By combining Graph Attention Networks (GATs) with the Transformer architecture, the method captures both structural and sequential features of code snippets, further enhancing the model's expressive power. Finally, we introduce pre-fusion and post-fusion strategies to integrate the multimodal features and thereby improve detection accuracy. Experimental results on the SARD dataset show that the method performs well across various types of vulnerabilities, achieving an F1 score of 85.20% and an accuracy of 86.50%. On the real-world Real-Vul dataset, the method achieves an F1 score of 73.9% and an accuracy of 86.50%. The detection results are also stable, indicating reliable and consistent performance. Overall, the proposed method outperforms existing mainstream detection approaches.
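To make the joint graph and the two fusion stages concrete, the following is a minimal sketch rather than the method's actual implementation. It assumes that statements are numbered nodes, that each dependency or semantic relation is supplied as a list of (source, target) pairs, that pre-fusion concatenates feature vectors, and that post-fusion is a weighted average of per-branch scores. The names `build_joint_graph`, `fuse_pre`, `fuse_post`, and the edge-type labels are hypothetical and chosen only for illustration.

```python
def build_joint_graph(num_nodes, edge_sets):
    """Merge several typed edge lists into one joint graph.

    edge_sets maps an edge-type name to a list of (src, dst) pairs.
    Returns {src: {dst: set of edge-type names}} so that an edge shared
    by several relations keeps every type instead of losing information.
    """
    joint = {node: {} for node in range(num_nodes)}
    for edge_type, edges in edge_sets.items():
        for src, dst in edges:
            if not (0 <= src < num_nodes and 0 <= dst < num_nodes):
                raise ValueError(f"edge ({src}, {dst}) is out of range")
            joint[src].setdefault(dst, set()).add(edge_type)
    return joint


def fuse_pre(graph_features, sequence_features):
    """Assumed pre-fusion: concatenate the two feature vectors."""
    return list(graph_features) + list(sequence_features)


def fuse_post(graph_score, sequence_score, weight=0.5):
    """Assumed post-fusion: weighted average of two branch scores."""
    return weight * graph_score + (1.0 - weight) * sequence_score


# Toy example: four statements with hypothetical relations.
edge_sets = {
    "control": [(0, 1), (1, 2), (1, 3)],   # control dependencies
    "data": [(0, 2), (0, 3)],              # data dependencies
    "sequence": [(0, 1), (1, 2), (2, 3)],  # statement order as a semantic edge
}
joint = build_joint_graph(4, edge_sets)
fused_vector = fuse_pre([1.0, 2.0], [3.0])
fused_score = fuse_post(0.8, 0.4)
```

In this sketch the edge from statement 0 to statement 1 carries both the control and the sequence labels, which illustrates why a single merged graph can retain relations that separate per-relation graphs would split apart. In a full model the graph branch would feed a GAT and the sequential branch a Transformer; both are omitted here.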