Objective Owing to the increasing prevalence of oral diseases, the societal demand for oral medical diagnosis has grown steadily. This has increased the workload of oral health professionals and imposed higher requirements on their expertise and diagnostic efficiency. The interpretation of oral panoramic films is crucial for evaluating the oral health of patients. However, professional dentists are scarce in China, and reading a large number of films consumes a considerable portion of a doctor's diagnostic time. Artificial intelligence is increasingly applied in the medical field, particularly in medical image analysis, where it has yielded favorable results. Currently, most studies focus on individual tooth diseases. However, patients typically present multiple oral lesions simultaneously, including dental caries, apical periodontitis, furcation involvement, and impacted teeth. Owing to the complexity of these diseases, existing technologies cannot satisfy actual clinical requirements. This study uses a deep-learning network model to learn image features and to promptly and accurately identify diseased areas in oral panoramic films. The goal is to provide comprehensive results regarding conditions such as caries, periodontal disease, impacted teeth, and missing teeth, thereby helping doctors diagnose promptly and accurately and alleviating the diagnostic pressure stemming from inadequate medical resources. Methods In this study, we propose an efficient disease-recognition network named YOLO-Teeth (You only look once-teeth), which is based on YOLOv5s, to identify caries, impacted teeth, periapical periodontitis, and root furcation lesions. To enhance the feature-extraction capability of the backbone network, the Triplet attention mechanism is introduced so that the network recognizes lesions more accurately.
A BiFPN module is used in the neck to fully integrate deep and shallow features, ensuring that the network processes the complex information in panoramic films more effectively. The CIoU loss function is replaced by the MPDIoU loss function to improve the localization accuracy of the network. Results and Discussions Based on the data presented in Table 1 and Fig. 6, the Triplet attention module outperforms the five other attention mechanisms when the dimensionality reduction method is used in the oral-disease recognition model. YOLOv5s with the Triplet attention mechanism demonstrates the most stable detection performance across the disease targets, with minimal fluctuation in recognition performance among the four diseases. Additionally, the precision (P), recall (R), and mean average precision (P-mAP) of the model increase to 79.9%, 79.6%, and 85.9%, respectively, which is the best overall evaluation result. Table 2 shows that, compared with the YOLOv5s network, YOLO-Teeth achieves P, R, and P-mAP values that are higher by 5.0%, 3.2%, and 4.1%, respectively. Furthermore, YOLO-Teeth exhibits clear advantages over other mainstream detection networks, as shown in Table 3. Conclusions The YOLO-Teeth network proposed in this study is an efficient disease-recognition network based on YOLOv5s. Its feature-extraction capability is enhanced by introducing the Triplet attention module, and the integration of deep and shallow feature layers is improved using the BiFPN module. The CIoU loss function is replaced by the MPDIoU loss function, thereby enhancing the accuracy of lesion localization. Ablation and comparison experiments are conducted on an oral panoramic-film disease dataset. The experimental results show that, compared with the YOLOv5s network, YOLO-Teeth achieves P, R, and P-mAP values that are higher by 5.0%, 3.2%, and 4.1%, respectively.
YOLO-Teeth also shows clear advantages over other mainstream detection networks. Therefore, YOLO-Teeth is suitable for disease recognition in oral panoramic films. This study addresses the current research gap in obtaining comprehensive disease-recognition results. The findings can help doctors diagnose diseases promptly and accurately, thereby alleviating diagnostic pressure.
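
The replacement of CIoU by MPDIoU described in the Methods can be illustrated with a minimal Python sketch. It assumes the commonly cited definition of MPDIoU: the IoU of the two boxes minus the squared distance between their top-left corners and the squared distance between their bottom-right corners, each normalized by the squared diagonal of the input image. The function name `mpdiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` term are illustrative assumptions and not the implementation used in YOLO-Teeth.

```python
def mpdiou_loss(pred, target, img_w, img_h, eps=1e-9):
    """Sketch of an MPDIoU-style loss for one pair of boxes.

    pred, target: boxes as (x1, y1, x2, y2) in pixels (assumed layout).
    img_w, img_h: width and height of the input image in pixels.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    iou = inter / (area_p + area_t - inter + eps)

    # Squared distances between matching corners.
    d_top_left = (px1 - tx1) ** 2 + (py1 - ty1) ** 2
    d_bottom_right = (px2 - tx2) ** 2 + (py2 - ty2) ** 2

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d_top_left / norm - d_bottom_right / norm

    return 1.0 - mpdiou
```

Under this definition the loss is close to 0 for identical boxes and exceeds 1 for non-overlapping boxes, because the corner-distance terms still contribute when the IoU is zero. This is one plausible reason such a loss could improve localization for small lesions.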