Osteosarcoma has the highest incidence rate among malignant bone tumours. It is highly malignant and therefore particularly detrimental to human health, accounting for 40% of primary malignant bone tumours with an incidence of 3–10 cases per million. Osteosarcoma is associated with high mortality and morbidity rates, particularly in underdeveloped nations, so early screening and diagnosis are essential to increase patient survival. In this study, we propose a novel encoder–decoder method that combines nested connections with an effective attention mechanism. The model consists of an encoder, a decoder, and skip connections. The DropBlock regularization technique discards local semantic information, which encourages the network to learn robust and useful features. An effective attention module uses appropriate cross-channel interaction to capture more detailed global information. In the skip connections, the semantic gap left by a direct, simple connection is bridged by a nested connection method that combines the feature maps obtained from the intermediate decoder with the original feature maps from the encoder. The original images are also augmented to increase robustness and to avoid the over-fitting caused by insufficient data. The robustness of the proposed model is examined on three separate data sets using a variety of performance indicators. The proposed model outperforms other current models, achieving an average accuracy of 99.13%, with accuracies of 99.67%, 98.16%, and 99.76% on Datasets 1, 2, and 3, respectively. The experimental results show that the proposed method significantly improves on the performance of convolutional neural networks and state-of-the-art methods.
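To make the DropBlock idea mentioned above concrete, the following is a minimal sketch in plain Python and not the implementation used in this study. It assumes a single 2D feature map stored as a list of lists. The function name `dropblock_2d` and the parameters `block_size`, `drop_prob`, and `rng` are hypothetical names introduced only for illustration. Block positions are sampled as top-left corners so that each block fits inside the map, and the surviving activations are rescaled to keep the overall magnitude roughly constant.

```python
import random


def dropblock_2d(feature, block_size, drop_prob, rng):
    """Apply a DropBlock-style mask to one 2D feature map (list of lists).

    Illustrative sketch only: all names and defaults here are assumptions.
    """
    h, w = len(feature), len(feature[0])
    if block_size > h or block_size > w:
        raise ValueError("block_size must not exceed the feature map size")
    if drop_prob <= 0.0:
        return [row[:] for row in feature]

    # Number of positions where a full block still fits inside the map.
    valid_h, valid_w = h - block_size + 1, w - block_size + 1
    # Seed rate chosen so that roughly drop_prob of all units are dropped.
    gamma = drop_prob / (block_size ** 2) * (h * w) / (valid_h * valid_w)

    mask = [[1.0] * w for _ in range(h)]
    for i in range(valid_h):
        for j in range(valid_w):
            if rng.random() < gamma:
                # Zero a contiguous block_size x block_size region.
                for di in range(block_size):
                    for dj in range(block_size):
                        mask[i + di][j + dj] = 0.0

    kept = sum(sum(row) for row in mask)
    if kept == 0:
        return [[0.0] * w for _ in range(h)]

    # Rescale the kept units to compensate for the dropped ones.
    scale = (h * w) / kept
    return [[feature[i][j] * mask[i][j] * scale for j in range(w)]
            for i in range(h)]


# Usage: mask an 8x8 map of ones with 3x3 blocks.
demo = dropblock_2d([[1.0] * 8 for _ in range(8)], 3, 0.3, random.Random(0))
for row in demo:
    print(" ".join(f"{v:.2f}" for v in row))
```

Unlike standard dropout, which zeroes individual units independently, this removes contiguous regions, so the network cannot recover a dropped activation from its immediate spatial neighbours. In practice the mask would be applied per channel and only during training.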