Objective
To evaluate and compare the accuracy of publicly available large language models (LLMs), namely ChatGPT-4.0, ChatGPT-3.5, and Google Bard, in answering multiple-choice postgraduate-level surgical examination questions.

Methods
PubMed/MEDLINE and the Cochrane Library were searched for studies that compared the accuracy of ChatGPT and Google Bard on multiple-choice postgraduate-level surgical examination questions. A random-effects model was used to estimate and compare the pooled accuracy of the LLMs, with results reported with 95% confidence intervals (CIs). Heterogeneity was assessed using the I² statistic, and publication bias was evaluated through funnel plots and Egger's test. Statistical significance was set at P < 0.05.

Results
The full texts of 12 studies published between 2023 and 2024 were reviewed, and data were extracted to compare the performance of ChatGPT (GPT-3.5 or GPT-4.0) with that of Google Bard (since rebranded as Gemini). ChatGPT-4.0 exhibited the highest pooled accuracy, at 73% (95% CI: 65%-80%, P < 0.01, I² = 94%). No statistically significant difference was observed between ChatGPT-3.5 and Google Bard (OR: 0.98, 95% CI: 0.80-1.21, P = 0.88, I² = 67%), indicating that the two models performed at a similar level. A statistically significant difference was found between ChatGPT-4.0 and Google Bard (OR: 2.25, 95% CI: 1.73-2.91, P < 0.01, I² = 78%), demonstrating the superior performance of ChatGPT-4.0.

Conclusion
This meta-analysis highlights the strong potential of LLMs to pass postgraduate-level surgical examinations. Of the three LLMs, ChatGPT-4.0 demonstrated the highest accuracy, while Google Bard showed the most inconsistent performance, scoring below 50% in 4 of the 12 studies analyzed. These findings suggest that LLMs, particularly ChatGPT-4.0, can apply surgical knowledge to problem solving, with potential future applications in medical education and patient care.
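
As a purely illustrative aside, the sketch below shows how a random-effects pooled accuracy and I² estimate of the kind reported above might be computed. It uses the DerSimonian-Laird estimator on the log-odds scale; the per-study counts, the estimator choice, and all variable names are assumptions for illustration, not the review's actual data or analysis software.

```python
# Minimal sketch of DerSimonian-Laird random-effects pooling on the
# log-odds scale, with an I^2 heterogeneity estimate. The per-study
# counts below are hypothetical placeholders, not data from the review.
import numpy as np

# Hypothetical (correct, incorrect) answer counts per study.
studies = [(120, 40), (85, 55), (200, 90), (60, 50)]

# Per-study log-odds of a correct answer and their variances.
y = np.array([np.log(c / i) for c, i in studies])
v = np.array([1 / c + 1 / i for c, i in studies])

# Fixed-effect (inverse-variance) weights and Cochran's Q.
w = 1 / v
y_fixed = np.sum(w * y) / np.sum(w)
q = np.sum(w * (y - y_fixed) ** 2)
df = len(studies) - 1

# DerSimonian-Laird between-study variance (tau^2), floored at zero.
c_term = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (q - df) / c_term)

# Random-effects weights, pooled log-odds, and 95% CI.
w_re = 1 / (v + tau2)
y_re = np.sum(w_re * y) / np.sum(w_re)
se_re = np.sqrt(1 / np.sum(w_re))
ci_lo, ci_hi = y_re - 1.96 * se_re, y_re + 1.96 * se_re

# I^2: proportion of total variability due to between-study heterogeneity.
i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

# Back-transform pooled log-odds to a pooled proportion (accuracy).
def expit(x):
    return 1 / (1 + np.exp(-x))

print(f"Pooled accuracy: {expit(y_re):.2f} "
      f"(95% CI {expit(ci_lo):.2f}-{expit(ci_hi):.2f}), I^2 = {i2:.0f}%")
```

The back-transformation via the logistic (expit) function converts the pooled log-odds and its CI limits into the percentage-accuracy scale used in the results above; an odds ratio comparing two models would instead be pooled from per-study 2x2 tables.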