Facial Analysis for Plastic Surgery in the Era of Artificial Intelligence: A Comparative Evaluation of Multimodal Large Language Models

Cited by: 0
Authors
Haider, Syed Ali [1]
Prabha, Srinivasagam [1]
Gomez-Cabello, Cesar A. [1]
Borna, Sahar [1]
Genovese, Ariana [1]
Trabilsy, Maissa [1]
Elegbede, Adekunle [1]
Yang, Jenny Fei [1]
Galvao, Andrea [2]
Tao, Cui [3]
Forte, Antonio Jorge [1, 4]
Affiliations
[1] Mayo Clin, Div Plast Surg, Jacksonville, FL 32224 USA
[2] Univ Ctr UNICHRISTUS, Sch Dent, BR-60190180 Fortaleza, Brazil
[3] Mayo Clin, Dept Artificial Intelligence & Informat, Jacksonville, FL 32224 USA
[4] Mayo Clin, Ctr Digital Hlth, Rochester, MN 55905 USA
Keywords
artificial intelligence; multimodal large language models; large language models; facial analysis; facial plastic surgery; SATISFACTION; ORTHODONTICS; HEALTH; BEAUTY; FACE; AI
DOI
10.3390/jcm14103484
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background/Objectives: Facial analysis is critical for preoperative planning in facial plastic surgery, but traditional methods can be time-consuming and subjective. This study investigated the potential of Artificial Intelligence (AI) for objective and efficient facial analysis in plastic surgery, with a specific focus on Multimodal Large Language Models (MLLMs). We evaluated their ability to analyze facial skin quality, volume, symmetry, and adherence to aesthetic standards such as neoclassical facial canons and the golden ratio. Methods: We evaluated four MLLMs (ChatGPT-4o, ChatGPT-4, Gemini 1.5 Pro, and Claude 3.5 Sonnet) using two evaluation forms and 15 diverse facial images generated by a Generative Adversarial Network (GAN). The general analysis form evaluated qualitative skin features (texture, type, thickness, wrinkling, photoaging, and overall symmetry). The facial ratios form assessed quantitative structural proportions, including division into equal fifths, adherence to the rule of thirds, and compatibility with the golden ratio. MLLM assessments were compared with evaluations from a plastic surgeon and manual measurements of facial ratios. Results: The MLLMs showed promise in analyzing qualitative features, but they struggled with precise quantitative measurements of facial ratios. Mean accuracies for general analysis were ChatGPT-4o (0.61 ± 0.49), Gemini 1.5 Pro (0.60 ± 0.49), ChatGPT-4 (0.57 ± 0.50), and Claude 3.5 Sonnet (0.52 ± 0.50). In facial ratio assessments, scores were lower, with Gemini 1.5 Pro achieving the highest mean accuracy (0.39 ± 0.49). Inter-rater reliability, based on Cohen's kappa values, ranged from poor to high for qualitative assessments (kappa > 0.7 for some questions) but was generally poor (near or below zero) for quantitative assessments. Conclusions: Current general-purpose MLLMs are not yet ready to replace manual clinical assessments but may assist in general facial feature analysis. These findings are based on testing models not specifically trained for facial analysis and serve to raise awareness among clinicians regarding the current capabilities and inherent limitations of readily available MLLMs in this specialized domain. The weakness in quantitative measurement may stem from challenges with spatial reasoning and fine-grained detail extraction, which are inherent limitations of current MLLMs. Future research should focus on enhancing the numerical accuracy and reliability of MLLMs for broader application in plastic surgery, potentially through improved training methods and integration with other AI technologies such as specialized computer vision algorithms for precise landmark detection and measurement.
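The two statistics reported in the abstract, mean accuracy with its spread and Cohen's kappa, can be illustrated with a minimal Python sketch. The abstract does not state how individual answers were scored, so the 0/1 scoring below is an assumption, and the function names and sample labels are hypothetical illustrations rather than the authors' code or data. Under that assumption, the standard deviation of a binary score with mean p is sqrt(p(1 - p)), which for p = 0.61 is about 0.49.

```python
from collections import Counter
from math import sqrt


def mean_and_sd(correct):
    """Mean accuracy and population SD of 0/1 correctness scores (assumed scoring)."""
    n = len(correct)
    p = sum(correct) / n
    # For binary scores the population SD reduces to sqrt(p * (1 - p)).
    return p, sqrt(p * (1 - p))


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Observed agreement: share of items on which both raters give the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six images: a surgeon versus one model.
surgeon = ["oily", "dry", "oily", "normal", "dry", "oily"]
model = ["oily", "dry", "normal", "normal", "dry", "dry"]
print(round(cohen_kappa(surgeon, model), 2))  # 0.52

# 61 correct answers out of 100 under the assumed 0/1 scoring.
accuracy, sd = mean_and_sd([1] * 61 + [0] * 39)
print(round(accuracy, 2), round(sd, 2))  # 0.61 0.49
```

A kappa near zero means agreement no better than chance, which is how the quantitative (facial ratio) results in the abstract should be read.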
Pages: 19