Facial Analysis for Plastic Surgery in the Era of Artificial Intelligence: A Comparative Evaluation of Multimodal Large Language Models

Cited by: 0
Authors
Haider, Syed Ali [1]
Prabha, Srinivasagam [1]
Gomez-Cabello, Cesar A. [1]
Borna, Sahar [1]
Genovese, Ariana [1]
Trabilsy, Maissa [1]
Elegbede, Adekunle [1]
Yang, Jenny Fei [1]
Galvao, Andrea [2]
Tao, Cui [3]
Forte, Antonio Jorge [1, 4]
Affiliations
[1] Mayo Clin, Div Plast Surg, Jacksonville, FL 32224 USA
[2] Univ Ctr UNICHRISTUS, Sch Dent, BR-60190180 Fortaleza, Brazil
[3] Mayo Clin, Dept Artificial Intelligence & Informat, Jacksonville, FL 32224 USA
[4] Mayo Clin, Ctr Digital Hlth, Rochester, MN 55905 USA
Keywords
artificial intelligence; multimodal large language models; large language models; facial analysis; facial plastic surgery; SATISFACTION; ORTHODONTICS; HEALTH; BEAUTY; FACE; AI
DOI
10.3390/jcm14103484
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background/Objectives: Facial analysis is critical for preoperative planning in facial plastic surgery, but traditional methods can be time-consuming and subjective. This study investigated the potential of artificial intelligence (AI) for objective and efficient facial analysis in plastic surgery, with a specific focus on multimodal large language models (MLLMs). We evaluated their ability to analyze facial skin quality, volume, symmetry, and adherence to aesthetic standards such as the neoclassical facial canons and the golden ratio.
Methods: We evaluated four MLLMs (ChatGPT-4o, ChatGPT-4, Gemini 1.5 Pro, and Claude 3.5 Sonnet) using two evaluation forms and 15 diverse facial images generated by a generative adversarial network (GAN). The general analysis form evaluated qualitative skin features (texture, type, thickness, wrinkling, photoaging, and overall symmetry). The facial ratios form assessed quantitative structural proportions, including division into equal fifths, adherence to the rule of thirds, and compatibility with the golden ratio. MLLM assessments were compared with evaluations from a plastic surgeon and with manual measurements of facial ratios.
Results: The MLLMs showed promise in analyzing qualitative features, but they struggled with precise quantitative measurements of facial ratios. Mean accuracies for the general analysis were 0.61 ± 0.49 for ChatGPT-4o, 0.60 ± 0.49 for Gemini 1.5 Pro, 0.57 ± 0.50 for ChatGPT-4, and 0.52 ± 0.50 for Claude 3.5 Sonnet. In the facial ratio assessments, scores were lower, with Gemini 1.5 Pro achieving the highest mean accuracy (0.39 ± 0.49). Inter-rater reliability, based on Cohen's kappa, ranged from poor to high for qualitative assessments (kappa > 0.7 for some questions) but was generally poor (kappa near or below zero) for quantitative assessments.
Conclusions: Current general-purpose MLLMs are not yet ready to replace manual clinical assessments but may assist in general facial feature analysis. These findings are based on testing models not specifically trained for facial analysis and serve to raise awareness among clinicians regarding the current capabilities and inherent limitations of readily available MLLMs in this specialized domain. The weak quantitative performance may stem from challenges with spatial reasoning and fine-grained detail extraction, which are inherent limitations of current MLLMs. Future research should focus on enhancing the numerical accuracy and reliability of MLLMs for broader application in plastic surgery, potentially through improved training methods and integration with other AI technologies such as specialized computer vision algorithms for precise landmark detection and measurement.
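The abstract reports two kinds of statistics: mean accuracy with a standard deviation, and Cohen's kappa for agreement between a model and the reference rater. The following is a minimal sketch of how such values can be computed. It is not the authors' analysis code. It assumes each answer is scored as correct (1) or incorrect (0), which the abstract does not state explicitly, and it uses the population standard deviation. The function names and the example ratings are hypothetical.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    label frequencies.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


def accuracy_mean_sd(correct_flags):
    """Mean and population standard deviation of 0/1 correctness scores."""
    if len(correct_flags) == 0:
        raise ValueError("need at least one score")
    n = len(correct_flags)
    mean = sum(correct_flags) / n
    variance = sum((x - mean) ** 2 for x in correct_flags) / n
    return mean, sqrt(variance)


# Hypothetical ratings for ten items (illustration only, not study data).
surgeon = ["yes", "yes", "yes", "yes", "yes", "no", "no", "no", "no", "no"]
model = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "yes"]

kappa = cohen_kappa(surgeon, model)
flags = [1 if s == m else 0 for s, m in zip(surgeon, model)]
mean_acc, sd_acc = accuracy_mean_sd(flags)

print(f"kappa = {kappa:.2f}")                      # kappa = 0.60
print(f"accuracy = {mean_acc:.2f} ± {sd_acc:.2f}")  # accuracy = 0.80 ± 0.40
```

Under the 0/1 scoring assumption, a mean accuracy p has a standard deviation of sqrt(p(1 - p)), which is about 0.49 for p = 0.61. Kappa corrects raw agreement for chance, so it can sit near or below zero even when raw accuracy looks moderate.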
Pages: 19