Open-Source Text-to-Image Models: Evaluation using Metrics and Human Perception

Cited by: 0
Authors
Yamac, Aylin [1]
Genc, Dilan [1]
Zaman, Esra [1]
Gerschner, Felix [1]
Klaiber, Marco [1]
Theissler, Andreas [1]
Affiliations
[1] Aalen Univ Appl Sci, Aalen, Germany
Source
2024 IEEE 48TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC 2024 | 2024
Keywords
text-to-image; open-source; weaknesses
DOI
10.1109/COMPSAC61105.2024.00261
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Text-to-image models, which aim to convert text input into images, have gained popularity partly due to their flexibility and user-friendliness. However, there are still weaknesses in the generation of images intended to display emotions, visual text, multiple objects, relative positioning, and attribute binding. This study analyzes the weaknesses of three open-source models: Stable Diffusion v2-1, Openjourney, and Dreamlike Photoreal 2.0. The models are compared based on scores for quality, alignment, and aesthetics. The evaluation is based on (a) the metrics CLIPScore, Fréchet Inception Distance (FID), and Large-scale Artificial Intelligence Open Network (LAION) and (b) human perception obtained in user surveys. The evaluation revealed that all models show predominantly unsatisfactory performance, and the identified weaknesses were confirmed.
Pages: 1659-1664
Number of pages: 6