Vision Transformers (ViTs) for Feature Extraction and Classification of AI-Generated Visual Designs

Cited by: 0
Authors
Yun, Qing [1]
Affiliations
[1] Kyungil Univ, Sch Int Exchange, Gyongsan 38428, South Korea
Keywords
Artificial intelligence; Art; Visualization; Deep learning; Transformers; Accuracy; Data models; Computer vision; Feature extraction; Computational modeling; convolutional neural network; deep learning; vision transformer; visual aesthetics
DOI
N/A
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline classification code
0812
Abstract
Deep learning has become a cornerstone of modern Artificial Intelligence (AI), enabling machines to process and interpret complex visual information with unprecedented accuracy. As AI-generated content becomes more realistic, the ability to distinguish between machine-created and human-created images is increasingly important. This challenge extends beyond technical concerns, influencing digital media credibility, intellectual property rights, and the integrity of visual communication. Developing robust classification models to accurately attribute image origins is crucial for ensuring transparency, preventing misinformation, and upholding artistic authenticity in an era of rapidly evolving generative AI technologies. This study addresses the critical need to differentiate between AI-generated and human-generated aesthetic images through the application of advanced deep learning models. We investigate the effectiveness of advanced deep learning architectures, including High-Resolution Networks (HRNet) and Vision Transformers (ViT), which are generally accurate when used to infer the creative characteristics of human visual art. The proposed ViT model, which employs self-attention to process images as sequences of patches for feature extraction, is examined for its potential to capture global contextual relationships within images, which is essential for recognizing the nuanced differences between AI and human artistry. ViT achieves 97% accuracy, compared with 95% for the HRNet model; this superior performance validates the ability of its transformer structure to analyze and learn the complex image features that disclose their origin. This research highlights the potential of using sophisticated deep learning techniques to address the challenges of content authenticity in digital media.
By leveraging the unique strengths of each model, we provide insights into their applicability and effectiveness in distinguishing between different forms of digital creation, marking a significant step forward in the field of digital forensics and content verification.
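The patch-based feature extraction described in the abstract can be sketched as follows. This is a minimal NumPy illustration assuming standard ViT-Base hyperparameters (16×16 patches, 768-dimensional embeddings, a prepended class token); the authors' exact configuration is not given in the record, so these values are assumptions:

```python
import numpy as np

def patchify(image, patch_size=16):
    """Split an (H, W, C) image into flattened (N, P*P*C) patch vectors."""
    h, w, c = image.shape
    patches = image.reshape(h // patch_size, patch_size,
                            w // patch_size, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # group by patch grid position
    return patches.reshape(-1, patch_size * patch_size * c)

rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))             # one RGB input image
patches = patchify(image)                      # (196, 768): a 14x14 patch grid
W = rng.normal(size=(768, 768)) * 0.02         # patch-embedding projection (learned in practice)
cls_token = np.zeros((1, 768))                 # learnable [CLS] token, zero-initialized here
tokens = np.vstack([cls_token, patches @ W])   # (197, 768) token sequence

# The (197, 768) sequence, plus positional embeddings, is what the
# transformer encoder's self-attention layers operate on; the final
# [CLS] representation feeds the classification head (AI vs. human).
print(tokens.shape)
```

The key design point conveyed here is that self-attention treats every patch as a token that can attend to every other patch, which is how ViT captures the global contextual relationships the abstract credits for its performance.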
Pages: 69459-69477
Page count: 19