A multimodal deep learning architecture for smoking detection with a small data approach

Cited by: 1
Authors
Lakatos, Robert [1 ,2 ,3 ]
Pollner, Peter [4 ]
Hajdu, Andras [2 ]
Joo, Tamas [3 ,4 ]
Affiliations
[1] Univ Debrecen, Doctoral Sch Informat, Debrecen, Hungary
[2] Univ Debrecen, Fac Informat, Dept Data Sci & Visualizat, Debrecen, Hungary
[3] Neumann Nonprofit Ltd, Neumann Technol Platform, Budapest, Hungary
[4] Semmelweis Univ, Hlth Serv Management Training Ctr, Data Driven Hlth Div, Natl Lab Hlth Secur, Budapest, Hungary
Source
FRONTIERS IN ARTIFICIAL INTELLIGENCE | 2024 / Volume 7
Keywords
AI supported preventive healthcare; pre-training with generative AI; multimodal deep learning; automated assessment of covert advertisement; few-shot learning; smoking detections;
DOI
10.3389/frai.2024.1326050
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Covert tobacco advertisements often prompt regulatory measures. This paper shows that artificial intelligence, particularly deep learning, has great potential for detecting hidden advertising and allows unbiased, reproducible, and fair quantification of tobacco-related media content. We propose an integrated text and image processing model based on deep learning, generative methods, and human reinforcement, which can detect smoking cases in both textual and visual formats, even with little available training data. Our model can achieve 74% accuracy for images and 98% for text. Furthermore, our system integrates the possibility of expert intervention in the form of human reinforcement. Using the pre-trained multimodal, image, and text processing models available through deep learning makes it possible to detect smoking in different media even with little training data.
Pages: 8