A multimodal deep learning model for detecting endoscopic images of near-infrared fluorescence capsules

Cited by: 0

Authors:

Wang, Junhao ^{[1,2]}

Zhou, Cheng ^{[1]}

Wang, Wei ^{[1]}

Zhang, Hanxiao ^{[2]}

Zhang, Amin ^{[4]}

Cui, Daxiang ^{[1,3]}

Affiliations:

[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China

[2] Shanghai Jiao Tong Univ, Inst & Med Robot, Shanghai 200240, Peoples R China

[3] Henan Univ, Med & Engn Cross Res Inst, Sch Med, Kaifeng 475004, Peoples R China

[4] Shanghai Jiao Tong Univ, Sch Agr & Biol, Dept Food Sci & Technol, Shanghai 200240, Peoples R China

Source:

BIOSENSORS & BIOELECTRONICS | 2025 / Vol. 278

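Funding:

National Natural Science Foundation of China (International Cooperation and Exchange Program); Natural Science Foundation of Shanghai; National Natural Science Foundation of China; China Postdoctoral Science Foundation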

Keywords:

Fluorescence endoscopy; Multimodal deep learning; Disease detection; Cancer

DOI:

10.1016/j.bios.2025.117251

Chinese Library Classification (CLC) number:

Q6 [Biophysics]

Discipline classification code:

071011

Abstract:

Early screening for gastrointestinal (GI) diseases is critical for preventing cancer development. With the rapid advancement of deep learning technology, artificial intelligence (AI) has become increasingly prominent in the early detection of GI diseases. Capsule endoscopy is a non-invasive medical imaging technique used to examine the gastrointestinal tract. In our previous work, we developed a near-infrared fluorescence capsule endoscope (NIRF-CE) capable of exciting and capturing near-infrared (NIR) fluorescence images to specifically identify subtle mucosal microlesions and submucosal abnormalities while simultaneously capturing conventional white-light images to detect lesions with significant morphological changes. However, limitations such as low camera resolution and poor lighting within the gastrointestinal tract may lead to misdiagnosis and other medical errors. Manually reviewing and interpreting large volumes of capsule endoscopy images is time-consuming and prone to errors. Deep learning models have shown potential in automatically detecting abnormalities in NIRF-CE images. This study focuses on an improved deep learning model called Retinex-Attention-YOLO (RAY), which is based on single-modality image data and built on the YOLO series of object detection models. RAY enhances the accuracy and efficiency of anomaly detection, especially under low-light conditions. To further improve detection performance, we also propose a multimodal deep learning model, Multimodal-Retinex-Attention-YOLO (MRAY), which combines both white-light and fluorescence image data. The dataset used in this study consists of images of pig stomachs captured by our NIRF-CE system, simulating the human GI tract. The images were acquired with a targeted fluorescent probe that accumulates at lesion sites and emits a fluorescent signal when abnormalities are present, so a bright spot in the image indicates a lesion. The MRAY model achieved an impressive precision of 96.3%, outperforming similar object detection models. To further validate the model's performance, ablation experiments were conducted, and comparisons were made with publicly available datasets. MRAY shows great promise for the automated detection of GI cancers, ulcers, inflammations, and other medical conditions in clinical practice.


Pages: 12

