Radiographs and texts fusion learning based deep networks for skeletal bone age assessment

Cited by: 4
Authors
Hao, Pengyi [1]
Ye, Taotao [1]
Xie, Xuhang [1]
Wu, Fuli [1]
Ding, Weilong [1]
Zuo, Wuheng [2]
Chen, Wei [3, 4]
Wu, Jian [5, 6]
Luo, Xiaonan [7]
Affiliations
[1] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China
[2] Zhejiang Univ Technol, Coll Educ Sci & Technol, Hangzhou, Zhejiang, Peoples R China
[3] Zhejiang Univ, Affiliated Hosp 1, Hangzhou, Peoples R China
[4] Zhejiang Univ, State Key Lab CAD & CG, Hangzhou, Peoples R China
[5] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China
[6] Zhejiang Univ, Real Doctor Res Ctr, Hangzhou, Zhejiang, Peoples R China
[7] Guilin Univ Elect Technol, Inst Artificial Intelligence, Guilin, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Bone age assessment; Convolutional neural network; Attention mechanism; Spatial pyramid pooling; Fusion learning;
DOI
10.1007/s11042-020-08943-1
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Bone age assessment is a pediatric examination that determines the difference between skeletal age and chronological age. A discrepancy between the two ages often signals a likelihood of genetic disorders, hormonal complications, or abnormalities in the maturation of the skeletal system. Although several automated bone age assessment methods that analyze radiographs have recently been studied, the text data available in radiological reports have not been used. Texts and radiographs are two different modalities, and fusing them provides more information for bone age assessment. In this paper, we present a novel multi-modal data fusion-learning network, called RT-FuseNet, for bone age assessment using radiographs and texts. Specifically, we develop a convolutional neural network with a spatial pyramid pooling layer and an attention mechanism module, which respectively preserve the integrity of the image spatial information and enhance the subtle differences in features among radiographs. In addition, texts are incorporated into the learning model to jointly learn non-linear correlations between the heterogeneous data. To evaluate the proposed approach, two datasets are used and several neural network structures are compared. Experimental results show that the proposed approach performs well.
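As a rough illustration of two ideas named in the abstract, the sketch below shows how spatial pyramid pooling can turn feature maps of different sizes into a fixed-length vector, and how that vector might be joined with text-derived features. It is not the RT-FuseNet implementation: the function names, the pyramid levels (1, 2, 4), max pooling, and plain concatenation as the fusion step are all assumptions made for illustration, and the attention module is omitted.

```python
import math


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map into n x n bins for each level n.

    The output length is sum(n * n for n in levels), independent of the
    input height and width. The levels used here are an assumption.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceil for the end, so the bins
            # together cover every row even when height % n != 0.
            r0 = (i * height) // n
            r1 = max(r0 + 1, math.ceil((i + 1) * height / n))
            for j in range(n):
                c0 = (j * width) // n
                c1 = max(c0 + 1, math.ceil((j + 1) * width / n))
                pooled.append(
                    max(
                        feature_map[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1)
                    )
                )
    return pooled


def fuse_features(image_vector, text_vector):
    """Hypothetical fusion step: concatenate image and text features.

    A learned fusion would pass this joint vector through further layers;
    plain concatenation is only a stand-in here.
    """
    return list(image_vector) + list(text_vector)


# Two feature maps of different sizes yield vectors of the same length.
small_map = [[float(r * 5 + c) for c in range(5)] for r in range(4)]
large_map = [[float(r * 9 + c) for c in range(9)] for r in range(7)]
small_vec = spatial_pyramid_pool(small_map)
large_vec = spatial_pyramid_pool(large_map)

# Hypothetical two-element text encoding (for example, a one-hot field
# extracted from a report); the real text features are not specified here.
fused = fuse_features(small_vec, [1.0, 0.0])
```

Because the pooled length depends only on the pyramid levels, radiographs of different resolutions can feed the same downstream layers without cropping or warping, which is the property the abstract describes as keeping the image spatial information intact.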
Pages: 16347-16366
Page count: 20