Natural Language Generation Model for Mammography Reports Simulation

Cited by: 14
Authors
Hoogi, Assaf [1]
Mishra, Arjun [2]
Gimenez, Francisco [1]
Dong, Jeffrey [1]
Rubin, Daniel [1]
Affiliations
[1] Stanford Univ, Biomed Data Sci, Stanford, CA 94305 USA
[2] Berkeley City Coll, Comp Sci, Berkeley, CA 94704 USA
Keywords
Natural language generation; mammography reports; RNN-LSTM; simulation
DOI
10.1109/JBHI.2020.2980118
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Extending the size of labeled corpora of medical reports is a major step towards successfully training machine learning algorithms. Simulating new text reports is a key approach to report augmentation, which extends the cohort size. However, text generation in the medical domain is challenging because it needs to preserve both the content and the style typical of real reports without risking patients' privacy. In this paper, we present a conditioned LSTM-RNN architecture for simulating realistic mammography reports. We evaluated performance by analyzing the characteristics of the simulated reports and classifying them into benign and malignant classes. An average classification AUC was calculated over two distinct test sets. A qualitative analysis was also performed, in which a masked radiologist classified 0.75 of the simulated reports as real reports, showing that both the style and content of the simulated reports were similar to those of real reports. Finally, we compared our RNN-LSTM generative model with Markov Random Fields (MRFs). The RNN-LSTM provided significantly better and more stable performance than MRFs (p < 0.01, Wilcoxon).
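The abstract reports an average classification AUC over two test sets. As a rough illustration only, the sketch below computes AUC as the fraction of (positive, negative) pairs that a classifier ranks correctly, then averages it over two sets. The function names `roc_auc` and `mean_auc` and all scores are hypothetical toy values, not the authors' code or data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


def mean_auc(test_sets):
    """Average the AUC over several (labels, scores) test sets."""
    return sum(roc_auc(y, s) for y, s in test_sets) / len(test_sets)


# Toy data (assumption): 1 = malignant, 0 = benign; scores are made up.
set_a = ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
set_b = ([1, 0, 1, 0], [0.8, 0.3, 0.7, 0.2])

if __name__ == "__main__":
    print(roc_auc(*set_a))
    print(mean_auc([set_a, set_b]))
```

The pairwise form is quadratic in the number of reports, which is acceptable for small test sets; a rank-based computation would scale better.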
Pages: 2711-2717
Number of pages: 7
Related papers
50 in total
  • [41] Natural language generation of biomedical argumentation for lay audiences
    Green, Nancy
    Dwight, Rachael
    Navoraphan, Kanyamas
    Stadler, Brian
    ARGUMENT & COMPUTATION, 2011, 2 (01) : 23 - 50
  • [42] Natural Language Generation Using Sequential Models: A Survey
    Abhishek Kumar Pandey
    Sanjiban Sekhar Roy
    Neural Processing Letters, 2023, 55 : 7709 - 7742
  • [43] Piloting Natural Language Generation for Personalized Progress Feedback
    Leppanen, Leo
    Hellas, Arto
    Leinonen, Juho
    2022 IEEE FRONTIERS IN EDUCATION CONFERENCE, FIE, 2022,
  • [44] Predictability and Causality in Spanish and English Natural Language Generation
    Busto-Castineira, Andrea
    Javier Gonzalez-Castano, Francisco
    Garcia-Mendez, Silvia
    de Arriba-Perez, Francisco
    IEEE ACCESS, 2024, 12 : 132521 - 132532
  • [45] Characterizing Mammography Reports for Health Analytics
    Rojas, Carlos C.
    Patton, Robert M.
    Beckerman, Barbara G.
    JOURNAL OF MEDICAL SYSTEMS, 2011, 35 (05) : 1197 - 1210
  • [46] Characterizing Mammography Reports for Health Analytics
    Carlos C. Rojas
    Robert M. Patton
    Barbara G. Beckerman
    Journal of Medical Systems, 2011, 35 : 1197 - 1210
  • [47] A scoping review of natural language processing of radiology reports in breast cancer
    Saha, Ashirbani
    Burns, Levi
    Kulkarni, Ameya Madhav
    FRONTIERS IN ONCOLOGY, 2023, 13
  • [48] A Study on Flexibility in Natural Language Generation Through a Statistical Approach to Story Generation
    Vicente, Marta
    Barros, Cristina
    Lloret, Elena
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, NLDB 2017, 2017, 10260 : 492 - 498
  • [49] Understanding English as a Foreign Language Students' Idea Generation Strategies for Creative Writing With Natural Language Generation Tools
    Woo, David James
    Wang, Yanzhi
    Susanto, Hengky
    Guo, Kai
    JOURNAL OF EDUCATIONAL COMPUTING RESEARCH, 2023, 61 (07) : 1464 - 1482
  • [50] Enhanced model for abstractive Arabic text summarization using natural language generation and named entity recognition
    Nada Essa
    M. M. El-Gayar
    Eman M. El-Daydamony
    Neural Computing and Applications, 2025, 37 (10) : 7279 - 7301