Natural Language Generation Model for Mammography Reports Simulation

Cited by: 15
Authors
Hoogi, Assaf [1]
Mishra, Arjun [2]
Gimenez, Francisco [1]
Dong, Jeffrey [1]
Rubin, Daniel [1]
Affiliations
[1] Stanford Univ, Biomed Data Sci, Stanford, CA 94305 USA
[2] Berkeley City Coll, Comp Sci, Berkeley, CA 94704 USA
Keywords
Natural language generation; mammography reports; RNN-LSTM; simulation
DOI
10.1109/JBHI.2020.2980118
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Enlarging labeled corpora of medical reports is a major step toward successfully training machine learning algorithms. Simulating new text reports is a key approach to report augmentation, which increases the cohort size. However, text generation in the medical domain is challenging because it must preserve both the content and the style typical of real reports without putting patient privacy at risk. In this paper, we present a conditioned RNN-LSTM architecture for simulating realistic mammography reports. We evaluated performance by analyzing the characteristics of the simulated reports and by classifying them into benign and malignant classes. An average classification AUC was calculated over two distinct test sets. A qualitative analysis was also performed, in which a masked (blinded) radiologist classified 75% of the simulated reports as real, indicating that both the style and the content of the simulated reports were similar to those of real reports. Finally, we compared our RNN-LSTM generative model with Markov random fields (MRFs). The RNN-LSTM provided significantly better and more stable performance than MRFs (p < 0.01, Wilcoxon).
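The abstract reports an average classification AUC over two test sets but does not say how it was computed. As a minimal sketch only, the Python below computes AUC as the fraction of (malignant, benign) report pairs in which the malignant report receives the higher score, then averages over two sets. The function name `auc`, the variables `test_set_a` and `test_set_b`, and all labels and scores are hypothetical illustrations, not data or code from the paper.

```python
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for malignant, 0 for benign (assumed encoding).
    scores: classifier scores, higher meaning more likely malignant.
    Ties between a malignant and a benign score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one report of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy (labels, scores) pairs standing in for two distinct test sets.
# These numbers are made up for illustration only.
test_set_a = ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
test_set_b = ([1, 0, 1, 0], [0.8, 0.3, 0.7, 0.2])

# Average the per-set AUCs, mirroring the averaging described in the abstract.
average_auc = mean(auc(y, s) for y, s in (test_set_a, test_set_b))
print(f"average AUC over the two toy sets: {average_auc:.3f}")
```

The pairwise form is quadratic in the number of reports, which is fine for a small illustration; a rank-based implementation would be the usual choice for larger test sets.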
Pages: 2711-2717
Page count: 7