Natural Language Generation Model for Mammography Reports Simulation

Cited by: 15
Authors
Hoogi, Assaf [1]
Mishra, Arjun [2]
Gimenez, Francisco [1]
Dong, Jeffrey [1]
Rubin, Daniel [1]
Affiliations
[1] Stanford Univ, Biomed Data Sci, Stanford, CA 94305 USA
[2] Berkeley City Coll, Comp Sci, Berkeley, CA 94704 USA
Keywords
Natural language generation; mammography reports; RNN-LSTM; simulation
DOI
10.1109/JBHI.2020.2980118
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Extending the size of labeled corpora of medical reports is a major step towards successfully training machine learning algorithms. Simulating new text reports is a key solution for report augmentation, which extends the cohort size. However, text generation in the medical domain is challenging because it needs to preserve both the content and the style that are typical of real reports, without risking patients' privacy. In this paper, we present a conditioned LSTM-RNN architecture for simulating realistic mammography reports. We evaluated the performance by analyzing the characteristics of the simulated reports and classifying them into benign and malignant classes. An average classification AUC was calculated over two distinct test sets. A qualitative analysis was also performed in which a masked radiologist classified 0.75 of the simulated reports as real reports, showing that both the style and content of the simulated reports were similar to those of real reports. Finally, we compared our RNN-LSTM generative model with Markov Random Fields (MRFs). The RNN-LSTM provided significantly better and more stable performance than MRFs (p < 0.01, Wilcoxon).
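The abstract reports an average classification AUC over two test sets without spelling out the computation. As a rough illustration only, the sketch below computes a rank-based AUC and averages it over two sets. The function name `auc`, the label convention (1 = malignant, 0 = benign), and the toy scores are all assumptions made for this example; they are not the paper's data, classifier, or code.

```python
def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    (label 1) scores higher than a randomly chosen negative (label 0).
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the paper): (labels, classifier scores).
test_set_a = ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
test_set_b = ([1, 0, 1, 0], [0.8, 0.3, 0.7, 0.2])

# Compute the AUC on each test set, then take the unweighted mean.
aucs = [auc(*test_set_a), auc(*test_set_b)]
mean_auc = sum(aucs) / len(aucs)
print(aucs, mean_auc)
```

The pairwise loop is quadratic in the number of reports, which is acceptable for a small illustration; a library routine would be the usual choice for larger test sets.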
Pages: 2711-2717
Page count: 7