Natural Language Generation Model for Mammography Reports Simulation

Cited by: 15
Authors
Hoogi, Assaf [1]
Mishra, Arjun [2]
Gimenez, Francisco [1]
Dong, Jeffrey [1]
Rubin, Daniel [1]
Affiliations
[1] Stanford Univ, Biomed Data Sci, Stanford, CA 94305 USA
[2] Berkeley City Coll, Comp Sci, Berkeley, CA 94704 USA
Keywords
Natural language generation; mammography reports; RNN-LSTM; simulation
DOI
10.1109/JBHI.2020.2980118
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Extending the size of labeled corpora of medical reports is a major step towards successful training of machine learning algorithms. Simulating new text reports is a key solution for report augmentation, which extends the cohort size. However, text generation in the medical domain is challenging because it needs to preserve both the content and the style that are typical of real reports, without risking the patients' privacy. In this paper, we present a conditioned LSTM-RNN architecture for simulating realistic mammography reports. We evaluated the performance by analyzing the characteristics of the simulated reports and classifying them into benign and malignant classes. An average classification AUC was calculated over two distinct test sets. A qualitative analysis was also performed in which a masked radiologist classified 0.75 of the simulated reports as real reports, showing that both the style and content of the simulated reports were similar to those of real reports. Finally, we compared our RNN-LSTM generative model with Markov Random Fields (MRFs). The RNN-LSTM provided significantly better and more stable performance than MRFs (p < 0.01, Wilcoxon).
Pages: 2711 - 2717
Number of pages: 7
Related papers
50 in total
  • [1] SemScribe: Natural Language Generation for Medical Reports
    Varges, Sebastian
    Bieler, Heike
    Stede, Manfred
    Faulstich, Lukas C.
    Irsig, Kristin
    Atalla, Malik
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012: 2674 - 2681
  • [2] An Ontology Model for Nursing Narratives with Natural Language Generation Technology
    Min, Yul Ha
    Park, Hyeoun-Ae
    Jeon, Eunjoo
    Lee, Joo Yun
    Jo, Soo Jung
    MEDINFO 2013: PROCEEDINGS OF THE 14TH WORLD CONGRESS ON MEDICAL AND HEALTH INFORMATICS, PTS 1 AND 2, 2013, 192 : 962 - 962
  • [3] Aggregation in natural language generation
    Dalianis, H
    COMPUTATIONAL INTELLIGENCE, 1999, 15 (04) : 384 - 414
  • [4] A Survey of Natural Language Generation
    Dong, Chenhe
    Li, Yinghui
    Gong, Haifan
    Chen, Miaoxin
    Li, Junxin
    Shen, Ying
    Yang, Min
    ACM COMPUTING SURVEYS, 2023, 55 (08)
  • [5] Natural Language Generation of Museum Object Descriptions based on User Model
    Chen, Hsiao Wei
    Lim, Mary Grace
    Perez, Patricia Bea
    Reyes, Joanna Patricia
    Lim, Nathalie Rose
    PACLIC 22: PROCEEDINGS OF THE 22ND PACIFIC ASIA CONFERENCE ON LANGUAGE, INFORMATION AND COMPUTATION, 2008: 141 - 150
  • [6] Automatic detection of contextual laterality in Mammography Reports using Large Language Models
    Godoy, Eduardo
    de Ferrari, Joaquin
    Mellado, Diego
    Chabert, Steren
    Salas, Rodrigo
    2024 14TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION SYSTEMS, ICPRS, 2024
  • [7] Sentence Compression with Natural Language Generation
    Li, Peng
    Wang, Yinglin
    KNOWLEDGE ENGINEERING AND MANAGEMENT, 2011, 123 : 357 - 363
  • [8] Natural language generation of surgical procedures
    Wagner, JC
    Rogers, JE
    Baud, RH
    Scherrer, JR
    MEDINFO '98 - 9TH WORLD CONGRESS ON MEDICAL INFORMATICS, PTS 1 AND 2, 1998, 52 : 591 - 595
  • [9] Natural Language Generation from Ontologies
    Nguyen, Van
    Son, Tran Cao
    Pontelli, Enrico
    PRACTICAL ASPECTS OF DECLARATIVE LANGUAGES (PADL 2019), 2019, 11372 : 64 - 81
  • [10] Natural Language Generation and Semantic Technologies
    Staykova, Kamenka
    CYBERNETICS AND INFORMATION TECHNOLOGIES, 2014, 14 (02) : 3 - 23