REDBot: Natural language process methods for clinical copy number variation reporting in prenatal and products of conception diagnosis

Authors
Liu, Mengmeng [1]
Zhong, Yunshan [1]
Liu, Hongqian [2]
Liang, Desheng [3, 4]
Liu, Erhong [1]
Zhang, Yu [1]
Tian, Feng [1]
Liang, Qiaowei [4]
Cram, David S. [1]
Wang, Hua [5]
Wu, Lingqian [3]
Yu, Fuli [6]
Affiliations
[1] Berry Genomics Corp, Beijing, Peoples R China
[2] Sichuan Univ, West China Second Univ Hosp, Dept Obstet & Gynecol, Chengdu, Peoples R China
[3] Cent South Univ, Sch Life Sci, Ctr Med Genet, Changsha, Peoples R China
[4] Hunan Jiahui Genet Hosp, Changsha, Peoples R China
[5] Hunan Prov Maternal & Child Hlth Care Hosp, Changsha, Peoples R China
[6] Baylor Coll Med, Dept Mol & Human Genet, Human Genome Sequencing Ctr, Houston, TX 77030 USA
Source
MOLECULAR GENETICS & GENOMIC MEDICINE | 2020 / Vol. 8 / Issue 11
Keywords
active learning; copy number variation; natural language process; prenatal diagnosis; variant classification; JOINT CONSENSUS RECOMMENDATION; MEDICAL GENETICS; AMERICAN-COLLEGE; STRUCTURAL VARIATION; VARIANTS; STANDARDS; FETUSES; GENOME; GUIDELINES; RESOURCE
DOI
10.1002/mgg3.1488
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Background: Copy number variation (CNV) identification methods have matured rapidly. However, postdetection processes such as variant interpretation and reporting remain inefficient. To address this, we developed REDBot, an automated software package for accurate and direct generation of clinical diagnostic reports for prenatal and products of conception (POC) samples.
Methods: We applied natural language processing (NLP) methods to analyze 30,235 in-house historical clinical reports through active learning, and then developed clinical knowledge bases, evidence-based interpretation methods, and reporting criteria to support the whole postdetection pipeline.
Results: From the 30,235 reports, we obtained 37,175 CNV-paragraph pairs. For these pairs, the active learning approaches achieved an average F1-score of 0.9466 in sentence classification. The overall accuracy for variant classification was 95.7%, 95.2%, and 100.0% in retrospective, prospective, and clinical utility experiments, respectively.
Conclusion: By integrating NLP methods into the CNV postdetection pipeline, REDBot is a robust and rapid tool with clinical utility for prenatal and POC diagnosis.
Pages: 11