Machine learning-based donor permission extraction from informed consent documents

Cited by: 0
Authors
Zhang, Meng [1]
Sankaranarayanapillai, Madhuri [1]
Du, Jingcheng [1]
Xiang, Yang [1]
Manion, Frank J. [2]
Harris, Marcelline R. [2]
Stansbury, Cooper [2]
Pham, Huy Anh [1]
Tao, Cui [1,3]
Affiliations
[1] Univ Texas Hlth Sci Ctr Houston, McWilliam Sch Biomed Informat, Houston, TX 77030 USA
[2] Univ Michigan, Sch Nursing, Ann Arbor, MI USA
[3] Mayo Clin, Dept Artificial Intelligence & Informat, Jacksonville, FL 32224 USA
Funding
National Institutes of Health (USA);
Keywords
Informed consent; Machine learning; Natural language processing; Text classification;
DOI
10.1186/s12859-023-05568-7
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: As more clinical trials offer optional participation in the collection of bio-specimens for biobanking, the requirements of informed consent forms are becoming increasingly complex. The aim of this study is to develop an automatic natural language processing (NLP) tool to annotate informed consent documents to promote biorepository data regulation, sharing, and decision support. We collected informed consent documents from several publicly available sources and manually annotated them, marking sentences that contain permission information about sharing bio-specimens or donor data, or about conducting genetic research or future research using bio-specimens or donor data.
Results: We evaluated a variety of machine learning algorithms, including random forest (RF) and support vector machine (SVM), for the automatic identification of these sentences. In total, 120 informed consent documents containing 29,204 sentences were annotated, of which 1,250 sentences (4.28%) provide answers to a permission question. An SVM model achieved an F1 score of 0.95 on classifying the sentences when using a gold standard, which is a prefiltered corpus containing all relevant sentences.
Conclusions: This study demonstrates the feasibility of using machine learning tools to classify permission-related sentences in informed consent documents.
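As a reading aid for the figures in the abstract, the short sketch below recomputes the share of permission-related sentences from the reported counts and shows how an F1 score is derived from precision and recall. The helper name `f1_score` and the confusion counts passed to it are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Counts reported in the abstract.
total_sentences = 29_204
permission_sentences = 1_250
positive_rate = permission_sentences / total_sentences
print(f"{positive_rate:.2%}")  # share of sentences that answer a permission question

# Hypothetical confusion counts, for illustration only (not from the paper).
example_f1 = f1_score(tp=95, fp=5, fn=5)
print(round(example_f1, 2))
```

Because only about 4% of all annotated sentences are positives, a score such as F1, which ignores true negatives, is usually more informative than plain accuracy on this kind of data.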
Pages: 10