Biomedical text mining for research rigor and integrity: tasks, challenges, directions

Cited by: 34
Author
Kilicoglu, Halil [1]
Affiliation
[1] US Natl Lib Med, Lister Hill Natl Ctr Biomed Commun, Bethesda, MD 20894 USA
Funding
US National Institutes of Health (NIH)
Keywords
biomedical research waste; biomedical text mining; natural language processing; research rigor; research integrity; reproducibility; AUTOMATIC RECOGNITION; PLAGIARISM; ARTICLES; CITATION; KNOWLEDGE; REPRODUCIBILITY; CLASSIFICATION; EXTRACTION; SENTENCES; MEDICINE;
DOI
10.1093/bib/bbx057
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010 ; 081704 ;
Abstract
An estimated quarter of a trillion US dollars is invested in the biomedical research enterprise annually. There is growing alarm that a significant portion of this investment is wasted because of problems in reproducibility of research findings and in the rigor and integrity of research conduct and reporting. Recent years have seen a flurry of activities focusing on standardization and guideline development to enhance the reproducibility and rigor of biomedical research. Research activity is primarily communicated via textual artifacts, ranging from grant applications to journal publications. These artifacts can be both the source and the manifestation of practices leading to research waste. For example, an article may describe a poorly designed experiment, or the authors may reach conclusions not supported by the evidence presented. In this article, we pose the question of whether biomedical text mining techniques can assist the stakeholders in the biomedical research enterprise in doing their part toward enhancing research integrity and rigor. In particular, we identify four key areas in which text mining techniques can make a significant contribution: plagiarism/fraud detection, ensuring adherence to reporting guidelines, managing information overload and accurate citation/enhanced bibliometrics. We review the existing methods and tools for specific tasks, if they exist, or discuss relevant research that can provide guidance for future work. With the exponential increase in biomedical research output and the ability of text mining approaches to perform automatic tasks at large scale, we propose that such approaches can support tools that promote responsible research practices, providing significant benefits for the biomedical research enterprise.
Pages: 1400-1414
Page count: 15