Extraction of mitigation-related text from Endangered Species Act documents using machine learning: a case study

Authors
Varghese A. [1]
Allen K. [2]
Agyeman-Badu G. [1]
Haire J. [2]
Madsen R. [3]
Affiliations
[1] ICF, 2635 Meridian Parkway, Durham, NC 27713
[2] ICF, 980 9th Street, Suite 1200, Sacramento, CA 95814
[3] Electric Power Research Institute, 3420 Hillview Avenue, Palo Alto, CA 94303
Keywords
Artificial intelligence; BERT; Endangered Species Act; Machine learning; Natural language processing; Text mining
DOI
10.1007/s10669-021-09830-2
Abstract
Various industrial and development projects have the potential to adversely affect threatened and endangered species and their habitats. The federal Endangered Species Act (ESA) requires preparation of a biological assessment or habitat conservation plan before federal agencies can authorize, through decision documents and permits, unintentional and otherwise prohibited “take” (i.e., harm) of listed species. These documents describe the potential effects of proposed projects on listed species and include measures to mitigate those effects. Collectively, these assessments, plans, decision documents, and permits (termed ESA documents in our study) are valuable for identifying approved mitigation options that could apply to future projects. However, owing to the volume, length, and complexity of these documents, manual review would be time- and labor-intensive. In this study, we apply three supervised machine learning algorithms, including two based on state-of-the-art transfer learning, to develop and evaluate predictive models capable of extracting mitigation-related text from ESA documents. The models were developed using a training dataset created as part of this study. The best-performing model achieved an estimated ROC-AUC of 0.98 and a precision-recall AUC of 0.86 during cross-validation, indicating strong potential for extracting mitigation-related content from existing documents. To illustrate the utility of this technology, we present a simulated case study in which pretrained machine learning models capable of recognizing mitigation measures, coupled with a large historical corpus of ESA documents and keyword filters, provided a means to rapidly assess the mitigation measures commonly used for a given species. While this technology did not eliminate the need for biological expertise, it did allow for rapid scoping assessments and could serve as a supporting resource even for experienced biologists.
© 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
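The abstract reports two ranking metrics, ROC-AUC and precision-recall AUC. As a minimal sketch of how such scores can be computed from a classifier's outputs, the Python below uses only the standard library. The function names, the toy labels, and the toy scores are illustrative assumptions and are not taken from the study. Average precision is used here as one common summary of the precision-recall curve; the authors may have integrated the curve differently.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outranks a
    random negative (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision values at each rank where
    a positive is retrieved. Assumes no tied scores (illustrative only)."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank items from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos


# Hypothetical toy data: 1 = mitigation-related sentence, 0 = other text.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]

print(f"ROC-AUC: {roc_auc(toy_labels, toy_scores):.3f}")
print(f"Average precision: {average_precision(toy_labels, toy_scores):.3f}")
```

When positives are rare, as mitigation-related passages in long documents plausibly are, the precision-recall summary tends to be the more demanding of the two, which is consistent with the abstract reporting a lower value for it than for ROC-AUC.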
Pages: 63-74
Page count: 11