Automated synthesis of biodiversity knowledge requires better tools and standardised research output

Cited by: 3
Authors
Cornford, Richard [1, 2, 3]
Millard, Joseph [4, 5]
Gonzalez-Suarez, Manuela [6]
Freeman, Robin [2 ]
Johnson, Thomas Frederick [7 ]
Affiliations
[1] Imperial Coll London, Dept Life Sci, London, England
[2] Zool Soc London, Inst Zool, London, England
[3] Nat Hist Museum, Dept Life Sci, London, England
[4] UCL, Dept Genet Evolut & Environm, London, England
[5] Univ Oxford, Leverhulme Ctr Demog Sci, Oxford, England
[6] Univ Reading, Sch Biol Sci, Reading, Berks, England
[7] Univ Sheffield, Dept Anim & Plant Sci, Sheffield, S Yorkshire, England
Keywords
data extraction; ecology; literature synthesis; machine learning; population trends; text mining
DOI
10.1111/ecog.06068
Chinese Library Classification (CLC) number
X176 [Biodiversity conservation]
Discipline classification code
090705
Abstract
As the impact of anthropogenic activity on the environment has grown, research into biodiversity change and associated threats has also accelerated. Synthesising this vast literature is important for understanding the drivers of biodiversity change and identifying the actions that will mitigate further ecological losses. However, keeping pace with an ever-increasing publication rate presents a substantial challenge to efficient syntheses, an issue that could be partly addressed by increasing levels of automation in the synthesis pipeline. Here, we evaluate the potential for automated tools to extract ecologically important information from the abstracts of articles compiled in the Living Planet Database. Specifically, we focused on extracting key information on taxonomy (studied species names), geographic location and estimated population trend. We assessed the accuracy of automated versus manual information extraction, the potential for automated tools to introduce biases into syntheses, and whether abstracts alone capture the key information from the full article. Taxonomic and geographic extraction tools performed reasonably well, although information on studied species was sometimes limited in the abstract (compared to the main text), preventing fast extraction. In contrast, extraction of trends was less successful, highlighting the challenges involved in automating information extraction from abstracts, such as deficiencies in the algorithms, linguistic complexity associated with ecological findings, and limited information when compared to the main text. In light of these results, we cautiously advocate for a wider use of automated taxonomic and geographic parsing tools for ecological synthesis. Additionally, to further the use of automated synthesis within ecology, we recommend a dual approach: development of improved computational tools to reduce biases, and enhanced protocols for abstracts (and associated metadata) to ensure key information is included in a format that facilitates machine-readability.
Pages: 9