Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques

Cited by: 4
Authors
Qinjun Qiu
Zhong Xie
Liang Wu
Liufeng Tao
Affiliations
[1] China University of Geosciences, School of Geography and Information Engineering
[2] National Engineering Research Center of Geographic Information System
Source
Earth Science Informatics | 2020 / Vol. 13
Keywords
Geoscience document; Knowledge graph; Geological text mining; Natural language processing
DOI
Not available
Abstract
A large number of georeferenced quantitative data about rock and geoscience surveys are buried in geological documents and remain unused. Data analytics and information extraction offer opportunities to use this data for improved understanding of ore forming processes and to enhance our knowledge. Extracting spatiotemporal and semantic information from a set of geological documents enables us to develop a rich representation of the geoscience knowledge recorded in unstructured text written in Chinese. This paper presents the workflow for spatiotemporal and semantic information extraction, which is a geological document analysis approach that uses automated techniques for browsing and searching relevant geological content. The developed workflow applies spatial and temporal gazetteer matching, pattern-based rules and spatiotemporal relationship extraction to identify and label terms in geological text documents. It offers a representation of contextual information in knowledge graph form, extracts a set of relevant tables and figures, and queries a list of relevant documents by using geological topic information. Here, text mining techniques are used to facilitate the analysis of geological knowledge and to show the effectiveness of text analysis for improving the rapid assessment of a massive number of documents. Furthermore, autogenerated keyword suggestions derived from extracted keyword associations are used to reduce document search efforts. This research illustrates the usefulness and effectiveness of the developed information extraction workflow and demonstrates the potential of incorporating text mining and NLP techniques for geoscience.
Pages: 1393 - 1410
Number of pages: 17