A general framework for subjective information extraction from unstructured English text

Cited by: 10
Authors
Mangassarian, Hratch [1 ]
Artail, Hassan [1 ]
Affiliations
[1] Amer Univ Beirut, Dept Elect & Comp Engn, Beirut, Lebanon
Keywords
information extraction; natural language processing; text evaluation; intelligent systems; financial analysis
DOI
10.1016/j.datak.2006.10.001
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present an information extraction (IE) strategy for handling subjective information from unstructured text. The presented methodology is general: it can be useful in many real-life applications that could potentially benefit from an automatic IE system that makes human-like decisions. We test our methodology in the sphere of company news evaluation with respect to the potential effect of the news on the company's stock prices. The described general framework comprises four sequential processing steps: part-of-speech tagging, syntactic parsing, relation generation, and criteria evaluation. The first two steps perform generic NLP tasks, while the last two phases are application-specific and require a thorough understanding of the application domain. We describe each stage and illustrate the flow of the modus operandi. We keep up with the company news evaluation example throughout the paper. Due to the inherent subjectivity of the envisaged problem, results cannot be categorically justified. However, comparing the system's evaluation of company news to our own, the results were very encouraging. (c) 2006 Elsevier B.V. All rights reserved.
Pages: 352-367
Page count: 16
Related papers
50 in total
  • [41] A System for Adaptive Information Extraction from Highly Informal Text
    Alonso i Alemany, Laura
    Carrascosa, Rafael
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, 2011, 6716 : 145 - 152
  • [42] Information Retrieval for Unstructured Text Documents in Serbian into the Crime Domain
    Nikolic, Vojkan
    Markoski, Branko
    Ivkovic, Miodrag
    Kuk, Kristijan
    Djikanovic, Predrag
    2015 16TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND INFORMATICS (CINTI), 2015, : 267 - 271
  • [43] Text Mining from Free Unstructured Text: An Experiment of Time Series Retrieval for Volcano Monitoring
    Berardi, Margherita
    Santamaria Amato, Luigi
    Cigna, Francesca
    Tapete, Deodato
    Siciliani de Cumis, Mario
    APPLIED SCIENCES-BASEL, 2022, 12 (07):
  • [44] Developing a mountaineering plan sharing system based on information extraction from unstructured documents
    Nohara, Akihiro
    Shiramatsu, Shun
    Ozono, Tadachika
    Shintani, Toramatsu
    IEEJ Transactions on Electronics, Information and Systems, 2015, 135 (12) : 1470 - 1480
  • [45] A Comprehensive Evaluation of a Novel Approach to Probabilistic Information Extraction from Large Unstructured Datasets
    Trovati, Marcello
    2015 International Conference on Intelligent Networking and Collaborative Systems IEEE INCoS 2015, 2015, : 459 - 462
  • [46] Unsupervised information extraction from unstructured, ungrammatical data sources on the World Wide Web
    Michelson, Matthew
    Knoblock, Craig A.
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2007, 10 (3-4) : 211 - 226
  • [47] Hybrid System for Information Extraction from Social Media Text: Drug Abuse Case Study
    Jenhani, Ferdaous
    Gouider, Mohamed Salah
    Ben Said, Lamjed
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KES 2019), 2019, 159 : 688 - 697
  • [48] Unsupervised information extraction from unstructured, ungrammatical data sources on the World Wide Web
    Matthew Michelson
    Craig A. Knoblock
    International Journal of Document Analysis and Recognition (IJDAR), 2007, 10 : 211 - 226
  • [49] CustRE: a rule based system for family relations extraction from english text
    Mumtaz, Raabia
    Qadir, Muhammad Abdul
    KNOWLEDGE AND INFORMATION SYSTEMS, 2022, 64 (07) : 1817 - 1844
  • [50] CustRE: a rule based system for family relations extraction from english text
    Raabia Mumtaz
    Muhammad Abdul Qadir
    Knowledge and Information Systems, 2022, 64 : 1817 - 1844