A class of neural-network-based transducers for web information extraction

Cited by: 0
Authors
Sleiman, Hassan A. [1]
Corchuelo, Rafael [1]
Affiliations
[1] University of Sevilla, ETSI Informática, 41012 Sevilla, Spain
Keywords
Information retrieval; Learning algorithms; Learning systems
DOI
Not available
Abstract
The Web is a huge and still growing information repository that has attracted the attention of many companies. Many such companies rely on information extractors to integrate information that is buried in semi-structured web documents into automatic business processes. Many information extractors build on extraction rules, which can be handcrafted or learned using supervised or unsupervised techniques. The literature provides a variety of techniques to learn information extraction rules that build on ad hoc machine-learning techniques. In this paper, we propose a hybrid approach that explores the use of standard machine-learning techniques to extract web information. We have specifically explored using neural networks; our results show that our proposal outperforms three state-of-the-art techniques in the literature, which opens up quite a new approach to information extraction. © 2013 Elsevier B.V.
Pages: 61-68