SEED: A Framework for Extracting Social Events from Press News

Cited by: 0
Authors
Orlando, Salvatore [1]
Pizzolon, Francesco [1]
Tolomei, Gabriele [1]
Affiliations
[1] Univ Ca Foscari, DAIS, Venice, Italy
Source
PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'13 COMPANION) | 2013
Keywords
Information extraction; Named-entity recognition; Relation extraction; Social event discovery; RULES;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Every day, people exchange a huge amount of data through the Internet. Most of this data consists of unstructured text, which often contains references to structured information (e.g., person names, contact records, etc.). In this work, we propose a novel solution to discover social events from actual press news edited by humans. Concretely, our method is divided into two steps, each addressing a specific Information Extraction (IE) task. First, we use a technique to automatically recognize four classes of named entities in press news: DATE, LOCATION, PLACE, and ARTIST. Second, we detect social events by extracting ternary relations between such entities, also exploiting evidence from external sources (i.e., the Web). Finally, we evaluate both stages of our proposed solution on a real-world dataset. Experimental results highlight the quality of our first-step Named-Entity Recognition (NER) approach, which performs consistently with state-of-the-art solutions. We then show how to precisely select true events from the list of all candidate events (i.e., all the ternary relations) produced by our second-step Relation Extraction (RE) method. In particular, we find that true social events can be detected if enough evidence of them is found in the result lists of Web search engines.
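The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which three entity classes form a ternary relation, so the (ARTIST, PLACE, DATE) choice, the function names, the `evidence` callback (a stand-in for counting supporting Web search results), and the `min_evidence` threshold are all assumptions made for illustration.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List, Tuple

Entity = Tuple[str, str]      # (surface text, entity class)
Triple = Tuple[str, str, str]


def candidate_events(
    entities: Iterable[Entity],
    labels: Tuple[str, str, str] = ("ARTIST", "PLACE", "DATE"),  # assumed classes
) -> List[Triple]:
    """Build every ternary combination of recognized entities (step 2 input)."""
    by_label: Dict[str, List[str]] = {label: [] for label in labels}
    for text, label in entities:
        # Ignore classes outside the triple and skip repeated mentions.
        if label in by_label and text not in by_label[label]:
            by_label[label].append(text)
    return list(product(*(by_label[label] for label in labels)))


def select_events(
    candidates: Iterable[Triple],
    evidence: Callable[[Triple], int],
    min_evidence: int = 1,  # hypothetical threshold
) -> List[Triple]:
    """Keep only candidates with enough external evidence."""
    return [c for c in candidates if evidence(c) >= min_evidence]


# Toy input: entities as a step-1 NER component might return them.
entities = [
    ("Muse", "ARTIST"),
    ("Arena", "PLACE"),
    ("12 May", "DATE"),
    ("13 May", "DATE"),
    ("Venice", "LOCATION"),
]

# Toy evidence table standing in for Web search result counts.
hits = {("Muse", "Arena", "12 May"): 3}

candidates = candidate_events(entities)
events = select_events(candidates, lambda t: hits.get(t, 0), min_evidence=2)
```

In this toy run the two dates yield two candidate triples, and only the one with enough supporting hits survives the filter. A real system would replace the `hits` table with queries to a search engine and tune the threshold on labeled data.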
Pages: 1285-1293
Number of pages: 9