Evaluation of selection criteria for noun phrases with relevance for information retrieval

Cited by: 2
Authors
do Nascimento, Gustavo Diniz [1]
Correa, Renato Fernandes [2]
Affiliations
[1] Univ Fed Campina Grande, Biblioteca Cent, Campina Grande, PB, Brazil
[2] Univ Fed Pernambuco, Ctr Artes & Comunicacao, Dept Ciencia Informacao, Av Arquitetura S-N,Campus Univ, BR-50740550 Recife, PE, Brazil
Source
TRANSINFORMACAO | 2018 / Vol. 30 / Issue 02
Keywords
Automatic indexing; Legal information; Information representation; Noun phrase selection; Noun phrases; DISSERTATIONS;
DOI
10.1590/2318-08892018000200004
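Chinese Library Classification number
G25 [Library science, librarianship]; G35 [Information science, information work];
Discipline classification code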
1205 ; 120501 ;
Abstract
This study assesses criteria for selecting the most representative noun phrases from documents written in Portuguese in the field of law. The research methods were a literature review and an experiment. In the experiment, ten selection criteria were applied to noun phrases extracted from a set of abstracts of theses and dissertations. The effectiveness of each criterion was assessed by how well it selected noun phrases relevant for information retrieval. The most effective criteria identified were the removal of noun phrases that have stopword value or contain pronouns, and the selection of noun phrases based on position of occurrence, level of the noun phrase, inverse document frequency, and document occurrence frequency.
Pages: 179-192
Number of pages: 14