Multi-word term indexing for Arabic document retrieval

Cited by: 0
Authors
Boulaknadel, Siham [1 ]
Daille, Beatrice [1 ]
Driss, Aboutajdine [2 ]
Affiliations
[1] Univ Nantes, CNRS, FRE 2729, LINA, 2 Rue Houssinière, BP 92208, F-44322 Nantes 03, France
[2] Mohammed V Univ, GSCM, Rabat, Morocco
Source
2008 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS, VOLS 1-3 | 2008
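Keywords
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract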
To improve the performance of information retrieval systems, it seems important to identify key phrases, which represent the semantic content of a text better than single-word terms. In this paper, we adapt the standard method of multi-word term extraction to the Arabic language. We define the linguistic specifications and develop a term extraction tool. We apply the term extraction program to document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions, based on either the corpus or the document, and demonstrate the effectiveness of multi-word term indexing for both weighting functions, with gains of up to 5.8% in average precision.
Pages: 480 / +
Page count: 3
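The abstract contrasts weighting multi-word terms by the document with weighting them by the corpus, but this record does not give the actual functions. The sketch below is only an illustration of that distinction under stated assumptions: the stopword filter stands in for the paper's linguistic specifications, relative frequency stands in for the document-based weight, and inverse document frequency stands in for the corpus-based weight. All function names, the stopword list, and the English placeholder tokens are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical stopword list. The paper uses linguistic specifications for
# Arabic instead; this filter is only a stand-in for illustration.
STOPWORDS = {"of", "the", "in", "a", "and"}


def candidate_terms(tokens):
    """Return adjacent word pairs that contain no stopword."""
    return [
        (a, b)
        for a, b in zip(tokens, tokens[1:])
        if a not in STOPWORDS and b not in STOPWORDS
    ]


def document_weights(tokens):
    """Assumed document-based weight: relative frequency within one document."""
    counts = Counter(candidate_terms(tokens))
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()} if total else {}


def corpus_weights(docs):
    """Assumed corpus-based weight: inverse document frequency over all documents."""
    doc_freq = Counter()
    for tokens in docs:
        # Count each candidate once per document.
        doc_freq.update(set(candidate_terms(tokens)))
    n_docs = len(docs)
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Placeholder English tokens; a real run would use tokenized Arabic text.
docs = [
    ["water", "pollution", "in", "the", "river"],
    ["water", "pollution", "and", "air", "pollution"],
    ["air", "quality", "of", "the", "city"],
]

for term, weight in sorted(corpus_weights(docs).items()):
    print(term, round(weight, 3))
```

In this toy corpus a candidate that occurs in fewer documents receives a higher corpus-based weight, while the document-based weight depends only on the counts inside a single document.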