Automatic terminological collocations extraction from large corpus

Cited by: 0
Authors
Suarez, Octavio Santana [1]
Aguiar, Jose Perez [1]
Berriel, Isabel Sanchez [2]
Rodriguez, Virginia Gutierrez [2]
Affiliations
[1] Univ Las Palmas Gran Canaria, Edificio Dept Informat & Matemat, Las Palmas Gran Canaria 35017, Spain
[2] Univ La Laguna, Edificio Fis & Matemat,Campus Univ Anchieta, San Cristobal la Laguna 38271, Spain
Source
PROCESAMIENTO DEL LENGUAJE NATURAL | 2011 / Issue 47
Keywords
automatic extraction of collocations; terminology; computational linguistics; text mining;
DOI
Not available
Chinese Library Classification number
H0 [Linguistics];
Discipline classification codes
030303 ; 0501 ; 050102 ;
Abstract
Automatic term extraction systems are an important tool for compiling the lexemes of a specific field or specialty. The textual analysis performed by this type of software must include strategies that can detect collocations in the field under study. This work studies the viability of using large text corpora that contain no linguistic information, such as those that can be compiled from the internet. The internet is used as a source of information for compiling terminological collocations. To that end, the behavior of different frequency-based indicators is analyzed for a collection of economic terms in a Spanish corpus of 300.000 words.
Pages: 145-152
Page count: 8