Automatic terminological collocations extraction from large corpus

Cited by: 0
Authors
Suarez, Octavio Santana [1]
Aguiar, Jose Perez [1]
Berriel, Isabel Sanchez [2]
Rodriguez, Virginia Gutierrez [2]
Affiliations
[1] Univ Las Palmas Gran Canaria, Edificio Dept Informat & Matemat, Las Palmas Gran Canaria 35017, Spain
[2] Univ La Laguna, Edificio Fis & Matemat,Campus Univ Anchieta, San Cristobal la Laguna 38271, Spain
Source
PROCESAMIENTO DEL LENGUAJE NATURAL | 2011 / Issue 47
Keywords
automatic extraction of collocations; terminology; computational linguistics; text mining;
DOI
Not available
Chinese Library Classification number
H0 [Linguistics];
Discipline classification code
030303 ; 0501 ; 050102 ;
Abstract
Automatic term extraction systems are an important tool for compiling the lexemes of a specific field or specialty. The textual analysis performed by this type of software must include strategies for detecting collocations in the field under study. This paper studies the viability of using large text corpora that contain no linguistic information, such as those that can be compiled from the Internet. The Internet is used as a source of information for compiling terminological collocations. To that end, the behavior of different frequency-based indicators is analyzed for a collection of economic terms in a Spanish corpus of 300,000 words.
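The abstract does not name the frequency-based indicators it evaluates. As an illustration only, the sketch below scores adjacent word pairs with pointwise mutual information (PMI), one commonly used frequency-based association measure; the choice of PMI, the function name `pmi_scores`, the `min_count` threshold, and the toy Spanish sentence are all assumptions, not the authors' method or data.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper: PMI is an assumed stand-in for the unnamed
    frequency-based indicators mentioned in the abstract.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Rare pairs give unstable estimates, so skip them.
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        # PMI compares the observed pair probability with the
        # probability expected if the two words were independent.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy input (made up for illustration); no linguistic annotation is used.
tokens = (
    "el tipo de interés sube y el tipo de cambio baja "
    "pero el tipo de interés no baja"
).split()

ranked = sorted(pmi_scores(tokens).items(), key=lambda kv: kv[1], reverse=True)
for pair, score in ranked:
    print(pair, round(score, 2))
```

Only raw token counts are needed, which matches the setting described in the abstract: a corpus without linguistic information. On a real corpus, the frequency threshold and the measure itself would need tuning against a reference list of terms.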
Pages: 145 - 152
Number of pages: 8
Related papers
50 items in total
  • [21] Automatic Extraction of Performance Indicators from Financial Statements
    Kamaruddin, Siti Sakira
    Hamdan, Abdul Razak
    Abu Bakar, Azuraliza
    Nor, Fauzias Mat
    2009 INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATICS, VOLS 1 AND 2, 2009: 337 - 339
  • [22] Automatic concept extraction from spoken medical reports
    Happe, A
    Pouliquen, B
    Burgun, A
    Cuggia, M
    Le Beux, P
    INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2003, 70 (2-3) : 255 - 263
  • [23] TarMiner: automatic extraction of miRNA targets from literature
    Tsoupidi, Rodothea-Myrsini
    Kanellos, Ilias
    Vergoulis, Thanasis
    Vlachos, Ioannis S.
    Hatzigeorgiou, Artemis G.
    Dalamagas, Theodore
    PROCEEDINGS OF THE 27TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, 2015
  • [24] The influence of the disciplinary field on terminological variation: A corpus-based study in the interdisciplinary domain of fishing
    Fernandez Silva, Sabela
    REVISTA SIGNOS, 2013, 46 (83) : 361 - 388
  • [25] On terminological figurativeness: From theory to practice
    Timofeeva-Timofeev, Larissa
    Vargas-Sierra, Chelo
    TERMINOLOGY, 2015, 21 (01) : 102 - 125
  • [26] Anchoring points for bilingual lexical extraction from small, specialized, comparable corpus
    Prochasson, Emmanuel
    Morin, Emmanuel
    TRAITEMENT AUTOMATIQUE DES LANGUES, 2009, 50 (01) : 283 - 304
  • [27] Extracting Temporal Patterns from Large-Scale Text Corpus
    Liu, Yu
    Hua, Wen
    Zhou, Xiaofang
    DATABASES THEORY AND APPLICATIONS (ADC 2019), 2019, 11393 : 17 - 30
  • [29] Automatic extraction of numerical values from unstructured data in EHRs
    Bigeard, Elise
    Jouhet, Vianney
    Mougin, Fleur
    Thiessard, Frantz
    Grabar, Natalia
    DIGITAL HEALTHCARE EMPOWERING EUROPEANS, 2015, 210 : 50 - 54
  • [30] Automatic extraction of informal topics from online suicidal ideation
    Grant, Reilly N.
    Kucher, David
    Leon, Ana M.
    Gemmell, Jonathan F.
    Raicu, Daniela S.
    Fodeh, Samah J.
    BMC BIOINFORMATICS, 2018, 19