Text mining in different languages

Cited by: 0
Author
Lebart, L. [1]
Affiliation
[1] Ecole Natl Super Telecommun, CNRS, F-75013 Paris, France
Source
APPLIED STOCHASTIC MODELS AND DATA ANALYSIS | 1998 / Vol. 14 / Issue 04
Keywords
Text mining; text categorization; language-independent methods; discriminant analysis
DOI
Not available
Chinese Library Classification number
C93 [Management Science]; O22 [Operations Research]
Discipline classification code
070105; 12; 1201; 1202; 120202
Abstract
The purpose of Text Mining is to describe and explore textual data, to uncover structural traits, and proceed to predictions. The field of application concerns Information Retrieval, processing responses to open-ended questions in sample surveys as well as processing textual corpora of a more general nature. At the intersection of Corpora Linguistics and Exploratory Statistical Analysis, a series of language independent tools and methods can perform most of the previously mentioned tasks, including the assessment and validation of the obtained results, be it visualization or categorization. Multiple confusion matrices calculated on test-samples characterize the quality of the prediction as well as the structure of errors of prediction. In the case of multinational surveys and corpora, they allow us to proceed to comparisons among several countries, in spite of the very heterogeneous character of the basic information (texts in different languages). Copyright (C) 1998 John Wiley & Sons, Ltd.
Pages: 323-334
Number of pages: 12