Text mining in different languages

Authors
Lebart, L [1]
Affiliations
[1] Ecole Natl Super Telecommun, CNRS, F-75013 Paris, France
Source
APPLIED STOCHASTIC MODELS AND DATA ANALYSIS | 1998 / Vol. 14 / No. 04
Keywords
Text Mining; text categorization; language independent methods; discriminant analysis;
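DOI
Not available
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research];
Discipline classification code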
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
The purpose of Text Mining is to describe and explore textual data, to uncover structural traits, and proceed to predictions. The field of application concerns Information Retrieval, processing responses to open-ended questions in sample surveys as well as processing textual corpora of a more general nature. At the intersection of Corpora Linguistics and Exploratory Statistical Analysis, a series of language independent tools and methods can perform most of the previously mentioned tasks, including the assessment and validation of the obtained results, be it visualization or categorization. Multiple confusion matrices calculated on test-samples characterize the quality of the prediction as well as the structure of errors of prediction. In the case of multinational surveys and corpora, they allow us to proceed to comparisons among several countries, in spite of the very heterogeneous character of the basic information (texts in different languages). Copyright (C) 1998 John Wiley & Sons, Ltd.
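The abstract's point about confusion matrices computed on test samples can be illustrated with a minimal sketch. This is not the paper's procedure: the category names, the country labels, and the toy predictions below are all hypothetical and serve only to show how one matrix per country exposes both overall quality and the structure of the errors.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, categories):
    """Rows are true categories, columns are predicted categories."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in categories] for t in categories]


def accuracy(matrix):
    """Share of test items on the diagonal (correctly categorized)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical categories and test samples, one per country (assumed data).
categories = ["A", "B", "C"]
test_samples = {
    "country_1": (["A", "A", "B", "B", "C", "C"],
                  ["A", "B", "B", "B", "C", "A"]),
    "country_2": (["A", "B", "B", "C", "C", "C"],
                  ["A", "B", "C", "C", "C", "C"]),
}

# One confusion matrix per country, so results can be compared even though
# the underlying texts are in different languages.
matrices = {
    name: confusion_matrix(true, pred, categories)
    for name, (true, pred) in test_samples.items()
}

for name, matrix in matrices.items():
    print(name, matrix, round(accuracy(matrix), 3))
```

The off-diagonal cells show which categories are confused with which, and that is the information a single accuracy figure hides.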
Pages: 323 / 334
Number of pages: 12