An empirical study on dimensionality optimization in text mining for linguistic knowledge acquisition

Authors
Kim, YS [1]
Chang, JH
Zhang, BT
Affiliations
[1] Hallym Univ, Div Informat & Telecommun Engn, Kang Won 200702, South Korea
[2] Seoul Natl Univ, Sch Comp Sci & Engn, Seoul 151744, South Korea
Source
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING | 2003 / Vol. 2637
Keywords
knowledge acquisition; text mining; Latent Semantic Analysis; probabilistic latent semantic analysis; target word selection;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we empirically search for the optimal dimensionality of two data-driven models, the Latent Semantic Analysis (LSA) model and the Probabilistic Latent Semantic Analysis (PLSA) model. These models are used to build linguistic semantic knowledge that can be used to estimate contextual semantic similarity for target word selection in English-Korean machine translation. We also employ the k-Nearest Neighbor (k-NN) learning algorithm. We diversify our experiments by analyzing the covariance between the value of k in k-NN learning and the accuracy of selection, in addition to that between the dimensionality and the accuracy. Although we could not find a regular relationship between the dimensionality and the accuracy, we could identify the optimal dimensionality, at which the distribution of the data was soundest in our experiments.
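The experimental design described in the abstract (vary the latent dimensionality, vary k, and record selection accuracy for each pair) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses plain truncated SVD as a stand-in for LSA, omits PLSA entirely, runs on synthetic count data rather than an English-Korean corpus, and every function name (`lsa_project`, `knn_predict`, `loo_accuracy`, `sweep`) is hypothetical.

```python
import numpy as np


def lsa_project(X, dim):
    """Map each row of X into a `dim`-dimensional latent space via truncated SVD."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :dim] * s[:dim]


def knn_predict(train, labels, query, k):
    """Majority vote among the k training rows most cosine-similar to `query`."""
    train_unit = train / (np.linalg.norm(train, axis=1, keepdims=True) + 1e-12)
    query_unit = query / (np.linalg.norm(query) + 1e-12)
    sims = train_unit @ query_unit
    nearest = np.argsort(-sims)[:k]
    return int(np.argmax(np.bincount(labels[nearest])))


def loo_accuracy(Z, labels, k):
    """Leave-one-out accuracy of k-NN on the latent vectors Z."""
    n = len(labels)
    correct = 0
    for i in range(n):
        mask = np.arange(n) != i
        correct += int(knn_predict(Z[mask], labels[mask], Z[i], k) == labels[i])
    return correct / n


def sweep(X, labels, dims, ks):
    """Accuracy for every (dimensionality, k) pair.

    Simplification: the SVD is fitted on all rows, including the held-out one.
    """
    return {(d, k): loo_accuracy(lsa_project(X, d), labels, k)
            for d in dims for k in ks}


# Synthetic stand-in data (assumption): 40 contexts x 30 co-occurrence counts,
# two candidate target words with different underlying count profiles.
rng = np.random.default_rng(0)
profiles = rng.gamma(1.0, 1.0, size=(2, 30))
labels = np.repeat([0, 1], 20)
X = rng.poisson(profiles[labels]).astype(float)

results = sweep(X, labels, dims=(2, 5, 10, 20), ks=(1, 3, 5))
for (d, k), acc in sorted(results.items()):
    print(f"dim={d:2d} k={k} accuracy={acc:.2f}")
```

Tabulating accuracy over the whole (dimensionality, k) grid, rather than tuning one parameter at a time, makes it easy to see whether accuracy varies regularly with either parameter, which is the question the abstract raises.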
Pages: 111 - 116
Page count: 6
Related papers
  • [1] Knowledge Based Dimensionality Reduction for Technical Text Mining
    Shalaby, Walid
    Zadrozny, Wlodek
    Gallagher, Sean
    2014 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2014,
  • [2] An empirical evaluation of a system for text knowledge acquisition
    Hahn, U
    Schnattinger, K
    KNOWLEDGE ACQUISITION, MODELING AND MANAGEMENT, 1997, 1319 : 129 - 144
  • [3] INFORMATION EXTRACTION AND TEXT SUMMARIZATION USING LINGUISTIC KNOWLEDGE ACQUISITION
    RAU, LF
    JACOBS, PS
    ZERNIK, U
    INFORMATION PROCESSING & MANAGEMENT, 1989, 25 (04) : 419 - 428
  • [4] Lightly supervised acquisition of named entities and linguistic patterns for multilingual text mining
    de Pablo-Sanchez, Cesar
    Segura-Bedmar, Isabel
    Martinez, Paloma
    Iglesias-Maqueda, Ana
    KNOWLEDGE AND INFORMATION SYSTEMS, 2013, 35 (01) : 87 - 109
  • [5] Lightly supervised acquisition of named entities and linguistic patterns for multilingual text mining
    César de Pablo-Sánchez
    Isabel Segura-Bedmar
    Paloma Martínez
    Ana Iglesias-Maqueda
    Knowledge and Information Systems, 2013, 35 : 87 - 109
  • [6] An Empirical Study of the Dimensionality of the Mathematical Knowledge for Teaching Construct
    Copur-Gencturk, Yasemin
    Tolar, Tammy
    Jacobson, Erik
    Fan, Weihua
    JOURNAL OF TEACHER EDUCATION, 2019, 70 (05) : 485 - 497
  • [7] Linguistic Text Mining for Problem Reports
    Malin, Jane T.
    Throop, David R.
    Millward, Christopher
    Schwarz, Hansen A.
    Gomez, Fernando
    Thronesbery, Carroll
    2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, : 1578 - +
  • [8] Predictive Maintenance with Linguistic Text Mining
    Postiglione, Alberto
    Monteleone, Mario
    MATHEMATICS, 2024, 12 (07)
  • [9] Effective Pattern Discovery and Dimensionality Reduction for Text Under Text Mining
    Vijayakumar, T.
    Priya, R.
    Palanisamy, C.
    ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY ALGORITHMS IN ENGINEERING SYSTEMS, VOL 2, 2015, 325 : 615 - 623
  • [10] Organizational context and knowledge acquisition in IJVs: An empirical study
    Evangelista, Felicitas
    Le Nguyen Hau
    JOURNAL OF WORLD BUSINESS, 2009, 44 (01) : 63 - 73