An improved method for the feature extraction of Chinese text by combining rough set theory with automatic abstracting technology

Cited by: 0
Authors
Shen, Min [1]
Dong, Baosen [1]
Xu, Linying [1]
Affiliations
[1] School of Computer Science and Technology, Tianjin University, 300072 Tianjin, China
Keywords
Data mining - Forestry - Abstracting - Information retrieval systems - Semantics - Information retrieval - Search engines
DOI
10.1007/978-3-642-34447-3_44
Abstract
Rough Set Theory can effectively reduce the features of Chinese text [1], but the reduction often takes a very long time when the training set is large [2]. To address this problem, this article proposes a method that combines Rough Set Theory with Automatic Abstracting Technology (AAT). First, the method determines the theme concepts of a Chinese text by calculating the weight of each node in the Concept Hierarchy Tree [3], which is based on the Tongyici Cilin semantic dictionary [4] [5]; a node's weight consists of its Self-Frequency, Tree Frequency, Concept Generalization Degree and Concept Selection Degree. Second, it extracts the topic sentences [6] by calculating the importance of each sentence [7]. Finally, it reduces the features of these topic sentences again with IQR (Improved Quick Reduct Algorithm) and constructs the vector. Viewed from the perspective of the whole information retrieval system, this method can save time for both Automatic Abstracting and reduction. © Springer-Verlag Berlin Heidelberg 2012.
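The abstract does not spell out how IQR works, so the following is only a minimal Python sketch of the classical greedy Quick Reduct idea from rough set theory, not the authors' improved algorithm. The function names (`dependency`, `quick_reduct`), the toy decision table, and the term features (`market`, `goal`, `team`, `price`) are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Rough-set dependency degree: share of rows whose values on
    `attrs` determine the decision value unambiguously."""
    if not rows:
        return 0.0
    groups = defaultdict(list)
    for row in rows:
        # Rows with equal values on `attrs` fall into one equivalence class.
        groups[tuple(row[a] for a in attrs)].append(row[decision])
    # A class belongs to the positive region if all its decisions agree.
    consistent = sum(len(ds) for ds in groups.values() if len(set(ds)) == 1)
    return consistent / len(rows)


def quick_reduct(rows, conditions, decision):
    """Greedy Quick Reduct: keep adding the attribute that raises the
    dependency degree most until it matches that of the full set."""
    target = dependency(rows, conditions, decision)
    reduct = []
    best = dependency(rows, reduct, decision)
    while best < target:
        candidates = [a for a in conditions if a not in reduct]
        gains = {a: dependency(rows, reduct + [a], decision) for a in candidates}
        chosen = max(candidates, key=lambda a: gains[a])
        reduct.append(chosen)
        best = gains[chosen]
    return reduct


# Hypothetical toy decision table: term presence (1/0) in topic sentences.
conditions = ["market", "goal", "team", "price"]
rows = [
    {"market": 1, "goal": 0, "team": 0, "price": 1, "label": "finance"},
    {"market": 1, "goal": 0, "team": 1, "price": 0, "label": "finance"},
    {"market": 0, "goal": 1, "team": 1, "price": 0, "label": "sports"},
    {"market": 0, "goal": 1, "team": 0, "price": 1, "label": "sports"},
    {"market": 0, "goal": 0, "team": 1, "price": 0, "label": "sports"},
    {"market": 1, "goal": 0, "team": 0, "price": 0, "label": "finance"},
]

reduct = quick_reduct(rows, conditions, "label")
print(reduct)
```

In this toy table the single feature `market` already separates the two classes, so the greedy search stops after one step. Every pass of the loop rescans all rows for every candidate attribute, which suggests why reduction becomes slow on large training sets and why shrinking the input to topic sentences first could save time.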
Pages: 496-509
Related papers
50 in total
  • [1] An Improved Method for the Feature Extraction of Chinese Text by Combining Rough Set Theory with Automatic Abstracting Technology
    Shen, Min
    Dong, Baosen
    Xu, Linying
    CONTEMPORARY RESEARCH ON E-BUSINESS TECHNOLOGY AND STRATEGY, 2012, 332 : 496 - 509
  • [2] Research of Text Topic Automatic Extraction Method Based on Rough Set Theory
    Sun, Zhanfeng
    Bao, Kongjun
    COMPUTATIONAL MATERIALS SCIENCE, PTS 1-3, 2011, 268-270 : 1127 - 1131
  • [3] Text Feature Extraction Based on Rough Set
    Cheng, Yiyuan
    Zhang, Ruiling
    Wang, Xiufeng
    Chen, Qiushuang
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 2, PROCEEDINGS, 2008, : 310 - 314
  • [4] Text feature ranking based on rough-set theory
    Tan, Songbo
    Wang, Yuefen
    Cheng, Xueqi
    PROCEEDINGS OF THE IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE: WI 2007, 2007, : 659 - +
  • [5] A NEW FEATURE SELECTION METHOD BASED ON CONCEPT EXTRACTION IN AUTOMATIC CHINESE TEXT CLASSIFICATION
    Liao, Shasha
    Jiang, Minghu
    NEW MATHEMATICS AND NATURAL COMPUTATION, 2007, 3 (03) : 331 - 347
  • [6] Investigation and Application of Feature Extraction Based on Rough Set Theory
    Tang, Zhi-hang
    Zhang, Jing
    Li, Rong-jun
    ADVANCED RESEARCH ON ELECTRONIC COMMERCE, WEB APPLICATION, AND COMMUNICATION, PT 2, 2011, 144 : 195 - +
  • [7] Improved Deep Belief Network to Feature Extraction in Chinese Text Classification
    Gao, Jingmin
    Yi, Junkai
    Jia, Wenhao
    Zhao, Xianghui
    PROCEEDINGS OF 2018 IEEE 9TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2018, : 283 - 287
  • [8] Decision rule extraction method based on rough set theory and fuzzy set theory
    Wang, MC
    Wang, ZO
    Zhang, M
    Yan, P
    Proceedings of 2005 International Conference on Machine Learning and Cybernetics, Vols 1-9, 2005, : 2212 - 2216
  • [9] Method of Chinese Text Categorization Based On Variable Precision Rough Set
    Wang, Ming-Yan
    Liu, Ting
    IITAW: 2009 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATIONS WORKSHOPS, 2009, : 26 - 29
  • [10] Research on Feature Selection Method in Chinese Text Automatic Classification
    Hong, Ying
    Shao, Xiwen
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON APPLIED SCIENCE AND ENGINEERING INNOVATION, 2015, 12 : 1759 - 1763