An improved method for the feature extraction of Chinese text by combining rough set theory with automatic abstracting technology

Cited by: 0
Authors
Shen, Min [1]
Dong, Baosen [1]
Xu, Linying [1]
Affiliation
[1] School of Computer Science and Technology, Tianjin University, 300072 Tianjin, China
Keywords
Data mining - Forestry - Abstracting - Information retrieval systems - Semantics - Information retrieval - Search engines
DOI
10.1007/978-3-642-34447-3_44
Abstract
Rough Set Theory can reduce the features of Chinese text effectively [1], but the reduction often takes a very long time for a large number of training sets [2]. To address this problem, this article proposes a method that combines Rough Set Theory with Automatic Abstracting Technology (AAT). First, the method determines the theme concepts of a Chinese text by calculating the weight of each node in the Concept Hierarchy Tree [3], which is based on the Tongyici Cilin semantic dictionary [4] [5]; the weight consists of the Self-Frequency, Tree Frequency, Concept Generalization Degree and Concept Selection Degree. Second, it extracts the topic sentences [6] by calculating the importance of each sentence [7]. Finally, it reduces the features of these topic sentences again with IQR (Improved Quick Reduct Algorithm) and constructs the vector. From the perspective of the whole information retrieval system, this method can save time for both Automatic Abstracting and reduction. © Springer-Verlag Berlin Heidelberg 2012.
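The abstract does not describe how IQR modifies the reduction step, so the following is only a minimal sketch of the classic Quick Reduct algorithm on which such improvements are usually built, not the authors' method. The function names, the toy term features, and the class labels are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into indiscernibility classes over attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Share of rows in the positive region: classes whose rows all carry one label."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def quick_reduct(rows, labels, all_attrs):
    """Greedily add the attribute that raises dependency most,
    until the dependency of the full attribute set is reached."""
    target = dependency(rows, labels, all_attrs)
    reduct = []
    best = dependency(rows, labels, reduct)
    while best < target:
        candidates = [a for a in all_attrs if a not in reduct]
        scored = [(dependency(rows, labels, reduct + [a]), a) for a in candidates]
        best, attr = max(scored, key=lambda s: s[0])  # ties keep the earliest attribute
        reduct.append(attr)
    return reduct


# Hypothetical toy data: binary term-presence features for four documents.
rows = [
    {"market": 1, "stock": 1, "match": 0},
    {"market": 1, "stock": 0, "match": 0},
    {"market": 0, "stock": 0, "match": 1},
    {"market": 0, "stock": 1, "match": 1},
]
labels = ["finance", "finance", "sports", "sports"]

print(quick_reduct(rows, labels, ["market", "stock", "match"]))
```

Each greedy step recomputes the dependency over every remaining attribute, so the cost grows with both the number of features and the number of training rows. This is consistent with the abstract's motivation for shrinking the input to topic sentences before reduction.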
Pages: 496-509
Related papers
50 in total
  • [31] A Method to Select Optimal Feature Parameters of Radar Targets Based on Rough Set Theory
    Guo, Wanhai
    Xiu, Zhihong
    Zhang, Jidong
    2008 CHINESE CONTROL AND DECISION CONFERENCE, VOLS 1-11, 2008, : 4529 - +
  • [32] An Automated Text Classification Method: Using Improved Fuzzy Set Approach for Feature Selection
    Abbasi, Bushra Zaheer
    Hussain, Shahid
    Faisal, Muhammad Imran
    PROCEEDINGS OF 2019 16TH INTERNATIONAL BHURBAN CONFERENCE ON APPLIED SCIENCES AND TECHNOLOGY (IBCAST), 2019, : 666 - 670
  • [33] A rough set-based hybrid feature selection method for topic-specific text filtering
    Li, Q
    Li, JH
    Liu, GS
    Li, SH
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 1464 - 1468
  • [34] Automatic breast tumor classification using a level set method and feature extraction in mammography
    Pashoutan, Soheil
    Shokouhi, Shahriar Baradaran
    Pashoutan, Meisam
    2017 24TH NATIONAL AND 2ND INTERNATIONAL IRANIAN CONFERENCE ON BIOMEDICAL ENGINEERING (ICBME), 2017, : 240 - 245
  • [35] Feature extraction using rough set theory and genetic algorithms - an application for the simplification of product quality evaluation
    Zhai, LY
    Khoo, LP
    Fok, SC
    COMPUTERS & INDUSTRIAL ENGINEERING, 2002, 43 (04) : 661 - 676
  • [36] An integration method combining Rough Set Theory with formal concept analysis for personal investment portfolios
    Shyng, Jhieh-Yu
    Shieh, How-Ming
    Tzeng, Gwo-Hshiung
    KNOWLEDGE-BASED SYSTEMS, 2010, 23 (06) : 586 - 597
  • [37] Automatic extraction of image texture feature of diesel engine fault based on CWT time-frequency image and variable precision rough set theory
    Ren, Jin-Cheng
    Xiao, Yun-Kui
    Zhang, Ling-Ling
    Shen, Hong
    Feng, Hui-Juan
    Neiranji Gongcheng/Chinese Internal Combustion Engine Engineering, 2015, 36 (03): : 106 - 112
  • [38] An Improved Method of Short Text Feature Extraction Based on Words Co-occurrence
    Wang, Lihong
    COMPUTER AND INFORMATION TECHNOLOGY, 2014, 519-520 : 842 - 845
  • [39] A method for attribute reduction using rough set theory and improved particle swarm optimization
    Liu, Xingwen
    Wang, Dianhong
    Jiang, Liangxiao
    Chen, Fenxiong
    ICIC Express Letters, Part B: Applications, 2011, 2 (06): : 1261 - 1266
  • [40] An Automatic Text Summary Extraction Method Based on Improved TextRank and TF-IDF
    Guan, Xinxin
    Li, Yeli
    Zeng, Qingtao
    Zhou, Chufeng
    2019 INTERNATIONAL CONFERENCE ON ADVANCED ELECTRONIC MATERIALS, COMPUTERS AND MATERIALS ENGINEERING (AEMCME 2019), 2019, 563