Effective Categorization of Text in Practical Design

Citations: 0
Authors
Ravi, S. [1]
Sambath, M. [1]
RameshKumar, K. [1]
Affiliations
[1] Hindustan Univ, Sch Comp Sci, POB 1, Rajiv Gandhi Salai 603013, Padur, India
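Source
2014 INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND EMBEDDED SYSTEMS (ICICES) | 2014
Keywords
Text categorization; Novel active learning; Manifold learning
DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract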
Data mining extracts novel and useful knowledge from large data repositories and has become an effective means of analysis and decision-making in corporations. In many information processing tasks, labels are expensive to obtain while unlabeled data points are abundant. To reduce the cost of collecting labels, it is crucial to predict which unlabeled examples are the most informative, i.e., which would improve the classifier the most if they were labeled. Many active learning techniques have been proposed for text categorization, such as SVM Active and Transductive Experimental Design. However, most previous approaches try to discover the discriminant structure of the data space, whereas its geometrical structure is not well respected. By minimizing the expected error with respect to the optimal classifier, the proposed approach can select the most representative and discriminative data points for labeling. Experimental results on text categorization demonstrate the effectiveness of the proposed approach.
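The idea of picking the "most informative" unlabeled example can be illustrated with a minimal sketch. This is not the selection criterion of the paper, which the abstract does not spell out; it is generic margin-based uncertainty sampling, the idea commonly associated with SVM Active, applied to a linear classifier. The names `decision_value` and `most_informative`, the weights, and the toy pool are all hypothetical and chosen only for illustration.

```python
def decision_value(weights, bias, x):
    """Signed score of a linear classifier: w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def most_informative(weights, bias, pool):
    """Return the index of the pool point closest to the decision boundary.

    A small |score| means the classifier is least certain about that point,
    so asking for its label is assumed to be the most useful query.
    """
    return min(
        range(len(pool)),
        key=lambda i: abs(decision_value(weights, bias, pool[i])),
    )


# Hypothetical classifier and unlabeled pool (toy 2-D feature vectors).
weights = [1.0, -1.0]
bias = 0.0
pool = [[3.0, 0.5], [1.0, 0.9], [0.2, 2.0]]

# Index of the example that would be sent to a human annotator next.
chosen = most_informative(weights, bias, pool)
```

In a full active learning loop, the chosen example would be labeled, added to the training set, and the classifier retrained before the next query. A criterion of this kind looks only at the discriminant structure, which is the limitation the abstract attributes to most previous approaches.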
Pages: 5