AN EMPIRICAL EVALUATION OF DIMENSIONALITY REDUCTION USING LATENT SEMANTIC ANALYSIS ON HINDI TEXT
Cited by: 1
Authors:
Krishnamurthi, Karthik [1]
Sudi, Ravi Kumar [2]
Panuganti, Vijayapal Reddy [3]
Bulusu, Vishnu Vardhan [4]
Affiliations:
[1] SNIST, Dept IT, Hyderabad, India
[2] JPNCE, Dept IT, Mahabubnagar, India
[3] MRCE, Dept CSE, Hyderabad, India
[4] JNTUHCEJ, Dept CSE, Jagitial, India
Source:
2013 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2013) | 2013
Keywords:
Latent Semantic Analysis;
Singular Value Decomposition;
Dimensionality Reduction;
Extractive summary;
DOI:
10.1109/IALP.2013.11
Chinese Library Classification number:
TP18 [Artificial Intelligence Theory];
Subject classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
Dimensionality reduction is the process of deriving an approximate representation of a dataset that reflects most of the correlations underlying the dataset. In the context of text processing, dimensionality reduction is used to transform a text into a precise representation that efficiently identifies the main insights of the original text. LSA (Latent Semantic Analysis) is a technique for finding correlations between words and sentences based on the usage of words within the text. This paper addresses the issue of dimensionality reduction in representing relevant data from Hindi text using LSA. An empirical evaluation is performed to find the influence of language complexity and of various weighting schemes on dimensionality reduction. The results are presented using standard measures such as recall, precision and F-score.
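
The pipeline the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the raw term-frequency weighting, the sentence-scoring heuristic (vector length in the reduced space) and all function names are assumptions made for illustration, and the toy sentences are not data from the paper. The sketch builds a term-by-sentence matrix, reduces it with a truncated Singular Value Decomposition, ranks sentences for an extractive summary, and computes precision, recall and F-score against a reference selection.

```python
import numpy as np


def build_term_sentence_matrix(sentences):
    """Rows are terms, columns are sentences; entries are raw term counts.

    Whitespace tokenization and raw counts are assumptions; other
    weighting schemes could be substituted here.
    """
    vocab = sorted({w for s in sentences for w in s.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in s.split():
            A[index[w], j] += 1.0
    return A, vocab


def lsa_sentence_scores(A, k):
    """Score each sentence by the length of its vector in a k-dimensional
    latent space obtained from the SVD A = U * diag(s) * Vt.

    This scoring rule is one common heuristic, chosen here for illustration.
    """
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))  # cannot keep more dimensions than exist
    reduced = s[:k, None] * Vt[:k, :]  # shape (k, number of sentences)
    return np.sqrt((reduced ** 2).sum(axis=0))


def top_sentences(scores, n):
    """Indices of the n highest-scoring sentences, in document order."""
    return sorted(np.argsort(scores)[::-1][:n].tolist())


def precision_recall_f(selected, reference):
    """Precision, recall and F-score of selected vs. reference sentence indices."""
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    p = overlap / len(selected) if selected else 0.0
    r = overlap / len(reference) if reference else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


# Toy input (illustrative only, not from the paper).
sentences = [
    "latent semantic analysis finds word correlations",
    "singular value decomposition reduces dimensionality",
    "semantic analysis of hindi text uses word usage",
    "weighting schemes influence dimensionality reduction",
]
A, vocab = build_term_sentence_matrix(sentences)
scores = lsa_sentence_scores(A, k=2)
summary = top_sentences(scores, n=2)
```

Keeping all singular values (k equal to the number of sentences in this toy case) makes each score equal to the norm of the corresponding column of the original matrix; lowering k discards the weakest latent dimensions, which is the dimensionality reduction step the abstract refers to.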