Patent Document Clustering Using Dimensionality Reduction

Cited by: 1
Authors
Girthana, K. [1]
Swamynathan, S. [1]
Affiliations
[1] Anna Univ, Dept Informat Sci & Technol, Madras 600025, Tamil Nadu, India
Source
PROGRESS IN ADVANCED COMPUTING AND INTELLIGENT ENGINEERING, VOL 2 | 2018 / Vol. 564
Keywords
Prior art search; Dimensionality reduction; Clustering
DOI
10.1007/978-981-10-6875-1_17
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Patents are a form of intellectual property right that grants exclusive rights to an invention. Whenever a novel invention is proposed, a prior art search on patents is carried out to assess its degree of innovation. Clustering is used to group the relevant documents returned by a prior art search and so gain insight into the patent documents. Each patent document is represented by hundreds of features (words extracted from the title and abstract fields), and the overlap in features between documents is slight. The number of features used for clustering therefore increases drastically, which leads to the curse of dimensionality. Hence, in this work, dimensionality reduction techniques such as PCA and SVD are employed, and the quality of the clusters formed from Google patent documents is compared and analyzed. This comparative analysis was performed by considering the title, abstract, and classification code fields of the patent document. Classification code information was used to decide the number of clusters.
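The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four toy documents, the classification codes, the raw term-count features, the choice of two components, and the helper names (term_matrix, reduce_svd, reduce_pca, kmeans) are all assumptions made for illustration, and k-means is assumed as the clustering algorithm because the abstract does not name one. The sketch builds a document-term matrix, reduces it with SVD and with PCA, and clusters each reduced matrix, taking the number of clusters from the number of distinct classification codes.

```python
import numpy as np

# Hypothetical stand-ins for title + abstract text and classification codes.
docs = [
    "wireless antenna signal receiver",
    "antenna array wireless transmission signal",
    "battery electrode lithium cell",
    "lithium battery cell electrolyte",
]
codes = ["H04B", "H04B", "H01M", "H01M"]


def term_matrix(texts):
    """Build a documents x terms count matrix from whitespace tokens."""
    vocab = sorted({w for t in texts for w in t.split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(texts), len(vocab)))
    for i, t in enumerate(texts):
        for w in t.split():
            X[i, index[w]] += 1.0
    return X


def reduce_svd(X, k):
    """Truncated SVD: keep the document coordinates on the top-k components."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]


def reduce_pca(X, k):
    """PCA: the same projection applied to the mean-centered matrix."""
    return reduce_svd(X - X.mean(axis=0), k)


def kmeans(Z, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [Z[0]]
    for _ in range(1, k):
        # Squared distance from each point to its nearest chosen center.
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels


X = term_matrix(docs)
n_clusters = len(set(codes))  # number of clusters taken from the codes
labels_svd = kmeans(reduce_svd(X, 2), n_clusters)
labels_pca = kmeans(reduce_pca(X, 2), n_clusters)
print("SVD clusters:", labels_svd)
print("PCA clusters:", labels_pca)
```

On real data the two reductions can produce different groupings, which is the comparison the abstract describes. A cluster-quality measure would be needed to rank them, and the abstract does not say which one was used.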
Pages: 167-176
Page count: 10