Weighted z-Distance-Based Clustering and Its Application to Time-Series Data

Cited by: 1
Authors
Wang, Zhao-Yu [1]
Wu, Chen-Yu [1]
Lin, Yan-Ting [1]
Lee, Shie-Jue [2]
Affiliations
[1] Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung 804, Taiwan
[2] Natl Sun Yat Sen Univ, Dept Elect Engn, Intelligent Elect Commerce Res Ctr, Kaohsiung 804, Taiwan
Source
APPLIED SCIENCES-BASEL | 2019, Vol. 9, Issue 24
Keywords
data clustering; similarity; training cycle; z-distance; quadratic programming; ALGORITHM
DOI
10.3390/app9245469
Chinese Library Classification number
O6 [Chemistry]
Subject classification code
0703
Abstract
Clustering divides given data into groups of similar instances and is one of the most widely used methods for unsupervised learning. Lee and Ouyang proposed a self-constructing clustering (SCC) method in which the user specifies a similarity threshold in advance, instead of the number of clusters. For a given set of instances, SCC performs only one training cycle. Once an instance has been assigned to a cluster, the assignment is never changed. As a result, the clusters produced may depend on the order in which the instances are considered, and assignment errors are more likely to occur. In addition, all dimensions are weighted equally, which may not be suitable in certain applications, e.g., time-series clustering. This paper proposes two improvements. First, two or more training cycles are performed on the instances, and an instance can be re-assigned to another cluster in each cycle, so the clusters produced are less likely to be affected by the feeding order of the instances. Second, each dimension of the input can be weighted differently in the clustering process, with the weight values learned adaptively from the data. Experiments with a number of real-world benchmark datasets demonstrate the effectiveness of the proposed ideas.
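
The following Python sketch illustrates the general idea described in the abstract: threshold-based cluster construction, repeated training cycles with re-assignment, and per-dimension weights. It is not the authors' algorithm. The Gaussian-style similarity `exp(-weighted squared z-distance)`, the function names (`stats`, `similarity`, `threshold_clustering`), and the `base_dev` offset that keeps deviations non-zero are all assumptions made for illustration. The weights are passed in as fixed values here, whereas the paper learns them adaptively from the data.

```python
import math

def stats(members, data, base_dev):
    """Per-dimension mean and deviation of a cluster (base_dev is a hypothetical floor)."""
    dim = len(data[0])
    n = len(members)
    mean = [sum(data[i][k] for i in members) / n for k in range(dim)]
    dev = [math.sqrt(sum((data[i][k] - mean[k]) ** 2 for i in members) / n) + base_dev
           for k in range(dim)]
    return mean, dev

def similarity(x, mean, dev, weights):
    """Assumed similarity: exp of the negative weighted squared z-distance."""
    z2 = sum(w * ((xi - m) / d) ** 2
             for xi, m, d, w in zip(x, mean, dev, weights))
    return math.exp(-z2)

def threshold_clustering(data, threshold, weights, cycles=2, base_dev=0.5):
    clusters = []                 # each cluster is a set of instance indices
    label = [None] * len(data)
    for _ in range(cycles):
        for i, x in enumerate(data):
            # In later cycles, take the instance out so it can be re-assigned.
            if label[i] is not None:
                clusters[label[i]].discard(i)
            best, best_sim = None, threshold
            for c, members in enumerate(clusters):
                if not members:
                    continue
                mean, dev = stats(members, data, base_dev)
                s = similarity(x, mean, dev, weights)
                if s >= best_sim:
                    best, best_sim = c, s
            # No existing cluster is similar enough: start a new one.
            if best is None:
                clusters.append(set())
                best = len(clusters) - 1
            clusters[best].add(i)
            label[i] = best
    # Renumber the labels so that emptied clusters leave no gaps.
    remap = {c: k for k, c in enumerate(sorted(set(label)))}
    return [remap[c] for c in label]

points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [0.0, 0.1]]
labels = threshold_clustering(points, threshold=0.5, weights=[1.0, 1.0])
```

In this sketch the number of clusters is never given; it emerges from the threshold. Setting a dimension's weight to zero makes that dimension irrelevant to the similarity, which is one simple way to see why unequal weights can matter for data such as time series.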
Pages: 25