An Integrated Cluster Detection, Optimization, and Interpretation Approach for Financial Data

Cited by: 160
Authors
Li, Tie [1]
Kou, Gang [2]
Peng, Yi [1]
Yu, Philip S. [3]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Management & Econ, Chengdu 611731, Peoples R China
[2] Southwestern Univ Finance & Econ, Sch Business Adm, Chengdu 610074, Peoples R China
[3] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Clustering algorithms; Data models; Correlation; Shape; Optimization; Laplace equations; Feature extraction; Clustering methods; data mining; financial management; spectral analysis; VALIDATION; ALGORITHM;
DOI
10.1109/TCYB.2021.3109066
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In many financial applications, such as fraud detection, reject inference, and credit evaluation, detecting clusters automatically is critical because it helps to understand the subpatterns of the data that can be used to infer users' behaviors and identify potential risks. Due to the complexity of human behaviors and changing social environments, the distributions of financial data are usually complex, and it is challenging to find clusters and give reasonable interpretations. The goal of this study is to develop an integrated approach to detect clusters in financial data and to optimize the scope of the clusters so that they can be easily interpreted. Specifically, we first proposed a new cluster quality evaluation criterion, which is free from large-scale computation and can guide base clustering algorithms such as k-Means to detect hyperellipsoidal clusters adaptively. Then, we designed a new solver for a revised support vector data description model, which efficiently refines the centroids and scopes of the detected clusters to make the clusters tighter, so that the data in each cluster share greater similarities and the clusters can be easily interpreted with eigenvectors. Using ten financial datasets, the experiments showed that the proposed algorithm can efficiently find a reasonable number of clusters. The proposed approach is suitable for large-scale financial datasets whose features are meaningful, and is also applicable to financial mining tasks such as data distribution interpretation and anomaly detection.
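The idea of interpreting a cluster through its eigenvectors can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it does not implement their quality criterion or their support vector data description solver. It only assumes that a cluster has already been found, and shows how the dominant eigenvector of the cluster's covariance matrix gives the direction along which the cluster stretches. The toy data, the feature meanings, and all function names are hypothetical.

```python
import math

def centroid(points):
    """Mean of the points, one value per feature."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[j] for p in points) / n for j in range(dim)]

def covariance(points):
    """Sample covariance matrix (divides by n - 1)."""
    n = len(points)
    mu = centroid(points)
    dim = len(mu)
    cov = [[0.0] * dim for _ in range(dim)]
    for p in points:
        d = [p[j] - mu[j] for j in range(dim)]
        for a in range(dim):
            for b in range(dim):
                cov[a][b] += d[a] * d[b] / (n - 1)
    return cov

def dominant_eigenvector(matrix, iterations=200):
    """Power iteration: repeatedly multiply and renormalize.

    Converges to the eigenvector of the largest eigenvalue as long as
    the start vector is not orthogonal to it.
    """
    dim = len(matrix)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

# Hypothetical toy cluster with two made-up features
# (for example, monthly income and monthly spending).
cluster = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.0), (5.0, 9.8)]
direction = dominant_eigenvector(covariance(cluster))
print("cluster centroid:", centroid(cluster))
print("main direction of spread:", direction)
```

In this toy cluster the second component of the direction is roughly twice the first, which reads as "the second feature grows about twice as fast as the first inside this cluster". That kind of statement is only meaningful when the features themselves are meaningful, which matches the abstract's stated condition.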
Pages: 13848-13861
Number of pages: 14