Fairness in constrained spectral clustering

Times cited: 0
Authors
Agrawal, Laxita [1]
Saradhi, V. Vijaya [1]
Sharma, Teena [2]
Affiliations
[1] Indian Inst Technol, Dept Comp Sci & Engn, Gauhati 781039, Assam, India
[2] Indian Inst Technol, MF Sch Data Sci & Artificial Intelligence, Gauhati 781039, Assam, India
Keywords
Spectral clustering; Pairwise constraints; Must-link constraint; Cannot-link constraint; Constrained spectral clustering; Fairness;
DOI
10.1016/j.neucom.2025.129815
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semi-supervised clustering methods have gained significant attention in both theoretical research and real-world applications, including economics, finance, marketing, and healthcare. Among these methods, constrained spectral clustering improves clustering quality by incorporating pairwise constraints, namely must-link and cannot-link constraints, which specify whether certain data points should or should not belong to the same cluster. However, traditional constrained spectral clustering methods may inadvertently propagate biases present in the data or the constraints, leading to unequal representation of sensitive groups, such as different genders or racial groups, across clusters. This imbalance raises fairness concerns that remain largely unexplored in constrained spectral clustering. To address this gap, this paper proposes a novel method named fair-constrained Spectral Clustering (fair-cSC). The proposed method integrates fairness into the must-link and cannot-link constraints by defining a fair constraint matrix, ensuring that pairwise relationships do not introduce bias against any particular group. Additionally, a balance constraint is incorporated to enforce fairness across input data points, promoting equal representation of sensitive groups within clusters. Comprehensive experiments on six benchmark datasets, including ablation studies, demonstrate that the proposed fair-cSC method enhances fairness while preserving clustering quality. The ablation studies also show how the method performs under different settings, supporting its robustness and applicability in real-world scenarios.
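To make the abstract's two ingredients concrete, the following is a minimal sketch, not the paper's fair-cSC formulation. It assumes a common encoding of pairwise constraints as a symmetric matrix (+1 for must-link, -1 for cannot-link, 0 otherwise) and a widely used balance score (per cluster, the smallest sensitive-group count divided by the largest, minimized over clusters). The function names `constraint_matrix` and `balance` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def constraint_matrix(n, must_link, cannot_link):
    """Build a symmetric n x n matrix of pairwise constraints.

    Assumed encoding: +1 for a must-link pair, -1 for a cannot-link pair,
    0 where no constraint is given.
    """
    q = [[0] * n for _ in range(n)]
    for i, j in must_link:
        q[i][j] = q[j][i] = 1
    for i, j in cannot_link:
        q[i][j] = q[j][i] = -1
    return q


def balance(labels, groups):
    """Return the minimum, over clusters, of (smallest group count / largest group count).

    A value of 1.0 means every sensitive group is equally represented in every
    cluster; 0.0 means some cluster is missing at least one group entirely.
    """
    all_groups = set(groups)
    scores = []
    for cluster in set(labels):
        counts = Counter(g for l, g in zip(labels, groups) if l == cluster)
        sizes = [counts.get(g, 0) for g in all_groups]
        scores.append(min(sizes) / max(sizes))
    return min(scores)


# Toy data: six points, two clusters, two sensitive groups.
labels = [0, 0, 0, 1, 1, 1]
groups = ["a", "b", "a", "b", "a", "b"]
q = constraint_matrix(6, must_link=[(0, 1)], cannot_link=[(2, 3)])
score = balance(labels, groups)
```

In this toy assignment each cluster holds two points of one group and one of the other, so the balance score is 0.5. A clustering that separates the groups completely would score 0.0.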
Pages: 16