Confidence path regularization for handling label uncertainty in semi-supervised learning: use case in bipolar disorder monitoring

Times cited: 0
Authors
Kmita, Kamil [1]
Casalino, Gabriella [2]
Castellano, Giovanna [2]
Hryniewicz, Olgierd [1]
Kaczmarek-Majer, Katarzyna [1]
Affiliations
[1] Polish Acad Sci, Syst Res Inst, Warsaw, Poland
[2] Univ Bari Aldo Moro, Comp Sci Dept, Bari, Italy
Source
2022 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE) | 2022
Keywords
semi-supervised learning; prediction; label uncertainty; weak learning; regularization; bipolar disorder; process monitoring; acoustic features; smartphones; intelligent data analysis;
DOI
10.1109/FUZZ-IEEE55066.2022.9882759
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semi-supervised learning has gained great interest because of its ability to combine unlabeled data with potentially few labeled observations in a training process. However, in some application contexts, one can question whether all available labels are equally valid. For example, in the context of bipolar disorder (BD) remote monitoring, a common practice is to extrapolate the psychiatrist's assessment onto some fixed time window surrounding the visit, the so-called ground truth period. As a consequence, all data from this period are labeled with the same category. Such an approach may result in misguided supervision that degrades the model's performance. In this paper, we consider the problem of label uncertainty, assuming that the labels are crisp but may be assigned to particular observations with varying confidence. We propose a novel method called Confidence Path Regularization (CPR) that incorporates this uncertainty into semi-supervised fuzzy c-means learning. CPR handles label uncertainty automatically and in a data-driven way by estimating a confidence factor for each labeled observation. In addition, CPR allows for the exploration of potential class-specific patterns in the adjusted confidence. The proposed method is illustrated with experiments on partially labeled speech-characteristics data collected from a smartphone application for BD monitoring. In this applied scenario, we also use additional contextual data to improve the construction of confidence paths. The results show that CPR can reflect varying confidence in labels, in contrast to the nominal approach, which assigns the majority of observations to the class associated with the relevant ground truth period.
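To make the idea of a per-observation confidence factor concrete, the following is a minimal sketch and not the CPR algorithm itself, whose details the abstract does not give. It assumes a standard partially supervised fuzzy c-means objective with fuzzifier m = 2, in which each labeled observation k carries its own non-negative weight alpha[k] (0 means the label is ignored). The function name `ssfcm_confidence`, the argument layout, and the initialisation from labeled class means are all hypothetical choices made for illustration; here the confidences are supplied by the caller rather than estimated from data.

```python
def ssfcm_confidence(X, F, alpha, n_iter=50, eps=1e-12):
    """Semi-supervised fuzzy c-means (m = 2) with per-observation label confidence.

    X     : list of n points, each a list of d floats
    F     : list of n one-hot label rows (all zeros for unlabeled points)
    alpha : list of n non-negative confidence weights (assumed given, not learned)
    Returns (U, V): memberships (n x c) and cluster prototypes (c x d).
    """
    n, c, d = len(X), len(F[0]), len(X[0])
    # Assumption: every class has at least one labeled point to initialise from.
    V = []
    for i in range(c):
        members = [X[k] for k in range(n) if F[k][i] == 1]
        V.append([sum(p[j] for p in members) / len(members) for j in range(d)])
    U = []
    for _ in range(n_iter):
        U = []
        for k in range(n):
            # Inverse squared distances to each prototype (eps avoids division by zero).
            inv = [1.0 / (sum((X[k][j] - V[i][j]) ** 2 for j in range(d)) + eps)
                   for i in range(c)]
            total = sum(inv)
            # Membership mass left to the unsupervised term after honouring the label.
            slack = 1.0 + alpha[k] * (1.0 - sum(F[k]))
            U.append([(slack * inv[i] / total + alpha[k] * F[k][i]) / (1.0 + alpha[k])
                      for i in range(c)])
        for i in range(c):
            # Prototype weights combine the clustering term and the supervision term.
            w = [U[k][i] ** 2 + alpha[k] * (U[k][i] - F[k][i]) ** 2 for k in range(n)]
            V[i] = [sum(w[k] * X[k][j] for k in range(n)) / sum(w) for j in range(d)]
    return U, V
```

In this sketch, a labeled observation with a small alpha[k] receives memberships driven almost entirely by the data geometry, while a large alpha[k] pulls its membership toward the assigned class. That is the kind of behaviour one would want for labels extrapolated far from the psychiatrist's visit versus labels close to it.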
Pages: 8