A novel imbalanced data classification approach for suicidal ideation detection on social media

Cited by: 0
Authors
Mohamed Ali Ben Hassine
Safa Abdellatif
Sadok Ben Yahia
Affiliations
[1] University of Tunis El Manar,Faculty of Sciences of Tunis
[2] Tallinn University of Technology,Department of Software Science
Source
Computing | 2022 / Vol. 104
Keywords
Suicidal ideation detection; Feature extraction and selection; Associative classification; Imbalanced datasets; 06-08;
DOI
Not available
Abstract
Suicide has become a serious social health issue in modern society. Suicidal ideation refers to a person's thoughts about committing or planning suicide. Many factors, such as long-term exposure to negative feelings or life events, can lead to suicidal ideation and suicide attempts. Among the approaches to suicide prevention, early detection of suicidal ideation is one of the most effective. Social networking services give people a platform to express the suffering and feelings they experience in the real world, which makes these services a source for deeper investigation into models and approaches for detecting suicidal intent and enabling prevention. This paper addresses the early detection of suicidal ideation by applying an associative classification approach to Twitter data. However, because tweets expressing suicidal intent are a tiny fraction of all tweets, the task becomes an imbalanced classification problem in which the minority class (suicidal intent) is more important than the majority class (no suicidal intent). In such a situation, classical classifiers usually yield very inaccurate results for the minority class, since they easily discover rules that predict the majority class and overlook those related to the minority class. This paper contributes to this line of research by introducing a new interestingness measure to enhance the classification process. The measure highlights both classes regardless of their imbalanced distribution. Experiments show that the adapted CBA (Classification Based on Associations) outperforms the original CBA and other pioneering baseline classification approaches in terms of prediction accuracy.
Pages: 741-765
Page count: 24