A Semi-supervised Framework for Misinformation Detection

Cited by: 1
Authors
Liu, Yueyang [1]
Boukouvalas, Zois [1]
Japkowicz, Nathalie [1]
Affiliations
[1] Amer Univ, Washington, DC 20016 USA
Source
DISCOVERY SCIENCE (DS 2021) | 2021 / Vol. 12986
Keywords
Semi-supervised learning; Class imbalance; Misinformation detection;
DOI
10.1007/978-3-030-88942-5_5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The spread of misinformation on social media has become a prevalent societal problem and is the cause of many kinds of social unrest. Curtailing it is of great importance, and machine learning has shown significant promise. However, applying machine learning to this problem raises two main challenges. First, although misinformation is far too prevalent in one respect, it actually represents only a small proportion of all the postings seen on social media. Second, labeling the massive amount of data needed to train a useful classifier is impractical. To address these challenges, we propose a simple semi-supervised learning framework for dealing with extreme class imbalance that has the advantage, over other approaches, of using actual rather than simulated data to inflate the minority class. We tested our framework on two sets of Covid-related Twitter data and obtained significant improvement in F1-measure in extremely imbalanced scenarios, compared with simple classical and deep-learning data generation methods such as SMOTE, ADASYN, or GAN-based data generation.
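The abstract does not say how unlabeled posts are chosen to inflate the minority class, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes posts are already embedded as numeric vectors and uses a hypothetical rule: an unlabeled vector is added to the minority class when its best cosine similarity to a labeled minority example is high and exceeds its best similarity to any majority example. The names `cosine`, `inflate_minority`, and the `threshold` value are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def inflate_minority(minority, majority, unlabeled, threshold=0.9):
    """Return the unlabeled vectors pseudo-labeled as minority.

    Hypothetical selection rule (an assumption, not taken from the paper):
    keep a vector if it is at least `threshold`-similar to some minority
    example and closer to the minority class than to the majority class.
    """
    selected = []
    for x in unlabeled:
        best_min = max(cosine(x, m) for m in minority)
        best_maj = max(cosine(x, m) for m in majority)
        if best_min >= threshold and best_min > best_maj:
            selected.append(x)
    return selected


# Toy 2-D "embeddings": one labeled minority post, three majority posts.
minority = [[1.0, 0.0]]
majority = [[0.0, 1.0], [0.1, 1.0], [0.0, 0.9]]
unlabeled = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.05]]

added = inflate_minority(minority, majority, unlabeled)
augmented_minority = minority + added  # real posts, not synthetic ones
print(len(minority), "->", len(augmented_minority))
```

Unlike SMOTE or ADASYN, which interpolate new synthetic points, a rule of this kind only ever adds real unlabeled examples to the minority class, which is the property the abstract highlights.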
Pages: 57-66
Number of pages: 10