Analysis of child development facts and myths using text mining techniques and classification models

Cited by: 0
Authors
Tajrian, Mehedi [1]
Rahman, Azizur [1]
Kabir, Muhammad Ashad [1]
Islam, Md Rafiqul [1]
Affiliations
[1] Charles Sturt Univ, Sch Comp Math & Engn, Bathurst, NSW 2795, Australia
Keywords
Misinformation; Myth; Text mining; Machine learning; Deep learning; Conspiracy theories; Belief; Maltreatment; Science; Abuse; Rumor
DOI
10.1016/j.heliyon.2024.e36652
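Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract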
The rapid dissemination of misinformation on the internet complicates the decision-making process for individuals seeking reliable information, particularly parents researching child development topics. This misinformation can lead to adverse consequences, such as inappropriate treatment of children based on myths. While previous research has utilized text-mining techniques to predict child abuse cases, there has been a gap in the analysis of child development myths and facts. This study addresses this gap by applying text mining techniques and classification models to distinguish between myths and facts about child development, leveraging newly gathered data from publicly available websites. The research methodology involved several stages. First, text mining techniques were employed to pre-process the data, ensuring enhanced accuracy. Subsequently, the structured data was analysed using six robust Machine Learning (ML) classifiers and one Deep Learning (DL) model, with two feature extraction techniques applied to assess their performance across three different training-testing splits. To ensure the reliability of the results, cross-validation was performed using both k-fold and leave-one-out methods. Among the classification models tested, Logistic Regression (LR) demonstrated the highest accuracy, achieving 90% accuracy with the Bag-of-Words (BoW) feature extraction technique. LR also stands out for its speed and efficiency, with a low testing time per statement (0.97 μs). These findings suggest that LR, when combined with BoW, is effective in accurately classifying child development information, thus providing a valuable tool for combating misinformation and assisting parents in making informed decisions.
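The pipeline named in the abstract (Bag-of-Words features, a Logistic Regression classifier, and k-fold or leave-one-out cross-validation) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the tokenizer, the gradient-descent settings (`epochs`, `lr`), the interleaved fold assignment, and the eight toy statements with their labels are all assumptions invented for illustration, and the record does not describe the study's actual pre-processing or hyperparameters.

```python
import math
from collections import Counter

# Toy placeholder statements (NOT the study's data); 1 = myth, 0 = fact.
TEXTS = [
    "sugar always makes children hyperactive",
    "reading with children supports language development",
    "babies always need walkers to learn walking",
    "sleep supports healthy brain development",
    "children never feel stress",
    "play supports social development",
    "teething always causes high fever",
    "responsive care supports emotional development",
]
LABELS = [1, 0, 1, 0, 1, 0, 1, 0]

def tokenize(text):
    """Lowercase the text and split on anything that is not a letter."""
    return "".join(c if c.isalpha() else " " for c in text.lower()).split()

def build_vocab(texts):
    """Map each distinct token to a column index (sorted for determinism)."""
    return {tok: i for i, tok in enumerate(sorted({t for s in texts for t in tokenize(s)}))}

def bow_vector(text, vocab):
    """Bag-of-Words: raw token counts; tokens outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for tok, n in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(n)
    return vec

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_lr(X, y, epochs=300, lr=0.5):
    """Fit logistic regression with plain batch gradient descent (assumed settings)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict(w, b, x):
    """Return 1 (myth) when the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

def cross_validate(texts, labels, k):
    """k-fold accuracy with interleaved folds; k == len(texts) is leave-one-out."""
    n, correct = len(texts), 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train = [i for i in range(n) if i not in test_idx]
        vocab = build_vocab([texts[i] for i in train])  # vocabulary from training fold only
        w, b = train_lr([bow_vector(texts[i], vocab) for i in train],
                        [labels[i] for i in train])
        correct += sum(predict(w, b, bow_vector(texts[i], vocab)) == labels[i]
                       for i in test_idx)
    return correct / n

if __name__ == "__main__":
    print("4-fold accuracy:", cross_validate(TEXTS, LABELS, k=4))
    print("leave-one-out accuracy:", cross_validate(TEXTS, LABELS, k=len(TEXTS)))
```

The vocabulary is rebuilt inside each fold so that held-out statements never influence the features, which is the point of cross-validating a text classifier. Accuracy on eight toy sentences says nothing about the 90% figure reported in the abstract.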
Pages: 17