Finding warning markers: Leveraging natural language processing and machine learning technologies to detect risk of school violence

Cited by: 19
Authors
Ni, Yizhao [1 ,2 ]
Barzman, Drew [2 ,3 ]
Bachtel, Alycia [3 ]
Griffey, Marcus [3 ]
Osborn, Alexander [3 ]
Sorter, Michael [2 ,3 ]
Affiliations
[1] Cincinnati Childrens Hosp Med Ctr, Div Biomed Informat, 3333 Burnet Avenue, Cincinnati, OH 45229 USA
[2] Univ Cincinnati, Coll Med, Dept Pediat, Cincinnati, OH USA
[3] Cincinnati Childrens Hosp Med Ctr, Div Child & Adolescent Psychiat, Cincinnati, OH 45229 USA
Funding
Agency for Healthcare Research and Quality (USA); National Institutes of Health (USA);
Keywords
Automated risk assessment; School violence; Machine learning; Natural language processing; ADOLESCENTS BRACHA; CHILDREN; BEHAVIOR; RELIABILITY; AGGRESSION;
DOI
10.1016/j.ijmedinf.2020.104137
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Introduction: School violence has a far-reaching effect, impacting the entire school population including staff, students and their families. Among youth attending the most violent schools, studies have reported higher dropout rates, poor school attendance, and poor scholastic achievement. It was noted that the largest crime-prevention results occurred when youth at elevated risk were given an individualized prevention program. However, much work is needed to establish an effective approach to identify at-risk subjects.

Objective: In our earlier research, we developed a risk assessment program to interview subjects, identify risk and protective factors, and evaluate risk for school violence. This study focused on developing natural language processing (NLP) and machine learning technologies to automate the risk assessment process.

Material and methods: We prospectively recruited 131 students with or without behavioral concerns from 89 schools between 05/01/2015 and 04/30/2018. The subjects were interviewed with two risk assessment scales and a questionnaire, and their risk of violence was determined by pediatric psychiatrists based on clinical judgment. Using NLP technologies, different types of linguistic features were extracted from the interview content. Machine learning classifiers were then applied to predict risk of school violence for individual subjects. A two-stage feature selection was implemented to identify violence-related predictors. The performance was validated on the psychiatrist-generated reference standard of risk levels, where positive predictive value (PPV), sensitivity (SEN), negative predictive value (NPV), specificity (SPEC) and area under the ROC curve (AUC) were assessed.

Results: Compared to subjects' sociodemographic information, use of linguistic features significantly improved classifiers' predictive performance (P < 0.01). The best-performing classifier with n-gram features achieved 86.5 %/86.5 %/85.7 %/85.7 %/94.0 % (PPV/SEN/NPV/SPEC/AUC) on the cross-validation set and 83.3 %/93.8 %/91.7 %/78.6 %/94.6 % (PPV/SEN/NPV/SPEC/AUC) on the test data. The feature selection process identified a set of predictors covering the discussion of subjects' thoughts, perspectives, behaviors, individual characteristics, peers and family dynamics, and protective factors.

Conclusions: By analyzing the content from subject interviews, the NLP and machine learning algorithms showed good capacity for detecting risk of school violence. The feature selection uncovered multiple warning markers that could deliver useful clinical insights to assist in personalizing interventions. Consequently, the developed approach offered the promise of an accurate and scalable computerized screening service for preventing school violence.
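For readers less familiar with the four count-based measures named in the abstract, the sketch below shows how PPV, SEN, NPV and SPEC follow from a binary confusion matrix. The function name `screening_metrics` and the counts are hypothetical: the counts were picked only so that the ratios land on the reported cross-validation percentages, and they are not taken from the paper. AUC is left out because it needs per-subject classifier scores, which the abstract does not provide.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return PPV, SEN, NPV and SPEC (as fractions) from confusion-matrix counts."""
    return {
        "PPV": tp / (tp + fp),   # positive predictive value: flagged subjects who are truly at risk
        "SEN": tp / (tp + fn),   # sensitivity: at-risk subjects who were flagged
        "NPV": tn / (tn + fn),   # negative predictive value: unflagged subjects who are truly low risk
        "SPEC": tn / (tn + fp),  # specificity: low-risk subjects who were not flagged
    }


# Hypothetical counts for illustration only (NOT the study's confusion matrix).
counts = {"tp": 32, "fp": 5, "fn": 5, "tn": 30}
metrics = screening_metrics(**counts)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these illustrative counts, PPV and SEN are both 32/37 and NPV and SPEC are both 30/35, which shows how a classifier can report identical values for a pair of measures when false positives and false negatives are equal in number.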
Pages: 9
Related papers
51 in total
  • [1] Abramson N, 2006, Pattern recognition and machine learning, V103, P886, DOI 10.1117/1.2819119
  • [2] STATISTICS NOTES - DIAGNOSTIC-TESTS-1 - SENSITIVITY AND SPECIFICITY .3.
    ALTMAN, DG
    BLAND, JM
    [J]. BRITISH MEDICAL JOURNAL, 1994, 308 (6943) : 1552 - 1552
  • [3] DIAGNOSTIC-TESTS-2 - PREDICTIVE VALUES .4.
    ALTMAN, DG
    BLAND, JM
    [J]. BRITISH MEDICAL JOURNAL, 1994, 309 (6947) : 102 - 102
  • [4] [Anonymous], 2015, LIWC 2015 operators manual
  • [5] [Anonymous], 2004, KERNEL METHODS PATTE
  • [6] Automated Risk Assessment for School Violence: a Pilot Study
    Barzman, Drew
    Ni, Yizhao
    Griffey, Marcus
    Bachtel, Alycia
    Lin, Kenneth
    Jackson, Hannah
    Sorter, Michael
    DelBello, Melissa
    [J]. PSYCHIATRIC QUARTERLY, 2018, 89 (04) : 817 - 828
  • [7] Barzman D, 2012, J AM ACAD PSYCHIATRY, V40, P374
  • [8] A Pilot Study on Developing a Standardized and Sensitive School Violence Risk Assessment with Manual Annotation
    Barzman, Drew H.
    Ni, Yizhao
    Griffey, Marcus
    Patel, Bianca
    Warren, Ashaki
    Latessa, Edward
    Sorter, Michael
    [J]. PSYCHIATRIC QUARTERLY, 2017, 88 (03) : 447 - 457
  • [9] Barzman DH, 2011, J AM ACAD PSYCHIATRY, V39, P170
  • [10] Bernes K.B., 2007, Professional School Counseling, V10, P419, DOI 10.5330/prsc.10.4.e43404402j07480u