Use Case: Predict Who May Be Involved in Gun Violence with DataRobot
In this article, we’ll explain how to predict which suspects are most at risk of violence, future convictions, or both.
What’s the problem?
The city of Chicago is facing high levels of gun violence, with a marked surge recorded since 2015. To prioritize its resources, the police department is testing strategies to help curb the problem. Using an algorithm it has been refining since 2012, the city scores people on their probability of being involved in a shooting incident, either as a victim or as an offender. The city can then use an individual’s score as an indicator of need for social services and as a potential investigative tool for police.
The challenge and solution
The algorithm’s scores make up the Strategic Subject List (SSL). The SSL rates each individual with a score between 0 and 500, with 500 indicating the highest risk of involvement in a shooting. While the city has not published how it calculates the scores, it has made a version of the list available (without names). The challenge for all involved is to identify the critical factors that drive a high SSL score so that intervention can address them. With DataRobot, it’s possible to develop a model based on historical examples of people’s scores and their backgrounds. The model can then be used to predict scores for individuals and to provide insight into the factors that affect SSL scores.
Tammy is a social worker for at-risk adults. Some of her clients have been visited by the police because of their high SSL scores. Tammy would like to understand the factors driving the scores so that she can intervene and develop strategies to lower them. Because Tammy (like anyone else outside of the Chicago Police Department (CPD) analytics squad) does not have access to the formula used to create the score, she needs another way to derive indicators from the data. Tammy wants to build a machine learning model that predicts scores for her clients and explains the factors affecting those scores. She uses DataRobot to build a model that is both accurate and able to justify each predicted SSL score. In this example, you will use data provided by the City of Chicago to identify high-risk individuals.
Training data: https://s3.amazonaws.com/datarobot-use-case-datasets/Strategic_Subject_List_20k_training.csv
Prediction data: https://s3.amazonaws.com/datarobot-use-case-datasets/Strategic_Subject_List_20k_testing.csv
Data Dictionary: https://s3.amazonaws.com/datarobot-use-case-datasets/Strategic+Suspect+List+-+Data+Dictionary.pdf
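To make the idea of "finding the factors that drive a score" more concrete, here is a minimal Python sketch. It does not use DataRobot and does not read the files above. It builds a small synthetic table and ranks each column by the absolute value of its Pearson correlation with the score, which is a much cruder measure than the feature impact a fitted model would give. The column names (`age`, `prior_arrests`, `unrelated`, `score`), the helper functions, and the formula that generates the synthetic scores are all invented for illustration. They are not taken from the SSL dataset or from the city's algorithm.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, target, features):
    """Return (feature, correlation) pairs, strongest absolute correlation first."""
    scores = [row[target] for row in rows]
    ranked = [(name, pearson([row[name] for row in rows], scores)) for name in features]
    return sorted(ranked, key=lambda pair: abs(pair[1]), reverse=True)


def make_synthetic_rows(n, seed=0):
    """Build fake records. Columns and the score formula are assumptions, not SSL data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.randint(18, 70)
        prior_arrests = rng.randint(0, 10)
        unrelated = rng.random()
        # Invented relationship: lower age and more arrests raise the score.
        raw = 400 - 4 * age + 15 * prior_arrests + rng.gauss(0, 10)
        rows.append({
            "age": age,
            "prior_arrests": prior_arrests,
            "unrelated": unrelated,
            "score": max(0, min(500, raw)),  # keep scores in the 0 to 500 range
        })
    return rows


rows = make_synthetic_rows(2000)
ranking = rank_features(rows, "score", ["age", "prior_arrests", "unrelated"])
for name, r in ranking:
    print(f"{name:>14}: {r:+.2f}")
```

A correlation ranking like this only captures one-variable, linear relationships. A model trained on the real training file can also pick up interactions and nonlinear effects, which is why the walkthrough relies on a fitted model rather than a simple ranking.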