By Robert Pantzer
The movie Moneyball is based on the true story of Oakland Athletics general manager Billy Beane (played by Brad Pitt). In 2002, the season the film covers, the A’s were seen as mediocre and lacked the deep pockets of some of their rivals. Beane hired Peter Brand, a young Yale economics graduate with radical ideas about how to assess players’ value. Rather than relying on the scouts’ experience and intuition, Brand used algorithms and machine learning (ML). This let him pick up players with undervalued skills whom other teams had discarded because of age, injuries, or other reasons. The strategy paid off: the A’s won 20 consecutive games, at the time the longest winning streak in American League history, and other teams began to adopt the new method.
Similar machine learning shows promise in other areas, including domestic violence (DV), a challenge we face regionally and globally. According to the World Health Organization, on average 30% of the female population in Latin America and the Caribbean (LAC) is affected by domestic violence. Twelve women are killed every day just because they belong to “the wrong gender.” Half of all DV cases occur in households where children under the age of 12 are present. DV imposes a terrible burden on the victim and her children, and increases the likelihood that violence is perpetuated from one generation to the next.
While police departments have made progress in reducing street crime, domestic violence remains stubbornly high. Even as experts debate the exact causes of the decline in street crime, many believe the availability of big data and information management platforms has played an important role in identifying crime hotspots, focusing police attention, and increasing the impact of law enforcement resources.
When it comes to DV, these tools have had their limits. This form of violence typically takes place behind closed doors. Police departments and social service providers have developed other intervention strategies such as emergency hotlines, community alarms, and mandatory arrests, but these tactics are more reactive than proactive, and they have a mixed record of success in reducing future violence.
In a world of limited resources, the Moneyball question then is: how can the police target scarce resources to reduce DV most effectively? How do you build a data-driven approach that concentrates resources on those who are most at risk of DV? This means being able to predict the most serious types of repeat domestic violence, which is not an easy task. While many victims are potentially at risk, only a handful will suffer the most extreme types of harm. Identifying those victims before something bad happens is like finding a needle in a haystack.
Data scientists are trying to do just that. They have begun applying machine learning to important social problems like domestic violence. Used by Silicon Valley giants like Google and Facebook, machine learning is a statistical technology that is optimized to search for detailed patterns in large data sets. These patterns are often subtle and are therefore difficult for human decision-makers — even experts — to identify.
Back to the Moneyball analogy. Peter Brand first fed data to the machine and trained it to calculate which players were best suited to which game, similar to what can be done to forecast domestic violence. In both cases, machine learning is basically an algorithm that is told: ‘Determine which descriptions of individuals are strongly associated with the outcome.’ The machine is “fed” historical data and “taught” how to filter it. For instance, arraignments in a specific period would be matched with inputs such as age, prior domestic violence, zip code, and weapons used.
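To make the idea of "descriptions associated with the outcome" concrete, here is a minimal sketch in Python. It is not the method used by any police department or research group. The records are invented for illustration, and the field names (`prior_dv`, `weapon_used`, `repeat_injury`) and the helper `outcome_rate_by_feature` are hypothetical. It simply compares how often the outcome appears for each value of one input.

```python
from collections import defaultdict

# Invented toy records; field names are hypothetical, not a real schema.
RECORDS = [
    {"prior_dv": True,  "weapon_used": True,  "repeat_injury": True},
    {"prior_dv": True,  "weapon_used": False, "repeat_injury": True},
    {"prior_dv": True,  "weapon_used": True,  "repeat_injury": True},
    {"prior_dv": True,  "weapon_used": False, "repeat_injury": False},
    {"prior_dv": False, "weapon_used": True,  "repeat_injury": False},
    {"prior_dv": False, "weapon_used": False, "repeat_injury": False},
    {"prior_dv": False, "weapon_used": False, "repeat_injury": False},
    {"prior_dv": False, "weapon_used": False, "repeat_injury": True},
]


def outcome_rate_by_feature(records, feature, outcome="repeat_injury"):
    """Share of records with the outcome, grouped by each value of one feature."""
    counts = defaultdict(lambda: [0, 0])  # feature value -> [outcomes, total]
    for rec in records:
        tally = counts[rec[feature]]
        tally[0] += rec[outcome]  # True counts as 1, False as 0
        tally[1] += 1
    return {value: hits / total for value, (hits, total) in counts.items()}


prior_rates = outcome_rate_by_feature(RECORDS, "prior_dv")
weapon_rates = outcome_rate_by_feature(RECORDS, "weapon_used")
print(prior_rates)
print(weapon_rates)
```

A real system would weigh many inputs at once and learn their combinations from far larger data sets. This one-feature-at-a-time comparison only shows the basic question the algorithm is asked.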
So far, the evidence shows that machines can provide decision-makers with insights that would have been otherwise impossible to generate. This is especially true when predicting instances of DV that will result in injuries or attempts to inflict injury.
At a recent presentation here at the IDB, researchers at the Crime Lab from the University of Chicago said that by scanning millions of police records, machine learning can assist decision-makers in identifying the victims who are, in fact, the riskiest and, in so doing, better target scarce resources. The great promise of machine learning is that governments can save lives at reduced costs — simply by becoming smarter with resource allocation. Machine learning has three technical advantages over other methods of analysis:
- ML can interpret more features of a large data set than traditional social science approaches, and more than any one human would be able to do alone.
- It is designed to learn from information, such as the text of a police officer’s report, that would not be possible to analyze using traditional methods.
- It uses cross-validation to guard against “overfitting” the available data, that is, building an overly complex model that works well with an available data set but would likely perform poorly in the real world.
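The last point, cross-validation, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the data are invented (one numeric input, a yes/no outcome, and four deliberately flipped labels as noise), and the two toy models, `fit_threshold` and `fit_memorizer`, are hypothetical stand-ins for a simple and an overly flexible model.

```python
# Invented toy data: the true pattern is "y = 1 when x >= 20",
# with four labels flipped on purpose to act as noise.
NOISY = {3, 11, 27, 34}
DATA = [(x, int(x >= 20) ^ int(x in NOISY)) for x in range(40)]


def fit_threshold(train):
    """Simple model: pick the cutoff that best separates the training labels."""
    candidates = sorted({x for x, _ in train})
    best_t = max(candidates, key=lambda t: sum((x >= t) == bool(y) for x, y in train))
    return lambda x: 1 if x >= best_t else 0


def fit_memorizer(train):
    """Overly flexible model: copy the label of the closest training example."""
    def predict(x):
        nearest = min(train, key=lambda row: abs(row[0] - x))
        return nearest[1]
    return predict


def accuracy(model, rows):
    return sum(model(x) == y for x, y in rows) / len(rows)


def cross_val_accuracy(fit, data, k=5):
    """Average accuracy on held-out folds (a row's fold is its index modulo k)."""
    scores = []
    for fold in range(k):
        test = [row for i, row in enumerate(data) if i % k == fold]
        train = [row for i, row in enumerate(data) if i % k != fold]
        scores.append(accuracy(fit(train), test))
    return sum(scores) / k


memorizer_train = accuracy(fit_memorizer(DATA), DATA)    # scored on data it has seen
memorizer_cv = cross_val_accuracy(fit_memorizer, DATA)   # scored on held-out data
threshold_cv = cross_val_accuracy(fit_threshold, DATA)
print(memorizer_train, memorizer_cv, threshold_cv)
```

On this toy data the memorizing model is perfect on records it has already seen but does worse than the simple cutoff rule on held-out records. That gap is what cross-validation is meant to expose before a model is trusted in the real world.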
We know it is not enough to predict crime. Social service providers and police officers must be able to do something about it. Already, some police departments around the world are using strategies involving victim engagement. A law enforcement officer or a social worker attempts to develop a relationship with a victim, with the intention of supporting her and ensuring her continued safety.
This strategy is promising. ML can offer decision-makers a way to allocate resources as efficiently as possible. Picking the right people for an intervention is the key to maximizing its effectiveness.
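As a rough sketch of what allocating scarce resources by predicted risk might look like, the Python snippet below ranks cases by a risk score and keeps only as many as there is capacity to follow up. The case IDs, the scores, and the helper `prioritize` are all invented for this example. How the scores are produced is outside its scope.

```python
import heapq


def prioritize(risk_scores, capacity):
    """Return the `capacity` case IDs with the highest predicted risk, highest first."""
    return heapq.nlargest(capacity, risk_scores, key=risk_scores.get)


# Hypothetical case IDs and model scores, invented for illustration.
scores = {"case-A": 0.12, "case-B": 0.81, "case-C": 0.47, "case-D": 0.93, "case-E": 0.05}

# Suppose there is capacity to engage with only two victims this week.
shortlist = prioritize(scores, capacity=2)
print(shortlist)
```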
Many industries, including retail and sports, have gained by leveraging machine learning. It is now time to make sure we use these technologies to reduce crime and violence, especially violence that occurs behind closed doors.
Robert Pantzer is a specialist in modernization of the state at the IDB.
Photo credits (Flickr CC): Keyboard by J_D_L; Ben Taylor San Jose Giants Using Gameday by Intel Free Press