
Mapping the risk terrain for crime using machine learning

Author

Listed:
  • Wheeler, Andrew Palmer

    (University of Texas at Dallas)

  • Steenbeek, Wouter

Abstract

Objectives: We illustrate how a machine learning algorithm, Random Forests, can provide accurate long-term predictions of crime at micro places relative to other popular techniques. We also show how recent advances in model summaries can help open the ‘black box’ of Random Forests, considerably improving their interpretability.

Methods: We generate long-term crime forecasts for robberies in Dallas at 200 by 200 feet grid cells, allowing the associations of crime generators and demographic factors with crime to vary spatially across the study area. We then show how interpretable model summaries facilitate understanding of the model’s inner workings.

Results: We find that Random Forests greatly outperform Risk Terrain Models and Kernel Density Estimation in forecasting future crimes across different measures of predictive accuracy, but only slightly outperform forecasts based on prior counts of crime. We also find that the factors that predict crime are highly non-linear and vary over space.

Conclusions: We show how black-box machine learning models can provide accurate micro place-based crime predictions, yet still be interpreted in a manner that fosters understanding of why a place is predicted to be risky.

Data and code to replicate the results can be downloaded from https://www.dropbox.com/sh/b3n9a6z5xw14rd6/AAAjqnoMVKjzNQnWP9eu7M1ra?dl=0
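
The abstract compares forecasts using "different measures of predictive accuracy" without naming them. As a rough illustration only, the sketch below computes one measure that is common in the crime-forecasting literature, the Predictive Accuracy Index (PAI): the share of future crimes captured in the flagged grid cells divided by the share of cells flagged. Whether the paper uses this exact measure is an assumption, and the function name, the toy risk scores, and the toy crime counts are all hypothetical.

```python
def predictive_accuracy_index(predicted, observed, top_fraction):
    """PAI = (share of crimes in flagged cells) / (share of cells flagged).

    predicted    -- forecast risk score per grid cell (hypothetical values)
    observed     -- future crime count per grid cell, same order as predicted
    top_fraction -- fraction of cells to flag as high risk, in (0, 1]
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    n_cells = len(predicted)
    n_flagged = max(1, round(n_cells * top_fraction))

    # Rank cells from highest to lowest predicted risk and flag the top ones.
    ranked = sorted(range(n_cells), key=lambda i: predicted[i], reverse=True)
    flagged = ranked[:n_flagged]

    total_crimes = sum(observed)
    if total_crimes == 0:
        return 0.0  # no crimes occurred, so nothing could be captured

    hit_rate = sum(observed[i] for i in flagged) / total_crimes
    area_share = n_flagged / n_cells
    return hit_rate / area_share


# Toy example: 10 grid cells, flag the riskiest 20% of them.
model_scores = [0.9, 0.1, 0.8, 0.2, 0.05, 0.3, 0.0, 0.4, 0.15, 0.6]
prior_counts = [4, 0, 1, 0, 0, 2, 0, 0, 0, 1]   # baseline: last period's counts
future_crimes = [5, 0, 3, 0, 0, 1, 0, 0, 0, 1]

pai_model = predictive_accuracy_index(model_scores, future_crimes, 0.2)
pai_prior = predictive_accuracy_index(prior_counts, future_crimes, 0.2)
print(pai_model, pai_prior)
```

Ranking cells by prior counts through the same function gives the kind of simple baseline the Results describe, so the two scores can be compared on equal footing.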

Suggested Citation

  • Wheeler, Andrew Palmer & Steenbeek, Wouter, 2020. "Mapping the risk terrain for crime using machine learning," SocArXiv xc538, Center for Open Science.
  • Handle: RePEc:osf:socarx:xc538
    DOI: 10.31219/osf.io/xc538

    Download full text from publisher

    File URL: https://osf.io/download/5e21b040edceab008782df53/
    Download Restriction: no


    References listed on IDEAS

    1. David Wheeler & Lance Waller, 2009. "Comparing spatially varying coefficient models: a case study examining violent crime rates and their relationships to alcohol outlets and illegal drug arrests," Journal of Geographical Systems, Springer, vol. 11(1), pages 1-22, March.
    2. Alex Chohlas-Wood & E. S. Levine, 2019. "A Recommendation Engine to Aid in Identifying Crime Patterns," Interfaces, INFORMS, vol. 49(2), pages 154-166, March.
    3. Wright, Marvin N. & Ziegler, Andreas, 2017. "ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 77(i01).

    Citations


    Cited by:

    1. Rummens, Anneleen & Hardyns, Wim, 2021. "The effect of spatiotemporal resolution on predictive policing model performance," International Journal of Forecasting, Elsevier, vol. 37(1), pages 125-133.
    2. Wheeler, Andrew Palmer & Reuter, Sydney, 2020. "Redrawing hot spots of crime in Dallas, Texas," SocArXiv nmq8r, Center for Open Science.
    3. Guido de Blasio & Alessio D'Ignazio & Marco Letta, 2020. "Predicting Corruption Crimes with Machine Learning. A Study for the Italian Municipalities," Working Papers 16/20, Sapienza University of Rome, DISS.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mariana Oliveira & Luís Torgo & Vítor Santos Costa, 2021. "Evaluation Procedures for Forecasting with Spatiotemporal Data," Mathematics, MDPI, vol. 9(6), pages 1-27, March.
    2. Arjan S. Gosal & Janine A. McMahon & Katharine M. Bowgen & Catherine H. Hoppe & Guy Ziv, 2021. "Identifying and Mapping Groups of Protected Area Visitors by Environmental Awareness," Land, MDPI, vol. 10(6), pages 1-14, May.
    3. Albert Stuart Reece & Gary Kenneth Hulse, 2022. "European Epidemiological Patterns of Cannabis- and Substance-Related Congenital Neurological Anomalies: Geospatiotemporal and Causal Inferential Study," IJERPH, MDPI, vol. 20(1), pages 1-35, December.
    4. Michael Parzinger & Lucia Hanfstaengl & Ferdinand Sigg & Uli Spindler & Ulrich Wellisch & Markus Wirnsberger, 2020. "Residual Analysis of Predictive Modelling Data for Automated Fault Detection in Building’s Heating, Ventilation and Air Conditioning Systems," Sustainability, MDPI, vol. 12(17), pages 1-18, August.
    5. Van Belle, Jente & Guns, Tias & Verbeke, Wouter, 2021. "Using shared sell-through data to forecast wholesaler demand in multi-echelon supply chains," European Journal of Operational Research, Elsevier, vol. 288(2), pages 466-479.
    6. Albert Stuart Reece & Gary Kenneth Hulse, 2022. "European Epidemiological Patterns of Cannabis- and Substance-Related Body Wall Congenital Anomalies: Geospatiotemporal and Causal Inferential Study," IJERPH, MDPI, vol. 19(15), pages 1-38, July.
    7. Philipp Bach & Victor Chernozhukov & Malte S. Kurz & Martin Spindler & Sven Klaassen, 2021. "DoubleML -- An Object-Oriented Implementation of Double Machine Learning in R," Papers 2103.09603, arXiv.org, revised Feb 2024.
    8. Marchetto, Elisa & Da Re, Daniele & Tordoni, Enrico & Bazzichetto, Manuele & Zannini, Piero & Celebrin, Simone & Chieffallo, Ludovico & Malavasi, Marco & Rocchini, Duccio, 2023. "Testing the effect of sample prevalence and sampling methods on probability- and favourability-based SDMs," Ecological Modelling, Elsevier, vol. 477(C).
    9. Jorge Luis Andrade & José Luis Valencia, 2022. "A Fuzzy Random Survival Forest for Predicting Lapses in Insurance Portfolios Containing Imprecise Data," Mathematics, MDPI, vol. 11(1), pages 1-16, December.
    10. Eeva-Katri Kumpula & Pauline Norris & Adam C Pomerleau, 2020. "Stocks of paracetamol products stored in urban New Zealand households: A cross-sectional study," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-11, June.
    11. Michael Bucker & Gero Szepannek & Alicja Gosiewska & Przemyslaw Biecek, 2020. "Transparency, Auditability and eXplainability of Machine Learning Models in Credit Scoring," Papers 2009.13384, arXiv.org.
    12. Jian Lu & Raheel Ahmad & Thomas Nguyen & Jeffrey Cifello & Humza Hemani & Jiangyuan Li & Jinguo Chen & Siyi Li & Jing Wang & Achouak Achour & Joseph Chen & Meagan Colie & Ana Lustig & Christopher Dunn, 2022. "Heterogeneity and transcriptome changes of human CD8+ T cells across nine decades of life," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    13. Timo Schulte & Tillmann Wurz & Oliver Groene & Sabine Bohnet-Joschko, 2023. "Big Data Analytics to Reduce Preventable Hospitalizations—Using Real-World Data to Predict Ambulatory Care-Sensitive Conditions," IJERPH, MDPI, vol. 20(6), pages 1-16, March.
    14. Fogliato Riccardo & Oliveira Natalia L. & Yurko Ronald, 2021. "TRAP: a predictive framework for the Assessment of Performance in Trail Running," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 17(2), pages 129-143, June.
    15. Edward J Gregr & Dana R Haggarty & Sarah C Davies & Cole Fields & Joanne Lessard, 2021. "Comprehensive marine substrate classification applied to Canada’s Pacific shelf," PLOS ONE, Public Library of Science, vol. 16(10), pages 1-28, October.
    16. Roman Hornung, 2020. "Ordinal Forests," Journal of Classification, Springer;The Classification Society, vol. 37(1), pages 4-17, April.
    17. Lyubchich, Vyacheslav & Woodland, Ryan J., 2019. "Using isotope composition and other node attributes to predict edges in fish trophic networks," Statistics & Probability Letters, Elsevier, vol. 144(C), pages 63-68.
    18. Marc Deffland & Claudia Spies & Bjoern Weiss & Niklas Keller & Mirjam Jenny & Jochen Kruppa & Felix Balzer, 2020. "Effects of pain, sedation and delirium monitoring on clinical and economic outcome: A retrospective study," PLOS ONE, Public Library of Science, vol. 15(9), pages 1-14, September.
    19. Preston Thomas Sorenson & Jeremy Kiss & Angela Bedard-Haughn, 2024. "A Proposed Methodology for Determining the Economically Optimal Number of Sample Points for Carbon Stock Estimation in the Canadian Prairies," Land, MDPI, vol. 13(1), pages 1-16, January.
    20. Victor Martínez‐de‐Albéniz & Arnau Planas & Stefano Nasini, 2020. "Using Clickstream Data to Improve Flash Sales Effectiveness," Production and Operations Management, Production and Operations Management Society, vol. 29(11), pages 2508-2531, November.

