NCJRS Virtual Library

Criminal justice forecasts of risk: A machine learning approach

Author: Richard Berk

This book analyzes developments in machine learning and nonparametric function estimation procedures for risk assessment and forecasting to inform criminal justice decisions, with an emphasis on the statistical and computer science tools that can dramatically improve such forecasts in criminal justice settings.


The book's target audience is researchers in the social sciences and data analysts in criminal justice agencies. It examines machine learning and nonparametric function estimation procedures that can be used effectively in forecasting. It considers how the availability of large administrative databases, inexpensive computing power, and developments in statistics and computer science have increased the accuracy and applicability of forecasting, and it emphasizes the statistical and computer science tools, under the rubric of supervised learning, that can dramatically improve these forecasts in criminal justice settings.

The book is divided into eight chapters. Chapter one, the introduction, sets the stage. Chapter two provides important background material, including policy, data, and statistical considerations. Chapter three serves as a conceptual introduction to classification and forecasting. Chapter four provides a more formal treatment of classification and forecasting, with discussion of data generation models, notation, classification, estimation in the real world, and the joint probability model. Chapter five discusses tree-based forecasting methods, with sections on splitting the data, building the costs of classification errors, classification tables, ensembles of trees (random forests), tree-based alternatives to random forests, and why ensembles of classification trees work so well. Chapter six presents two examples, one simplified and one more complex. Chapter seven covers implementation, with sections on the demonstration phase, hardware and software, data preparation for forecasting, personnel, and public relations. Chapter eight provides concluding observations about actuarial justice, noting current trends and larger issues, and data structure and new science.