An official website of the United States government, Department of Justice.

Using Sentiment Analysis and Topic Modeling in Assessing the Impact of Police Signaling on Investigative and Prosecutorial Outcomes in Sexual Assault Reports

NCJ Number
Date Published: December 2022
147 pages

This report describes a research study that sought to identify signaling in the narratives of police officers’ rape reports that affected subsequent attrition, in order to better understand how officer reports in a rape case affect progression through the criminal justice process.


The authors present a research study that aimed to better understand whether and how responding officers’ written reports in a rape case affect case progression in the criminal justice process. The study had three aims: (1) to assess the presence and type of sentiment specific to rape in the responding officers’ incident reports and, if sentiment is detected, how it varies by the characteristics of the case, victim, and suspect; (2) to determine whether sentiment in the responding officers’ reports differs in cases with increased investigative activity, and how the phrases contained in the incident reports vary with the level of investigative activity; and (3) focusing on the most successful cases, those that proceeded to prosecution, to determine whether sentiment in the responding officers’ reports differs in those cases and how the phrases in their incident reports differ from those in unsuccessful cases that did not proceed to prosecution. Findings suggested that detected sentiment tended to skew near neutral to slightly negative and more subjective; that incident reports’ sentiment varied with investigation status; and that reports in the most successful cases were more positive and more subjective. The authors describe the findings for each of the three aims, as well as several additional products of the study: a protocol detailing the information extraction process for police reports (Appendix A); an open-source sentiment lexicon (Appendix B); a pre-trained classifier, based on statistical algorithms, that flags instances of signaling in police reports (Appendix B); a list of signals that predict less successful investigations and prosecutions (Appendix B); and a training protocol for officers and detectives on how they respond to and report on rapes (Appendix C).

Date Published: December 1, 2022