Making UCR Data Useful and Accessible

February 2004
Michael D. Maltz; Joseph Targonski
This paper reports on how the Federal Bureau of Investigation’s (FBI) Uniform Crime Reporting (UCR) data were cleaned, how gaps in the data were accounted for, and how the data were made more accessible by combining the years 1977 through 2000 into a single dataset.
FBI Uniform Crime Reporting data are widely used in decisions regarding the allocation of Federal funds and in varied types of research. However, UCR data can be problematic: gaps in the data affect the accuracy of analyses that rely on them. The project originally had three goals: (1) to clean and account for the gaps in the UCR data, and to annotate the data so that the type of problem each gap represents is indicated; (2) to make the data more accessible by combining data from 1977 through 2000; and (3) to develop and test methods of imputing the missing data. This paper addresses the first two goals, which proved so time consuming that the third was not attempted. The result is a set of 51 Excel files, 50 of which are State files containing annotated crime and ancillary information.
Following an introduction in section 1, section 2 describes the data sources used to prepare the State UCR files, which were obtained primarily through the National Archives of Criminal Justice Data (NACJD). Section 3 describes the software used for preparing the crime data; SPSS statistical software was used to organize the data in preparation for export to Excel. Section 4 discusses the missing value codes used in the analysis of the UCR data, which are large negative numbers. Section 5 describes the procedures used to clean, analyze, and plot the data: after obtaining and aggregating the annual crime data files, the authors converted the files from SPSS to Excel and then used exploratory data analysis techniques to identify gaps and anomalies in the data. Finally, section 6 lists additional tasks that should be undertaken to further enhance the monthly data series of Crime in the United States, including adding the ability to copy and plot the data onto another spreadsheet and developing a way to allocate statewide crime data to counties.
Appendices include information on SPSS syntax for creating Excel files and the Microsoft Excel Visual Basic macros used to clean the UCR crime data. Tables, figures, and appendices.
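The abstract notes that missing values are coded as large negative numbers. One plausible reason is that real crime counts are never negative, so such a code cannot be mistaken for data. The following is a minimal sketch of that idea, not the authors' implementation: the specific code values, their meanings, and the function names are all hypothetical, chosen only for illustration.

```python
# Hypothetical sentinel codes: the values and meanings below are
# illustrative assumptions, not the codes used in the report.
MISSING_CODES = {
    -999_999: "agency did not report this month",
    -999_998: "counts reported through another agency",
}


def is_missing(value):
    """Return True when a monthly count is a sentinel code, not a real count."""
    return value in MISSING_CODES


def find_gaps(monthly_counts):
    """Return (first_index, last_index, code) for each run of one sentinel code."""
    gaps = []
    start = None  # index where the current run of missing months began
    for i, value in enumerate(monthly_counts):
        if is_missing(value):
            if start is None:
                start = i
            elif value != monthly_counts[start]:
                # A different code begins: close the previous run first.
                gaps.append((start, i - 1, monthly_counts[start]))
                start = i
        elif start is not None:
            gaps.append((start, i - 1, monthly_counts[start]))
            start = None
    if start is not None:
        gaps.append((start, len(monthly_counts) - 1, monthly_counts[start]))
    return gaps


def total_reported(monthly_counts):
    """Sum only real counts so sentinel codes never leak into a total."""
    return sum(v for v in monthly_counts if not is_missing(v))


# Made-up monthly series for one agency, for demonstration only.
series = [12, 15, -999_999, -999_999, 9, -999_998, 11]
gaps = find_gaps(series)
total = total_reported(series)
```

Under these assumptions, each gap carries the code that explains it, which mirrors the annotation goal described above, and a code that slips into an unfiltered sum produces an obviously wrong negative total instead of a quietly biased one.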

