This document presents an unsupervised, analytical workflow for clustering a large collection of forensic images. The workflow applies classic clustering to deep feature representations of the images, combined with domain-related data, to group the images together.
The authors of this brief document address the problem of efficiently curating large collections of forensic images, which could improve the quality of research in many domains. For their project, the authors’ main sources of images were forensic anthropology centers and crime scenes; the dataset, collected at the University of Tennessee’s Anthropology Research Center, contains one million images gathered over eight years. The authors’ purpose was to develop an unsupervised, analytical workflow that groups such images by applying classic clustering to deep feature representations of the images together with domain-related data. The authors present the workflow and discuss its development and methodology. They note that the model, pre-trained on ImageNet, produces a 2048-length feature vector for each image, which is then reduced to 256 dimensions via principal component analysis (PCA). They also created vectors for weather, geographic, and other external data and combined them with the image feature representations in order to cluster a large temporal forensic dataset in an unsupervised manner. The authors’ findings show that adding weather features increased clustering precision to 89%, up from 64% for the initial approach.
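The workflow summarized above (deep features, PCA reduction to 256 dimensions, concatenation with external data, clustering) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors’ implementation: random numbers stand in for the 2048-length image features, the four weather columns are hypothetical, k-means stands in for the unspecified "classic clustering", and the number of clusters is arbitrary.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project rows of `features` onto their top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def standardize(x):
    """Scale each column to zero mean and unit variance."""
    std = x.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero on constant columns
    return (x - x.mean(axis=0)) / std

def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one integer cluster label per row."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every row to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels

rng = np.random.default_rng(42)
n_images = 600  # tiny stand-in for the real collection

# Assumption: random vectors in place of the pre-trained model's outputs.
deep_features = rng.normal(size=(n_images, 2048))
# Assumption: four hypothetical weather columns per image.
weather = rng.normal(size=(n_images, 4))

reduced = pca_reduce(deep_features, 256)                # 2048 -> 256
combined = np.hstack([reduced, standardize(weather)])   # append external data
labels = kmeans(combined, k=5)                          # k chosen arbitrarily
```

How the image and weather blocks are weighted against each other is a design choice that this summary does not describe; the sketch standardizes only the weather columns. On real data, library implementations such as scikit-learn’s `PCA` and `KMeans` would likely replace the hand-written helpers.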
Similar Publications
- Application of X-ray photoelectron spectroscopy to examine surface chemistry of cancellous bone and medullary contents to refine bone sample selection for nuclear DNA analysis
- Step toward Roadside Sensing: Noninvasive Detection of a THC Metabolite from the Sweat Content of Fingerprints
- Plant Seed Species Identification from Chemical Fingerprints: A High-Throughput Application of Direct Analysis in Real Time Mass Spectrometry