This paper addresses the challenges of retaining and analyzing large image collections by examining an unlabeled forensic dataset of over 1M human decomposition photos.
In many domains, large image collections are a key means of retaining and analyzing information about relevant phenomena, yet it remains challenging to use such data in research and practice. The authors investigate this problem in the context of an unlabeled forensic dataset of over 1M human decomposition photos. To make this collection usable by experts, body parts first need to be identified and traced through their evolution, despite their distinct appearances at different stages of decay from "fresh" to "skeletonized". The authors developed an unsupervised image-clustering technique that builds sequences of similar images representing the evolution of each body part through the stages of decomposition. Evaluation on 34,476 human decomposition images shows that the method significantly outperforms the state-of-the-art clustering method in this application. (Published abstract provided)