Collaborative learning of semi-supervised clustering and classification for labeling uncurated data

NCJ Number

305932

Date Published

2020

Author(s)

Sara Mousavi; Dylan Lee; Tatianna Griffin; Dawnie Steadman; Audris Moc

Length

5 pages

Annotation

The authors report on their design and implementation of the Plud system, which provides an iterative semi-supervised workflow to minimize the effort spent by an expert and, because it does not make any assumption about its input, can handle realistic large collections of images regardless of their size and type.

Abstract

Domain-specific image collections present potential value in various areas of science and business but are often neither curated nor equipped with any way to readily extract relevant content. To employ contemporary supervised image analysis methods on such image data, the images must first be cleaned and organized, and then manually labeled with the nomenclature employed in the specific domain, which is a time-consuming and expensive endeavor. Plud is an iterative sequence of unsupervised clustering, human assistance, and supervised classification. With each iteration 1) the labeled dataset grows, 2) the generality of the classification method and its accuracy increase, and 3) manual effort is reduced. We evaluated the effectiveness of our system by applying it to over a million images documenting human decomposition. In the authors’ experiment comparing manual labeling with labeling conducted with the support of Plud, they found that Plud reduces the time needed to label data and produces highly accurate models for this new domain. (Published abstract provided)
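The cluster, review, and classify cycle described in the abstract can be illustrated with a minimal sketch. This is not Plud's implementation: the abstract does not specify its algorithms, so the sketch assumes a tiny k-means for the clustering step, a simulated expert who names one cluster per round and rejects items that do not fit, and a nearest-centroid classifier. The 2-D toy points, the class names "a", "b", and "c", and all function names are hypothetical stand-ins for image features and domain labels.

```python
import random
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def centroid(pts):
    """Mean of a non-empty list of 2-D points."""
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

def kmeans(pts, k, rng, iters=10):
    """Tiny k-means (assumed clustering step); returns a cluster index per point."""
    k = min(k, len(pts))
    centers = rng.sample(pts, k)
    assign = [0] * len(pts)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in pts]
        for c in range(k):
            members = [p for p, a in zip(pts, assign) if a == c]
            if members:
                centers[c] = centroid(members)
    return assign

def run_loop(points, truth, k=3, rounds=5, seed=0):
    """Iterate clustering, simulated expert review, and classification."""
    rng = random.Random(seed)
    pool = list(range(len(points)))  # indices still unlabeled
    labeled = {}                     # index -> label confirmed by the expert
    history = []                     # (labeled count, accuracy on pool) per round
    for _ in range(rounds):
        if not pool:
            break
        # 1) Unsupervised step: cluster the pool and pick the largest cluster.
        assign = kmeans([points[i] for i in pool], k, rng)
        biggest = Counter(assign).most_common(1)[0][0]
        cluster = [i for i, a in zip(pool, assign) if a == biggest]
        # 2) Human step (simulated with ground truth): the expert names the
        #    cluster once; items that do not fit stay in the pool.
        name = Counter(truth[i] for i in cluster).most_common(1)[0][0]
        for i in cluster:
            if truth[i] == name:
                labeled[i] = name
        pool = [i for i in pool if i not in labeled]
        # 3) Supervised step: nearest-centroid classifier on the labeled set,
        #    scored on the remaining pool (possible only because this is toy data).
        by_label = {}
        for i, lab in labeled.items():
            by_label.setdefault(lab, []).append(points[i])
        centers = {lab: centroid(ps) for lab, ps in by_label.items()}
        preds = {i: min(centers, key=lambda lab: dist2(points[i], centers[lab]))
                 for i in pool}
        acc = sum(preds[i] == truth[i] for i in pool) / len(pool) if pool else 1.0
        history.append((len(labeled), acc))
    return labeled, history

def make_toy_data(seed=1, per_class=30):
    """Three Gaussian blobs standing in for image feature vectors."""
    rng = random.Random(seed)
    blobs = {"a": (0.0, 0.0), "b": (10.0, 0.0), "c": (0.0, 10.0)}
    points, truth = [], []
    for lab, (cx, cy) in blobs.items():
        for _ in range(per_class):
            points.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            truth.append(lab)
    return points, truth

points, truth = make_toy_data()
labeled, history = run_loop(points, truth)
```

In this sketch the expert performs one naming action per round rather than one per image, and the labeled set grows every round, which mirrors points 1 and 3 of the abstract. How Plud itself feeds classifier output back into the next round is not stated in this record, so the classifier here is only retrained and scored.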

Date Published: January 1, 2020
