Expert Algorithm for Substance Identification Using Mass Spectrometry: Application to the Identification of Cocaine on Different Instruments Using Binary Classification Models

NCJ Number: 308801
Journal: Journal of the American Society for Mass Spectrometry, Volume: 34, Issue: 7, Dated: 2023, Pages: 1235-1247
Date Published: 2023
Length: 8 pages
Annotation

This article describes how general linear modeling (GLM) of a selection of the most abundant normalized fragment ion abundances of replicate mass spectra can be used in conjunction with binary classifiers to enable specific and selective identifications with reportable error rates of spectra from other laboratories.

Abstract

This is the second of two manuscripts describing how general linear modeling (GLM) of a selection of the most abundant normalized fragment ion abundances of replicate mass spectra from one laboratory can be used in conjunction with binary classifiers to enable specific and selective identifications, with reportable error rates, of spectra from other laboratories. Here, the proof-of-concept for the Expert Algorithm for Substance Identification (EASI) uses a training set of 128 replicate cocaine spectra from one crime laboratory as the basis for the GLMs. GLMs for the 20 most abundant fragments of cocaine were then applied to 175 additional test/validation cocaine spectra collected in more than a dozen crime laboratories and to 716 known negative spectra, which included 10 spectra of three diastereomers of cocaine. Spectral similarity and dissimilarity between the measured and predicted abundances were assessed using a variety of conventional measures, including the mean absolute residual and NIST’s spectral similarity score. For each spectral measure, GLM predictions were compared to the traditional exemplar approach, which used the average of the cocaine training set as the consensus spectrum for comparisons. In unsupervised models, EASI provided a true positive rate better than 95 percent for cocaine with a false positive rate of 0 percent. A supervised binary logistic regression model provided 100 percent accuracy and no errors using EASI-predicted abundances of only four peaks, at m/z 152, 198, 272, and 303. Regardless of the measure of spectral similarity, error rates for identifications using EASI were superior to those of the traditional exemplar/consensus approach. As a supervised binary classifier, EASI was more reliable than using Mahalanobis distances. (Published Abstract Provided)
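The comparison between GLM predictions and the exemplar approach can be illustrated with a minimal sketch. The abstract does not give the exact model form, so the code below makes an assumption: each fragment's normalized abundance is modeled by ordinary least squares as a linear function of the other fragments' abundances in the same spectrum. The data are synthetic, and the function names (`fit_models`, `predict`, `mean_absolute_residual`), the five-fragment layout, and the latent "instrument" factor are all hypothetical, chosen only to show how a mean absolute residual might be computed against a predicted spectrum and against a consensus spectrum.

```python
import numpy as np


def fit_models(train):
    """Fit one least-squares model per fragment (assumed model form).

    train: array (n_spectra, n_fragments) of normalized abundances.
    Returns a list of coefficient vectors; entry j predicts fragment j
    from an intercept plus all other fragments.
    """
    n, k = train.shape
    models = []
    for j in range(k):
        X = np.column_stack([np.ones(n), np.delete(train, j, axis=1)])
        beta, *_ = np.linalg.lstsq(X, train[:, j], rcond=None)
        models.append(beta)
    return models


def predict(models, spectrum):
    """Predict every fragment of one spectrum from its other fragments."""
    pred = np.empty(spectrum.shape[0])
    for j, beta in enumerate(models):
        x = np.concatenate([[1.0], np.delete(spectrum, j)])
        pred[j] = x @ beta
    return pred


def mean_absolute_residual(measured, predicted):
    """Mean absolute difference between measured and predicted abundances."""
    return float(np.mean(np.abs(measured - predicted)))


# Synthetic stand-in data (NOT real spectra): abundances shift linearly
# with a hypothetical latent instrument factor t, plus small noise.
rng = np.random.default_rng(0)
base = np.array([100.0, 60.0, 40.0, 25.0, 10.0])
slope = np.array([0.0, 8.0, -6.0, 4.0, -2.0])


def simulate(t, n):
    """Draw n synthetic spectra at latent factor values t (scalar or array)."""
    t = np.broadcast_to(np.asarray(t, dtype=float), (n,))
    return base + np.outer(t, slope) + rng.normal(0.0, 0.3, size=(n, base.size))


train = simulate(rng.uniform(-1.0, 1.0, size=128), 128)  # one "laboratory"
test_spectrum = simulate(2.0, 1)[0]                      # a different "instrument"

models = fit_models(train)
pred = predict(models, test_spectrum)
exemplar = train.mean(axis=0)  # consensus spectrum of the training set

glm_mar = mean_absolute_residual(test_spectrum, pred)
exemplar_mar = mean_absolute_residual(test_spectrum, exemplar)
print(f"GLM residual: {glm_mar:.2f}  exemplar residual: {exemplar_mar:.2f}")
```

In this toy setting the per-fragment models follow the correlated shifts in abundance, so the residual against the predicted spectrum is smaller than the residual against the fixed consensus spectrum. A threshold on such a residual, or a logistic regression on selected predicted abundances, could then serve as the binary classifier; neither step is shown here.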

Date Published: January 1, 2023