Random Forest Processing of Direct Analysis in Real-Time Mass Spectrometric Data Enables Species Identification of Psychoactive Plants From Their Headspace Chemical Signatures

NCJ Number
Journal: ACS Omega Volume: 4 Issue: 13 Dated: 2019 Pages: 15636-15644
Length: 9 pages

The United Nations Office on Drugs and Crime has designated several "legal highs" as "plants of concern" because of the dangers associated with their increasing recreational abuse. The current study demonstrated that several of these products have unique but consistent headspace chemical profiles, and that multivariate statistical processing of their chemical signatures can be used to accurately identify the species of plants from which the materials are derived.


Because the routine identification of these products is hampered by the difficulty of distinguishing them from innocuous plant materials such as foods, herbs, and spices, the headspace volatiles of several species were analyzed in the current study by direct analysis in real-time high-resolution mass spectrometry (DART-HRMS). The species were Althaea officinalis, Calea zacatechichi, Cannabis indica, Cannabis sativa, Echinopsis pachanoi, Lactuca virosa, Leonotis leonurus, Mimosa hostilis, Mitragyna speciosa, Ocimum basilicum, Origanum vulgare, Piper methysticum, Salvia divinorum, Turnera diffusa, and Voacanga africana. The DART-HRMS results revealed intraspecies similarities and interspecies differences. Exploratory statistical analysis of the data using principal component analysis (PCA) and global t-distributed stochastic neighbor embedding showed clustering of like species and separation of different species. This led to the use of a supervised random forest (RF) classifier, which yielded a model with 99 percent accuracy. A conformal predictor based on the RF classifier was created and proved to be valid at a significance level of 8 percent, with an efficiency of 0.1, an observed fuzziness of 0, and an error rate of 0. The variables used in the statistical analysis were ranked by their ability to enable clustering and discrimination between species, using PCA variable importance of projection scores and RF variable importance indices. The highest-ranked variables were then identified as m/z values consistent with molecules previously identified in plant material. This technique provides proof of concept for the creation of a database for the detection and identification of plant-based legal highs through headspace analysis. (publisher abstract modified)
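To make the conformal-prediction terms in the abstract more concrete, the following is a minimal sketch of an inductive conformal predictor layered on top of class probabilities such as those a random forest might output. It is not the authors' implementation. The nonconformity score (1 minus the probability of the candidate label), the function names, and all probability values are assumptions chosen for illustration; the numbers are toy values, not data from the study. The sketch reports the error rate (true label missing from the prediction set), the observed fuzziness (mean sum of p-values over the false labels), and the mean prediction-set size.

```python
from typing import Dict, List, Sequence, Set, Tuple

Probs = Dict[str, float]          # label -> classifier probability (e.g. RF vote share)
Example = Tuple[Probs, str]       # (class probabilities, true label)


def nonconformity(probs: Probs, label: str) -> float:
    """Assumed score: the less probable the label, the stranger the example."""
    return 1.0 - probs.get(label, 0.0)


def p_values(calib_scores: Sequence[float], probs: Probs) -> Dict[str, float]:
    """p-value per candidate label: share of calibration scores at least as strange."""
    n = len(calib_scores)
    result = {}
    for label in probs:
        score = nonconformity(probs, label)
        at_least = sum(1 for s in calib_scores if s >= score)
        result[label] = (at_least + 1) / (n + 1)
    return result


def prediction_set(pvals: Dict[str, float], significance: float) -> Set[str]:
    """Keep every label whose p-value exceeds the significance level."""
    return {label for label, p in pvals.items() if p > significance}


def evaluate(calib: List[Example], test: List[Example], significance: float) -> Dict[str, float]:
    """Error rate, observed fuzziness, and mean set size on the test examples."""
    calib_scores = [nonconformity(probs, y) for probs, y in calib]
    errors, fuzziness, set_sizes = 0, 0.0, 0
    for probs, true_label in test:
        pvals = p_values(calib_scores, probs)
        region = prediction_set(pvals, significance)
        errors += true_label not in region
        fuzziness += sum(p for label, p in pvals.items() if label != true_label)
        set_sizes += len(region)
    k = len(test)
    return {
        "error_rate": errors / k,
        "observed_fuzziness": fuzziness / k,
        "mean_set_size": set_sizes / k,
    }


# Toy, made-up probabilities for two of the species named in the abstract.
A, B = "Cannabis sativa", "Ocimum basilicum"
CALIB: List[Example] = [
    ({A: 0.9, B: 0.1}, A),
    ({A: 0.2, B: 0.8}, B),
    ({A: 0.7, B: 0.3}, A),
    ({A: 0.4, B: 0.6}, B),
]
TEST: List[Example] = [
    ({A: 0.95, B: 0.05}, A),
    ({A: 0.15, B: 0.85}, B),
]

if __name__ == "__main__":
    # With only 4 calibration examples the smallest possible p-value is 1/5,
    # so the significance level must exceed 0.2 to exclude any label.
    print(evaluate(CALIB, TEST, significance=0.25))
```

A predictor is called valid at a given significance level when its long-run error rate does not exceed that level; with a realistically sized calibration set, a level such as the 8 percent quoted above could be passed as `significance=0.08`.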

Date Published: January 1, 2019