Objective: To develop and validate algorithms for predicting 30-day fatal and nonfatal opioid-related overdose using statewide data sources including prescription drug monitoring program data, Hospital Discharge Data System data, and Tennessee (TN) vital records. Current overdose prevention efforts in TN rely on descriptive and retrospective analyses without prognostication.
Materials and Methods: Study data included 3 041 668 TN patients with 71 479 191 controlled substance prescriptions from 2012 to 2017. Statewide data and socioeconomic indicators were used to train, ensemble, and calibrate 10 nonparametric "weak learner" models. Validation was performed using area under the receiver operating curve (AUROC), area under the precision recall curve, risk concentration, and Spiegelhalter z-test statistic. Results: Within 30 days, 2574 fatal overdoses occurred after 4912 prescriptions (0.0069%) and 8455 nonfatal overdoses occurred after 19 460 prescriptions (0.027%). Discrimination and calibration improved after ensembling (AUROC: 0.79-0.83; Spiegelhalter P value: 0-.12). Risk concentration captured 47-52% of cases in the top quantiles of predicted probabilities. Discussion: Partitioning and ensembling enabled all study data to be used given computational limits and helped mediate case imbalance. Predicting risk at the prescription level can aggregate risk to the patient, provider, pharmacy, county, and regional levels. Implementing these models into Tennessee Department of Health systems might enable more granular risk quantification. Prospective validation with more recent data is needed. Conclusion: Predicting opioid-related overdose risk at statewide scales remains difficult and models like these, which required a partnership between an academic institution and state health agency to develop, may complement traditional epidemiological methods of risk identification and inform public health decisions.