Since research is needed to identify optimal scenarios for algorithm use in assessment development, the authors compared regression models (logistic, boosted, and penalized) to more advanced techniques (neural networks, support vector machines, random forests, and K-nearest neighbors), while also introducing ‘stacking’, a method that combines algorithms to create an optimized model.
Using a multi-state sample of 258,464 youth assessments, the authors varied prediction scenarios by sample size and base rate. Although performance generally improved with greater sample size, a set of ‘top performing’ algorithms was identified. Among top performers, a ‘saturation point’ was observed, where algorithm type had little impact when samples exceeded 5,000 subjects. In an era of big data and artificial intelligence, it is tantalizing to explore new approaches. Although the authors do not discourage such exploration, their findings demonstrate that sample size trumps algorithm type. Agencies and providers should consider this finding when adopting or developing tools, as algorithms that offer transparency may also be top performers. (Publisher abstract provided)
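The ‘stacking’ approach the abstract introduces can be illustrated with a minimal scikit-learn sketch. This is not the authors’ actual pipeline: the base learners, meta-model, synthetic data, and the 20% base rate are illustrative assumptions chosen to mirror the algorithm families and the 5,000-subject scale the abstract mentions.

```python
# Minimal sketch of 'stacking' (illustrative only; the study's actual
# features, estimators, and tuning are not specified in the abstract).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for an assessment dataset: 5,000 cases, ~20% base rate.
X, y = make_classification(n_samples=5_000, n_features=20,
                           weights=[0.8], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

# Base learners drawn from the algorithm families compared in the study.
base_learners = [
    ("logit", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=25)),
]

# The stacker fits each base learner, then trains a meta-model (here a
# logistic regression) on their cross-validated predictions.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=5)
stack.fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, stack.predict_proba(X_test)[:, 1]))
```

One design note: because the meta-model in a stack is often a plain (or penalized) logistic regression over the base learners’ predictions, stacking can retain some of the transparency the abstract highlights as desirable in top-performing algorithms.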
Similar Publications
- Forensic Comparison and Matching of Fingerprints: Using Quantitative Image Measures for Estimating Error Rates Through Understanding and Predicting Difficulty
- A Newly Developed AI-Assisted Tool for the Collection of Cranial Landmark Data
- Training and Technical Assistance Increase the Fidelity of Implementation of a Universal Prevention Initiative in Rural Schools: Results from a 3-Year Cluster-Randomized Trial