FAQ

Common questions about the prediction models, fairness analysis, and data.

The models predict a binary rearrest outcome at 1, 2, and 3 years after parole release. Each model outputs a probability (0–1), not a categorical risk label. Rearrest is an observable justice-system event, not a comprehensive measure of behavior.

The Brier score is a proper scoring rule that measures how close predicted probabilities are to actual outcomes. For binary outcomes it ranges from 0 to 1, and lower is better. Unlike accuracy, it rewards both correct ranking and well-calibrated probabilities: a model that says “30% risk” should see about 30% of those individuals rearrested.
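For reference, the standard definition of the Brier score for binary outcomes can be written as below. The symbols are notation chosen here for illustration, not taken from the project: N is the number of individuals, p_i is the predicted rearrest probability for individual i, and y_i is 1 if that individual was rearrested and 0 otherwise.

```latex
\mathrm{BS} = \frac{1}{N} \sum_{i=1}^{N} (p_i - y_i)^2
```

Under this definition, a prediction of 0.30 contributes (0.30 - 1)^2 = 0.49 if the person is rearrested and (0.30 - 0)^2 = 0.09 if not.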

Accuracy depends on a threshold and treats all errors equally. Brier score evaluates the full probability output. A model can have high accuracy but terrible calibration (e.g., predicting 0.01 for everyone in a 30% base-rate population is 70% “accurate” at a 0.5 threshold, yet the probabilities are useless).
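A minimal sketch of this contrast, using made-up data rather than the project's: 100 synthetic individuals with a 30% base rate, scored by a constant near-zero prediction and by a constant base-rate prediction. The function names and the 0.5 threshold are assumptions for illustration.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def accuracy(probs, outcomes, threshold=0.5):
    """Share of correct labels after thresholding the probabilities."""
    predictions = [1 if p >= threshold else 0 for p in probs]
    return sum(pred == y for pred, y in zip(predictions, outcomes)) / len(outcomes)


# Synthetic population: 30 rearrested, 70 not (illustrative only).
outcomes = [1] * 30 + [0] * 70

overconfident = [0.01] * 100  # says "almost no risk" for everyone
base_rate = [0.30] * 100      # says "30% risk" for everyone

for name, probs in [("overconfident", overconfident), ("base_rate", base_rate)]:
    print(name, "accuracy:", round(accuracy(probs, outcomes), 4),
          "brier:", round(brier_score(probs, outcomes), 4))
```

Both constant predictors reach the same accuracy at this threshold, but the Brier score separates them (about 0.294 versus 0.21), because only the second matches the observed rearrest rate.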

Y1 = rearrest within 1 year (N=18,028). Y2 = rearrest in year 2, conditional on NOT being rearrested in year 1 (N=12,651). Y3 = rearrest in year 3, conditional on no prior rearrest (N=9,398). This conditioning prevents label leakage.
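The conditioning can be sketched as follows. This is an illustrative reconstruction, not the project's actual preprocessing: the function name and the `first_rearrest_year` input (1, 2, 3, or `None` for no rearrest within three years) are hypothetical.

```python
def conditional_labels(first_rearrest_year):
    """Return (y1, y2, y3) for one person.

    None means the person is excluded from that year's cohort because
    they were already rearrested in an earlier year.
    """
    y1 = 1 if first_rearrest_year == 1 else 0
    # Y2 is defined only for people not rearrested in year 1.
    y2 = None if y1 == 1 else (1 if first_rearrest_year == 2 else 0)
    # Y3 is defined only for people with no rearrest in years 1 or 2.
    y3 = None if (y1 == 1 or y2 == 1) else (1 if first_rearrest_year == 3 else 0)
    return y1, y2, y3


# Hypothetical examples covering each case.
for year in (1, 2, 3, None):
    print(year, conditional_labels(year))
```

Dropping earlier rearrests from the later cohorts is what makes each cohort smaller than the one before it.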

Small: 0.001–0.006 Brier points better than logistic regression. The gain is real but modest. Most of the predictive signal is captured by linear models.

Both variants are tested. Removing race from training does not meaningfully change performance or fairness gaps because other features carry correlated signal.

Fairness gaps are differences in error rates (FPR, FNR) between demographic subgroups at a given threshold. These gaps change with the threshold, and there is no single threshold that minimizes all gaps simultaneously.
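A small sketch of how such gaps might be computed and how they move with the threshold. The two groups, their scores, and the function names are invented for illustration and do not reflect the project's data or code.

```python
def error_rates(probs, outcomes, threshold):
    """Return (FPR, FNR) for one subgroup at the given threshold."""
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, outcomes) if p < threshold and y == 1)
    negatives = outcomes.count(0)
    positives = outcomes.count(1)
    fpr = fp / negatives if negatives else float("nan")
    fnr = fn / positives if positives else float("nan")
    return fpr, fnr


def gaps(group_a, group_b, threshold):
    """Absolute FPR and FNR differences between two (probs, outcomes) groups."""
    fpr_a, fnr_a = error_rates(*group_a, threshold)
    fpr_b, fnr_b = error_rates(*group_b, threshold)
    return abs(fpr_a - fpr_b), abs(fnr_a - fnr_b)


# Toy subgroups: (predicted probabilities, observed outcomes).
group_a = ([0.2, 0.4, 0.6, 0.8], [0, 0, 1, 1])
group_b = ([0.1, 0.3, 0.35, 0.7], [0, 1, 0, 1])

for threshold in (0.3, 0.5):
    print(threshold, gaps(group_a, group_b, threshold))
```

In this toy data both gaps are zero at a threshold of 0.3, while at 0.5 the FNR gap opens to 0.5, which is the threshold dependence described above.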

Gang affiliation (SHAP 0.163), which is associated with a +19.1 percentage point increase in the Y1 rearrest rate. Young age (18–22) and supervision risk score are also strong predictors.

No. This is a research prototype. It has not been validated for operational use in any justice system. Labels are rearrest outcomes from a single jurisdiction and time period.

Yes, MIT License. Raw data is not redistributed; it is available through the NIJ Challenge portal.