New York Criminal Court Analysis

Two independent analyses of public New York State court records: a machine-learning model that estimates conviction likelihood from 1.6 million arraignment records, and a before-and-after comparison of pretrial release rates around bail-law amendments using 854,000 cases.

1.6M Arraignment Records
0.86 Best Model AUROC
854K Pretrial Cases
61 Counties Analyzed

Two Research Branches

Branch 1: Conviction Prediction

Using arraignment-time information alone — county, charge severity, arrest type, demographics — how well can a model estimate whether a case ends in conviction?

Key finding: Geography dominates. NYC courts have a conviction rate of about 25%; courts outside NYC, about 69%. The best model reaches 79% accuracy and an AUROC of 0.86.

Explore findings

Branch 2: Pretrial Release Impact

Did pretrial release rates change differently for charge categories targeted by New York's May 2022 and June 2023 bail-law amendments?

Key finding: Firearm charges showed the largest shift. Their release rates dropped 20–32 percentage points (pp) more than the baseline after May 2022.
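The "pp more than baseline" comparison can be read as a difference between two before-and-after changes. Below is a minimal sketch of that arithmetic, assuming that "baseline" means charge categories the amendment did not target. The function names and the release rates are hypothetical and chosen only for illustration; they are not results from this analysis.

```python
def pp_change(rate_before: float, rate_after: float) -> float:
    """Change in a release rate, expressed in percentage points."""
    return (rate_after - rate_before) * 100


def excess_change(target_before, target_after, base_before, base_after):
    """Change for the targeted category minus the change for the baseline."""
    return pp_change(target_before, target_after) - pp_change(base_before, base_after)


# Hypothetical release rates (illustration only, not taken from the analysis):
# targeted category falls from 60% to 35%, baseline falls from 80% to 78%.
excess = excess_change(0.60, 0.35, 0.80, 0.78)
print(f"{excess:.1f} pp")  # -23.0 pp
```

A negative value means the targeted category's release rate fell further than the baseline's over the same period.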

Explore pretrial impact

At a Glance

Model Performance Comparison

Three classifiers evaluated on 271,701 held-out test cases

Higher is better for AUROC and PR-AUC; lower is better for Brier score. The dummy baseline predicts from the training-set class distribution.
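To make the direction of these metrics concrete, here is a small pure-Python sketch of AUROC and Brier score on toy data (PR-AUC is omitted for brevity). It assumes the dummy baseline predicts the training-set positive rate for every case, which is one common reading of "training-set class distribution"; the project's actual baseline may differ. All data and names below are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, for illustration only
y_train = [0, 1, 1, 0, 1, 0, 0, 1]
y_test = [1, 0, 1, 0]
model_prob = [0.9, 0.2, 0.6, 0.4]

# Assumed dummy baseline: the training-set positive rate for every test case
base_rate = sum(y_train) / len(y_train)
dummy_prob = [base_rate] * len(y_test)

print("model:", auroc(y_test, model_prob), round(brier_score(y_test, model_prob), 4))
print("dummy:", auroc(y_test, dummy_prob), round(brier_score(y_test, dummy_prob), 4))
```

A constant prediction cannot rank cases, so the dummy's AUROC is 0.5 by construction; its Brier score gives the level a useful model has to beat.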

Monthly Arraignment Volume

OCA-STAT records by year-month cohort, 2021–2025

Strongest Patterns

Geography

NYC courts: ~25% conviction rate. Non-NYC courts: ~69%. This is the single strongest predictor in the dataset, dwarfing all other factors.

Charge Severity

Felony cases: ~66% conviction rate. Violations: ~25%. Charge severity is the second-strongest signal after geography.

County Variation

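Per-county model AUROC ranges from 0.52 (Schoharie) to 0.77 (Monroe), so the model's ability to rank cases varies substantially across the 61 New York counties analyzed. Both values sit below the pooled AUROC of 0.86, which is consistent with geography carrying much of the pooled signal.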
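One reason per-county AUROC can sit well below the pooled 0.86 is that a pooled score also rewards separating counties from one another. The toy sketch below illustrates this effect with a hypothetical model that scores every case by its county's conviction rate: it cannot rank cases within a county, yet its pooled AUROC is still well above 0.5. The counties, outcomes, and scores are invented for illustration.

```python
from collections import defaultdict


def auroc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical records: (county, outcome, score). The score is just the
# county's conviction rate, so it carries no within-county information.
records = (
    [("A", 0, 0.25)] * 3 + [("A", 1, 0.25)]
    + [("B", 1, 0.75)] * 3 + [("B", 0, 0.75)]
)

pooled = auroc([y for _, y, _ in records], [s for _, _, s in records])

by_county = defaultdict(list)
for county, outcome, score in records:
    by_county[county].append((outcome, score))

per_county = {
    county: auroc([y for y, _ in rows], [s for _, s in rows])
    for county, rows in by_county.items()
}

print(pooled)      # 0.75
print(per_county)  # {'A': 0.5, 'B': 0.5}
```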

Explore the Analysis

Race-Adjusted Association

Three independent methods examine whether race correlates with conviction after controlling for case characteristics. The adjusted Black–White gap is about 2.6 percentage points under all three methods.
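"Controlling for case characteristics" can take several forms, and this page does not say which three methods were used. As a generic illustration only (not necessarily one of them), a stratified comparison computes the gap inside each stratum of comparable cases and then averages those gaps, weighting by stratum size. The strata, group labels, and counts below are hypothetical.

```python
# Hypothetical counts: stratum -> group -> (convictions, cases)
strata = {
    "felony": {"group_a": (60, 100), "group_b": (116, 200)},
    "misdemeanor": {"group_a": (90, 300), "group_b": (28, 100)},
}


def raw_gap(strata, a, b):
    """Unadjusted difference in conviction rates, in percentage points."""
    def pooled_rate(group):
        convictions = sum(groups[group][0] for groups in strata.values())
        cases = sum(groups[group][1] for groups in strata.values())
        return convictions / cases
    return (pooled_rate(a) - pooled_rate(b)) * 100


def adjusted_gap(strata, a, b):
    """Size-weighted average of within-stratum gaps, in percentage points."""
    total = sum(n for groups in strata.values() for _, n in groups.values())
    gap = 0.0
    for groups in strata.values():
        rate_a = groups[a][0] / groups[a][1]
        rate_b = groups[b][0] / groups[b][1]
        weight = sum(n for _, n in groups.values()) / total
        gap += weight * (rate_a - rate_b)
    return gap * 100


print(round(raw_gap(strata, "group_a", "group_b"), 1))       # -10.5
print(round(adjusted_gap(strata, "group_a", "group_b"), 1))  # 2.0
```

In this toy data the raw and adjusted gaps differ in sign because the two groups have different mixes of charge severity, which is why an adjusted figure is reported rather than a raw one.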

View race analysis

Methodology

Scikit-learn classifiers with post-hoc calibration. 19 arraignment-time features. Full model specifications, metrics, and library versions.

View methods

FAQ

What is OCA-STAT? What does the model predict? What are the limitations? Short answers to common questions.

Read FAQ

Limitations

The predictive model uses only information available at arraignment. It excludes plea negotiations, evidence strength, attorney quality, and detailed criminal history. The pretrial analysis is observational, not a controlled experiment, so it shows how release rates changed around the amendments but cannot establish that the amendments caused the change. Both branches cover New York State only, 2021–2025. Model accuracy varies by county and demographic group.