Methodology

This document lists every model, algorithm, library, and configuration used in the juryrig project. All predictive models are scikit-learn classifiers. The project targets Python 3.10+ and is managed with uv.

Models

1. HistGradientBoostingClassifier (primary)

Histogram-based gradient boosting from scikit-learn. The model uses default hyperparameters with no custom tuning, and post-hoc sigmoid calibration is applied via CalibratedClassifierCV.

| Setting | Value |
| --- | --- |
| Hyperparameters | scikit-learn defaults |
| Calibration | sigmoid (Platt scaling) via CalibratedClassifierCV, cv=3 |
| Preprocessing | OrdinalEncoder for categorical, SimpleImputer for missing |
| Random state | 7 |

2. LogisticRegression (baseline)

Linear classifier from scikit-learn.

| Setting | Value |
| --- | --- |
| Solver | saga |
| max_iter | 2000 |
| tol | 1e-3 |
| C | 1.0 (default) |
| Preprocessing | MaxAbsScaler + OneHotEncoder, SimpleImputer |
| Random state | 7 |
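
A minimal sketch of the baseline under these settings follows. The toy data, the imputation strategies, and the placement of MaxAbsScaler after the encoder are assumptions; the source lists the components but not their order.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MaxAbsScaler, OneHotEncoder

# Hypothetical toy data; column names are borrowed from the feature list.
rng = np.random.default_rng(7)
n_rows = 300
X = pd.DataFrame({
    "County": rng.choice(["Kings", "Queens", "Erie"], size=n_rows),
    "Severity": rng.choice(["Felony", "Misdemeanor"], size=n_rows),
    "Arrest Age": rng.integers(18, 70, size=n_rows).astype(float),
})
y = rng.integers(0, 2, size=n_rows)

preprocess = ColumnTransformer([
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]), ["County", "Severity"]),
    ("num", SimpleImputer(strategy="median"), ["Arrest Age"]),
])

baseline = Pipeline([
    ("prep", preprocess),
    # MaxAbsScaler rescales each column to [-1, 1] and keeps sparse input sparse.
    ("scale", MaxAbsScaler()),
    ("clf", LogisticRegression(solver="saga", max_iter=2000, tol=1e-3,
                               C=1.0, random_state=7)),
])
baseline.fit(X, y)
baseline_proba = baseline.predict_proba(X)
```

The saga solver converges faster when features share a similar scale, which is one plausible reason to pair it with MaxAbsScaler.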

3. DummyClassifier (floor baseline)

strategy="prior": predict_proba returns the training-set class distribution for every sample, and predict returns the most frequent class. This establishes a performance floor.
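
As a small illustration of what the floor baseline returns, the sketch below fits DummyClassifier on a made-up label vector (the 70/30 split is purely hypothetical).

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical training labels: 7 positives, 3 negatives.
y_train = np.array([1] * 7 + [0] * 3)
X_train = np.zeros((10, 1))  # features are ignored by DummyClassifier

floor = DummyClassifier(strategy="prior").fit(X_train, y_train)

X_new = np.zeros((4, 1))
floor_proba = floor.predict_proba(X_new)  # every row is the class prior
floor_pred = floor.predict(X_new)         # every row is the majority class
```

Because every sample receives the same probability, this baseline cannot rank cases at all, so any useful model should beat it on every metric.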

4. Race-Association Regression Models

Separate logistic regressions used for adjusted race-association analysis (not for prediction):

| Specification | Controls | Hyperparameters |
| --- | --- | --- |
| Core-adjusted | County, charge severity, arrest type, age, gender, ethnicity | LogisticRegression(max_iter=300, C=1.0, solver="saga", random_state=7) |
| Charge-detail | Core controls + law section, charge weight, attempt flag | Same |

Both use OneHotEncoder(drop="first") for categorical encoding.

Calibration

Calibration is applied to the HistGradientBoostingClassifier only.

| Setting | Value |
| --- | --- |
| Method | Sigmoid (Platt scaling) |
| Cross-validation folds | 3 |
| Implementation | scikit-learn CalibratedClassifierCV |

Calibration evaluation uses 10 quantile-based bins (pd.qcut) to compute bin-wise average predictions vs. average outcomes.
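
The binning step can be sketched as follows. The simulated probabilities and outcomes are hypothetical, and duplicates="drop" is an added safeguard (an assumption) for cases where many predictions tie.

```python
import numpy as np
import pandas as pd

# Hypothetical predictions and outcomes, simulated so that they are calibrated.
rng = np.random.default_rng(7)
p = rng.uniform(0, 1, size=1000)
y = rng.binomial(1, p)

df = pd.DataFrame({"p": p, "y": y})

# 10 quantile-based bins: each bin holds roughly the same number of cases.
df["bin"] = pd.qcut(df["p"], q=10, labels=False, duplicates="drop")

table = df.groupby("bin").agg(
    mean_pred=("p", "mean"),      # average predicted probability in the bin
    mean_outcome=("y", "mean"),   # observed positive rate in the bin
    n=("p", "size"),
)
table["gap"] = table["mean_pred"] - table["mean_outcome"]
```

For a well-calibrated model, mean_pred and mean_outcome stay close in every bin, so the gap column hovers around zero.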

Evaluation Metrics

| Metric | Role | Definition |
| --- | --- | --- |
| Accuracy | Primary classification | Share of cases classified correctly |
| AUROC | Ranking quality | Area under the ROC curve |
| PR-AUC | Positive-class ranking | Area under the precision-recall curve |
| Brier Score | Calibration quality | Mean squared error of predicted probabilities (lower is better) |

Model selection uses the Brier score on the evaluation split. Subgroup metrics are computed by Race, Ethnicity, Gender, age_bucket, and Region.
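
A minimal sketch of how these four metrics could be computed overall and per subgroup is shown below. The helper names, the 0.5 decision threshold, and the use of average_precision_score as the PR-AUC estimate are assumptions; the source does not state them.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import (accuracy_score, average_precision_score,
                             brier_score_loss, roc_auc_score)


def evaluate(y_true, proba, threshold=0.5):
    """Return the four metrics for one set of predictions (hypothetical helper)."""
    pred = (np.asarray(proba) >= threshold).astype(int)
    return {
        "accuracy": accuracy_score(y_true, pred),
        "auroc": roc_auc_score(y_true, proba),
        "pr_auc": average_precision_score(y_true, proba),
        "brier": brier_score_loss(y_true, proba),
    }


def evaluate_by_group(df, group_col, y_col="y", p_col="p"):
    """Compute the same metrics within each level of group_col."""
    rows = {}
    for level, part in df.groupby(group_col):
        if part[y_col].nunique() < 2:
            continue  # AUROC is undefined when a subgroup has a single class
        rows[level] = evaluate(part[y_col].to_numpy(), part[p_col].to_numpy())
    return pd.DataFrame.from_dict(rows, orient="index")


# Tiny made-up example.
scored = pd.DataFrame({
    "y": [0, 0, 1, 1],
    "p": [0.1, 0.4, 0.35, 0.8],
    "Region": ["a", "b", "a", "b"],
})
overall = evaluate(scored["y"].to_numpy(), scored["p"].to_numpy())
by_region = evaluate_by_group(scored, "Region")
```

Small subgroups can produce unstable or undefined metrics, which is why the sketch skips any subgroup that contains only one outcome class.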

Feature Policy

The models use 19 features drawn from OCA-STAT arraignment-time fields. The default configuration (include_all) includes Gender, Ethnicity, and Race.

Target

| Category | Values |
| --- | --- |
| Positive | Plea, Verdict-TFG |
| Negative | Dismissed, Dism-ACD, Verdict-ACQ |
| Excluded | Pending, warrant, transfer, other, unknown states |
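
The target definition can be sketched as a simple mapping. The column name Disposition and the example rows are hypothetical; only the positive and negative disposition values come from the table above.

```python
import pandas as pd

# Disposition values taken from the target table; 1 = positive, 0 = negative.
TARGET_MAP = {
    "Plea": 1,
    "Verdict-TFG": 1,
    "Dismissed": 0,
    "Dism-ACD": 0,
    "Verdict-ACQ": 0,
}

# Hypothetical input; "Disposition" is an assumed column name.
cases = pd.DataFrame({
    "Disposition": ["Plea", "Dismissed", "Pending", "Verdict-ACQ", "Verdict-TFG"],
})

# Values absent from the mapping (pending, warrant, transfer, ...) become NaN.
labels = cases["Disposition"].map(TARGET_MAP)

# Keep only rows with a defined target and store it as an integer column.
modeled = cases.loc[labels.notna()].assign(target=labels.dropna().astype(int))
```

Treating every unmapped value as excluded keeps unexpected disposition codes out of the training data rather than silently assigning them to a class.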

Leakage-Excluded Fields

Outcome and post-arraignment fields excluded from features:

Feature List

Court Type, Region, District, County, Court, Arresting Agency, Arrest Type, Arraign Year, Arraign Month, Top Charge at Arraignment, Severity, Weight, Law Article.Section, Attempt Flag, Gender, Ethnicity, Race, Arrest Age

Data Sources

| Dataset | Source | Grain | Coverage |
| --- | --- | --- | --- |
| OCA-STAT | NY Office of Court Administration | One defendant-docket | 2021–2025, monthly refresh |
| DCJS/OCA Supplemental Pretrial | NY Division of Criminal Justice Services / OCA | One criminal cycle | 2019–2024 (Oct 2025 release) |

Both datasets are public. Raw CSVs and processed outputs are not committed to the repository; only aggregate analysis results are published.

Libraries

| Library | Purpose |
| --- | --- |
| scikit-learn | All classifiers, calibration, preprocessing, metrics |
| pandas | Data manipulation and feature engineering |
| numpy | Numerical operations |
| pyarrow | Parquet data storage |
| matplotlib | Figure generation (SVG output) |

Model Card

| Field | Value |
| --- | --- |
| Snapshot date | 2026-03-07 |
| Run ID | 20260307_224037 |
| Modeled rows | 1,609,252 |
| Test rows | 271,701 |
| Feature count | 19 |
| Maturity threshold | ≥ 90% disposed share by cohort |
| Best model | HistGradientBoostingClassifier |
| Best AUROC | 0.8644 |
| Best Brier | 0.1496 |