Parity Mode (fp.exe vs fppy)¶
This page is the operator playbook for parity runs and scenario changes.
fp-wraptr parity uses the vendored minimal fppy execution/parity core (module name fp_py) and compares engine outputs on the PABEV.TXT contract.
By default, fp parity runs the fppy backend with fppy.eq_flags_preset: parity (EQ backfill enabled and SETUPSOLVE honored). You can override this via the scenario's fppy.eq_flags_preset.
Timeouts (fppy)¶
Full-horizon parity runs can take a while under the parity preset. If you hit status=engine_failure with an error like fppy mini-run timed out, raise the timeout in your scenario YAML:
eq_flags_preset=parity default and opt-out¶
- Default behavior for fp parity is fppy.eq_flags_preset: parity.
- The parity preset targets setupsolve-style solve semantics (including EQ backfill + iteration controls) so fppy execution matches parity expectations more closely.
- This preset is the main reason baseline parity runs can now reach exact/near-exact alignment.
- To opt out, set fppy.eq_flags_preset: default explicitly in the scenario YAML.
- Use default only when intentionally testing legacy/non-parity behavior; parity artifacts and dashboard views will still surface the active preset so results remain auditable.
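The default and opt-out rule above can be sketched as follows. This is only an illustration: it assumes the scenario YAML has already been parsed into a nested dict and that the dotted name fppy.eq_flags_preset corresponds to an fppy mapping with an eq_flags_preset key. The helper name resolve_eq_flags_preset is hypothetical and not part of fp-wraptr.

```python
DEFAULT_PRESET = "parity"  # default for fp parity, per the text above


def resolve_eq_flags_preset(scenario: dict) -> str:
    """Return the active preset, falling back to 'parity' when unset.

    Assumes a nested layout (scenario["fppy"]["eq_flags_preset"]);
    the real scenario schema may differ.
    """
    fppy_cfg = scenario.get("fppy") or {}
    return fppy_cfg.get("eq_flags_preset") or DEFAULT_PRESET


# No fppy block: the parity preset applies.
implicit = resolve_eq_flags_preset({})
# Explicit opt-out to legacy/non-parity behavior.
opted_out = resolve_eq_flags_preset({"fppy": {"eq_flags_preset": "default"}})
print(implicit, opted_out)
```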
Identity Overlay (Why Parity Can Diverge At A Boundary Quarter)¶
When fppy.eq_flags_preset=parity is active, fp-wraptr writes a wrapper deck
(work_fppy/fppy_fminput.txt) that injects a pre-SOLVE “identity overlay”
extracted from FM/fminput.txt (work_fppy/fppy_identity_overlay.txt).
This overlay exists because fp.exe runs with compiled model state; many scenario
decks omit large blocks of baseline CREATE/GENR/IDENT/LHS definitions
that still affect stored series and therefore PRINTVAR ... LOADFORMAT output.
Important details:
- The overlay now preserves SMPL ...; statements from the baseline deck so windowed CREATE/GENR blocks keep their intended scope.
- After inserting the overlay, fp-wraptr restores the scenario deck’s current SMPL window immediately before SOLVE. Without this, a baseline overlay SMPL can “leak” into the solve window and distort boundary-quarter values (for example trend-break helper series like D2, CNST2L2, TBL2), which can cascade into sign-flip hard fails.
If parity suddenly diverges at the first history/forecast boundary quarter:
- Inspect work_fppy/fppy_fminput.txt around the inserted overlay block.
- Confirm work_fppy/fppy_identity_overlay.txt includes the expected SMPL statements from the baseline deck.
- Confirm the last SMPL before the SOLVE ...; line in work_fppy/fppy_fminput.txt matches the intended solve window (often the scenario forecast_start quarter when running --quick).
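The last check in the list can be automated with a small scan. The sketch below assumes that SMPL and SOLVE statements each start on their own line in the wrapper deck; the function name last_smpl_before_solve and the sample deck text are made up for illustration and are not taken from a real FM deck.

```python
import re
from typing import Optional

_SMPL = re.compile(r"^\s*SMPL\b", re.IGNORECASE)
_SOLVE = re.compile(r"^\s*SOLVE\b", re.IGNORECASE)


def last_smpl_before_solve(deck_text: str) -> Optional[str]:
    """Return the last SMPL line seen before the first SOLVE line.

    Returns None if the deck has no SOLVE line, or no SMPL precedes it.
    """
    last_smpl = None
    for line in deck_text.splitlines():
        if _SOLVE.match(line):
            return last_smpl
        if _SMPL.match(line):
            last_smpl = line.strip()
    return None


# Illustrative deck fragment (hypothetical statements and windows).
deck = """\
SMPL 1952.1 2025.3;
GENR D2=0;
SMPL 2025.4 2029.4;
SOLVE DYNAMIC;
"""
active_window = last_smpl_before_solve(deck)
print(active_window)
```

If the returned statement shows a baseline window rather than the scenario's solve window, the overlay SMPL has probably leaked into the solve.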
pd.eval function-surface workaround (why it exists)¶
fppy evaluates FP expressions through pd.eval(..., engine="python"). In this path,
direct numpy module-attribute calls such as np.log(...) can trigger resolver errors on
some pandas/numpy stacks (for example TypeError: '>' not supported ...).
To keep parity runs deterministic, expression rewriting routes LOG/EXP/ABS through a plain
function surface (fp.log/fp.exp/fp.abs) instead of np.* names. This behavior is intentional
and regression-tested (tests/test_fppy_expressions.py).
Upstream reference: pandas-dev/pandas#58041
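The rewriting idea can be illustrated without pandas. The sketch below is not fppy's implementation: it uses a regular expression to route LOG/EXP/ABS to a plain fp namespace, and evaluates the result with Python's built-in eval on scalars instead of pd.eval on series. The names rewrite_expression and the fp namespace built here are assumptions for illustration only.

```python
import math
import re
from types import SimpleNamespace

# Hypothetical function surface: plain callables, no np.* attribute lookups.
fp = SimpleNamespace(log=math.log, exp=math.exp, abs=abs)

# Match LOG(, EXP(, ABS( as whole names (so DLOG( is left untouched).
_FUNC_RE = re.compile(r"\b(LOG|EXP|ABS)\s*\(")


def rewrite_expression(expr: str) -> str:
    """Rewrite FP-style LOG/EXP/ABS calls to fp.log/fp.exp/fp.abs."""
    return _FUNC_RE.sub(lambda m: f"fp.{m.group(1).lower()}(", expr)


rewritten = rewrite_expression("LOG(GDP) - ABS(X)")
# Evaluate with only the fp surface and the two variables in scope.
value = eval(rewritten, {"__builtins__": {}, "fp": fp}, {"GDP": 1.0, "X": -2.0})
print(rewritten, value)
```

Keeping the function surface to a small, explicit namespace is what makes the evaluated names independent of how a given pandas/numpy stack resolves module attributes.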
One-scenario parity run¶
For quick smoke checks, run with --quick (gates comparisons through the scenario forecast_start quarter):
Quick smoke with strict gate behavior:
Note: parity reads either PABEV.TXT or PACEV.TXT when present. fp-wraptr resolves both filenames automatically, so both are accepted in engine artifacts.
With bounded-drift guardrails:
Gate parity comparisons through an end period¶
Use --gate-pabev-end to compare PABEV.TXT only through a specified quarter, which is useful for fast smoke gates or focusing on near-term forecast windows.
Equivalent via run command:
uv run fp run examples/baseline.yaml --backend both --with-drift
uv run fp run examples/baseline_smoke.yaml --backend both --with-drift --parity-quick
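Conceptually, gating through an end period (as --gate-pabev-end and --quick do) means that only periods up to and including the cutoff quarter take part in the comparison. The sketch below assumes periods are labelled as year.quarter strings such as 2025.4; that label format and the helper names are assumptions, not the fp-wraptr implementation.

```python
from typing import List, Tuple


def parse_quarter(label: str) -> Tuple[int, int]:
    """Parse a 'YYYY.Q' label into a sortable (year, quarter) tuple."""
    year, quarter = label.split(".")
    return int(year), int(quarter)


def gate_periods(periods: List[str], gate_end: str) -> List[str]:
    """Keep only periods at or before the gate end quarter."""
    end = parse_quarter(gate_end)
    return [p for p in periods if parse_quarter(p) <= end]


compared = gate_periods(["2025.3", "2025.4", "2026.1", "2026.2"], "2025.4")
print(compared)
```

Comparing parsed tuples rather than raw strings or floats avoids ordering surprises if a label ever has more than one digit after the dot.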
Save a parity run as golden artifacts:
Regression-check current run against a saved golden baseline:
--save-golden and --regression are intentionally mutually exclusive in one invocation to avoid trivial self-comparison.
- --save-golden <dir> stores:
  - <dir>/<scenario_name>/parity_report.json
  - <dir>/<scenario_name>/work_fpexe/PABEV.TXT
  - <dir>/<scenario_name>/work_fppy/PABEV.TXT
  - <dir>/<scenario_name>/gate.json
- --regression <dir> fails the CLI when new findings appear versus golden:
  - new missing_left variables
  - new missing_right variables
  - new hard-fail cells (variable + period + reason)
  - new diff variables
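The regression rule amounts to a set difference between the current report and the golden one. The sketch below assumes both reports carry missing_left, missing_right, and hard_fail_cells under pabev_detail, and that each hard-fail cell is a dict with variable, period, and reason keys; that cell layout is an assumption. The diff-variables check is omitted because its report field is not named here.

```python
def new_findings(golden: dict, current: dict) -> dict:
    """Return findings present in the current report but not in golden."""

    def names(report: dict, key: str) -> set:
        return set(report.get("pabev_detail", {}).get(key, []))

    def cells(report: dict) -> set:
        raw = report.get("pabev_detail", {}).get("hard_fail_cells", [])
        # Identity of a cell is (variable, period, reason).
        return {(c["variable"], c["period"], c["reason"]) for c in raw}

    return {
        "missing_left": sorted(names(current, "missing_left") - names(golden, "missing_left")),
        "missing_right": sorted(names(current, "missing_right") - names(golden, "missing_right")),
        "hard_fail_cells": sorted(cells(current) - cells(golden)),
    }


golden = {"pabev_detail": {"missing_left": ["A"], "hard_fail_cells": []}}
current = {
    "pabev_detail": {
        "missing_left": ["A", "B"],
        "hard_fail_cells": [{"variable": "X", "period": "2025.4", "reason": "signflip"}],
    }
}
found = new_findings(golden, current)
regressed = any(found.values())  # True means the gate should fail
print(found, regressed)
```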
Where artifacts are written¶
Each parity run writes a timestamped directory under --output-dir (default artifacts), for example:
- artifacts/<scenario>_<timestamp>/parity_report.json
- artifacts/<scenario>_<timestamp>/work_fpexe/PABEV.TXT
- artifacts/<scenario>_<timestamp>/work_fppy/PABEV.TXT
- artifacts/<scenario>_<timestamp>/work_fpexe/fp-exe.stdout.txt
- artifacts/<scenario>_<timestamp>/work_fpexe/fp-exe.stderr.txt
- artifacts/<scenario>_<timestamp>/work_fppy/fppy.stdout.txt
- artifacts/<scenario>_<timestamp>/work_fppy/fppy.stderr.txt
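A quick way to confirm that a run directory is complete is to check for the files listed above. The helper below is a hypothetical sketch, not an fp-wraptr command; it checks only the PABEV.TXT name, so a run that produced PACEV.TXT instead would be reported as missing that file.

```python
from pathlib import Path
from typing import List

# Relative paths expected inside artifacts/<scenario>_<timestamp>/
EXPECTED_ARTIFACTS = [
    "parity_report.json",
    "work_fpexe/PABEV.TXT",
    "work_fppy/PABEV.TXT",
    "work_fpexe/fp-exe.stdout.txt",
    "work_fpexe/fp-exe.stderr.txt",
    "work_fppy/fppy.stdout.txt",
    "work_fppy/fppy.stderr.txt",
]


def missing_artifacts(run_dir: str) -> List[str]:
    """Return the expected relative paths that are not files under run_dir."""
    root = Path(run_dir)
    return [rel for rel in EXPECTED_ARTIFACTS if not (root / rel).is_file()]


# A directory that does not exist is missing everything.
print(missing_artifacts("artifacts/no_such_run_directory"))
```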
Hard-fail triage and support-gap mapping¶
One-command loop (parity run + triage artifacts + optional regression gate):
uv run fp triage loop examples/baseline_smoke.yaml --with-drift --output-dir artifacts/triage_loop_demo
For fast smoke, add --quick:
uv run fp triage loop examples/baseline_smoke.yaml --with-drift --quick --output-dir artifacts/triage_loop_quick
With regression gate against a saved golden baseline:
uv run fp triage loop examples/baseline_smoke.yaml --with-drift --regression artifacts/parity-golden
Generate full hard-fail cell triage for an existing run:
Generate deterministic issue buckets from fppy’s report (if present):
Generate unsupported-statement impact mapping (ranked by hard-fail impact):
This emits:
- artifacts/<scenario>_<timestamp>/support_gap_map.csv
- artifacts/<scenario>_<timestamp>/support_gap_top.md
Detect solve-window structural fill issues (zero_fill and carry-forward flatline_fill):
This emits:
artifacts/<scenario>_<timestamp>/zero_forecast_offenders.csv
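The two fill patterns can be illustrated with a simple classifier. The definitions used here are assumptions for the sake of the sketch: zero_fill means every solve-window value is exactly zero, and flatline_fill means every solve-window value equals the last pre-window value. The actual detection rules in fp-wraptr may be stricter or use tolerances.

```python
from typing import Optional, Sequence


def classify_fill(last_history_value: float, solve_window: Sequence[float]) -> Optional[str]:
    """Classify a series' solve-window values as zero_fill, flatline_fill, or None."""
    if not solve_window:
        return None
    if all(v == 0.0 for v in solve_window):
        return "zero_fill"
    if all(v == last_history_value for v in solve_window):
        return "flatline_fill"
    return None


print(classify_fill(3.2, [0.0, 0.0, 0.0]))  # all zeros in the window
print(classify_fill(3.2, [3.2, 3.2, 3.2]))  # last history value carried forward
print(classify_fill(3.2, [3.3, 3.4, 3.5]))  # looks like a solved path
```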
Interpreting parity_report.json (difference report)¶
Divergence interpretation matrix¶
| status | exit_code | Meaning | Operator action |
|---|---|---|---|
| ok | 0 | Hard-fail checks passed and numeric gate passed. | Approve parity gate for this scenario/run fingerprint. |
| hard_fail | 3 | Semantic mismatch (missing/discrete/signflip). | Stop release; investigate hard-fail rows first. |
| gate_failed | 2 | Numeric diffs exceeded tolerance gate. | Review top diffs and period-level tails; tune/fix inputs or model changes. |
| drift_failed | 2 | Drift guardrail failed (--with-drift) even when basic gate passed. | Treat as gate failure; inspect drift_check.fail_reasons and trend metrics. |
| fingerprint_mismatch | 5 | Current input fingerprint does not match lockfile. | Re-baseline fingerprint after confirming intentional scenario/input changes. |
| engine_failure | 4 | Engine execution failed. | Fix runtime/environment issue; rerun parity. |
| missing_output | 4 | Expected PABEV.TXT artifact was not produced. | Fix execution/output path issue; rerun parity. |
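For automation, the matrix can be encoded as a lookup. The sketch below assumes status and exit_code are top-level keys of parity_report.json; the helper names are hypothetical. It also flags reports whose exit_code does not match the code expected for their status.

```python
# status -> exit_code, copied from the matrix above
EXIT_CODES = {
    "ok": 0,
    "hard_fail": 3,
    "gate_failed": 2,
    "drift_failed": 2,
    "fingerprint_mismatch": 5,
    "engine_failure": 4,
    "missing_output": 4,
}


def is_consistent(report: dict) -> bool:
    """True when the report's exit_code matches the matrix for its status."""
    status = report.get("status")
    return status in EXIT_CODES and report.get("exit_code") == EXIT_CODES[status]


def release_allowed(report: dict) -> bool:
    """Only a consistent 'ok' report passes the parity gate."""
    return is_consistent(report) and report["status"] == "ok"


print(release_allowed({"status": "ok", "exit_code": 0}))
print(release_allowed({"status": "drift_failed", "exit_code": 2}))
```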
Metric checklist (read in order)¶
- status and exit_code decide pass/fail for automation and release gating.
- Hard-fail severity:
  - pabev_detail.hard_fail_cell_count (or alias hard_fail_cells_count)
  - pabev_detail.hard_fail_cells sample for first triage
- Missing-variable parity health:
  - pabev_detail.missing_left and pabev_detail.missing_right (compare lengths and names)
- Numeric divergence magnitude:
  - pabev_detail.max_abs_diff
  - pabev_detail.median_abs_diff
  - pabev_detail.p90_abs_diff
- Drift guardrail health (--with-drift only):
  - drift_check.status
  - drift_check.fail_reasons
  - drift_check.max_abs_observed
  - drift_check.quantile_growth_factor
Hard-fail invariants (missing/discrete/signflip) are non-negotiable and are enforced regardless of tolerance or drift settings.
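The checklist order can be mirrored in a short summary helper. This is a sketch that assumes the field names listed above sit under pabev_detail and drift_check in an already-loaded parity_report.json dict; the function name summarize is hypothetical.

```python
from typing import List


def summarize(report: dict) -> List[str]:
    """Summarize a parity report in the checklist's reading order."""
    detail = report.get("pabev_detail", {})
    drift = report.get("drift_check")  # expected only with --with-drift
    # Accept either spelling of the hard-fail count field.
    count = detail.get("hard_fail_cell_count", detail.get("hard_fail_cells_count"))
    lines = [
        f"status={report.get('status')} exit_code={report.get('exit_code')}",
        f"hard_fail_cell_count={count}",
        f"missing_left={len(detail.get('missing_left', []))} "
        f"missing_right={len(detail.get('missing_right', []))}",
        f"max_abs_diff={detail.get('max_abs_diff')} "
        f"median_abs_diff={detail.get('median_abs_diff')} "
        f"p90_abs_diff={detail.get('p90_abs_diff')}",
    ]
    if drift is not None:
        lines.append(f"drift_status={drift.get('status')} fail_reasons={drift.get('fail_reasons')}")
    return lines


sample = {
    "status": "ok",
    "exit_code": 0,
    "pabev_detail": {"hard_fail_cells_count": 0, "missing_left": [], "missing_right": ["Z"]},
}
summary = summarize(sample)
print("\n".join(summary))
```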
Scenario-change policy (operator)¶
If a scenario input set changes (deck edits, exogenous changes, forecast var list changes):
- Treat prior parity approvals as non-authoritative.
- Re-baseline fingerprint for the changed scenario.
- Re-run parity (--with-drift for sign-off).
- Block release on hard-fail mismatches.
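To see why any input change invalidates a prior approval, consider a content-based fingerprint. The sketch below is not the algorithm used by scripts/iss02_acceptance_gate.py, which is not described here; it simply hashes file names and contents with SHA-256 in a stable order, so that editing any input yields a different fingerprint. The function name and sample inputs are hypothetical.

```python
import hashlib
from typing import Dict


def fingerprint_inputs(files: Dict[str, bytes]) -> str:
    """Hash (name, content digest) pairs in sorted name order."""
    outer = hashlib.sha256()
    for name in sorted(files):
        content_digest = hashlib.sha256(files[name]).hexdigest()
        outer.update(f"{name}:{content_digest}\n".encode("utf-8"))
    return outer.hexdigest()


# Made-up input contents for illustration.
baseline = {"fminput.txt": b"SMPL 1952.1 2025.3;\n", "fmexog.txt": b"X 1.0\n"}
edited = dict(baseline, **{"fmexog.txt": b"X 1.1\n"})

locked = fingerprint_inputs(baseline)
print(locked == fingerprint_inputs(edited))  # an exogenous edit changes the fingerprint
```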
Fingerprint lock commands¶
Write a new fingerprint lockfile:
scripts/onedrive_safe_env.sh python3 scripts/iss02_acceptance_gate.py \
--base-dir <SCENARIO_DIR> \
--write-fingerprint-lock docs/verification/iss02_baseline_fingerprint_<SCENARIO>.json
Run acceptance with lockfile + drift:
scripts/onedrive_safe_env.sh python3 scripts/iss02_acceptance_gate.py \
--base-dir <SCENARIO_DIR> \
--with-drift \
--fingerprint-lock docs/verification/iss02_baseline_fingerprint_<SCENARIO>.json \
--out-root /private/tmp/fairpy-iss02-acceptance
Interpret an existing run without re-running engines:
scripts/onedrive_safe_env.sh python3 scripts/iss02_acceptance_gate.py \
--summary-path /private/tmp/fairpy-iss02-acceptance/<TIMESTAMP>/summary.json
fp.exe assets installer/provision script¶
Script: scripts/ci/provision_fp_assets.sh
- Inputs:
  - FP_ASSETS_SOURCE_DIR or FP_ASSETS_ARCHIVE_URL
  - optional FP_ASSETS_BEARER_TOKEN
  - optional integrity pin FP_ASSETS_ARCHIVE_SHA256
  - optional FP_ASSETS_ARCHIVE_TYPE (zip or tar; default auto)
- Destination:
  - writes assets into ${GITHUB_WORKSPACE}/FM (or <cwd>/FM if GITHUB_WORKSPACE is unset)
- Validation:
  - verifies required files: fp.exe, fminput.txt, fmdata.txt, fmage.txt, fmexog.txt
  - fails nonzero on missing files or SHA mismatch
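The validation step can be mirrored locally before a parity run. The sketch below is a Python illustration of the two checks, not a port of the shell script: it reports which required files are absent from an FM directory and compares an archive's SHA-256 digest against a pinned value such as FP_ASSETS_ARCHIVE_SHA256. The helper names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import List

REQUIRED_FILES = ("fp.exe", "fminput.txt", "fmdata.txt", "fmage.txt", "fmexog.txt")


def missing_required(fm_dir: str) -> List[str]:
    """Return required asset names that are not present as files in fm_dir."""
    root = Path(fm_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def sha256_matches(archive_bytes: bytes, expected_hex: str) -> bool:
    """Compare the archive digest to the pinned hex digest, ignoring case and whitespace."""
    return hashlib.sha256(archive_bytes).hexdigest() == expected_hex.strip().lower()


payload = b"example archive bytes"
pinned = hashlib.sha256(payload).hexdigest()
print(sha256_matches(payload, pinned), missing_required("no_such_FM_directory"))
```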