Sample Construction
The published sample flow keeps the review path explicit: raw FDIC financial rows, parsed required fields, deduplicated bank-quarter keys, and the final panel used for exports.
| Stage | Rows | Banks | Quarter Span |
|---|---|---|---|
Duplicate bank-quarter keys are audited in duplicate_keys.csv; column-level coverage is reported separately in missingness_report.csv.
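The deduplication step can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the column names `cert` and `quarter`, the keep-first rule, and the helper names are all assumptions made for the example.

```python
import csv
import io

# Hypothetical key columns; the real FDIC export may name them differently.
KEY_FIELDS = ("cert", "quarter")


def split_duplicates(rows):
    """Return (deduplicated_rows, duplicate_rows) keyed on bank-quarter.

    Assumed rule: keep the first row seen for each key; every later row
    with the same key goes to the audit list.
    """
    seen = set()
    kept, dupes = [], []
    for row in rows:
        key = tuple(row[f] for f in KEY_FIELDS)
        if key in seen:
            dupes.append(row)
        else:
            seen.add(key)
            kept.append(row)
    return kept, dupes


def to_csv_text(rows, fieldnames):
    """Serialize rows to CSV text, e.g. for an audit file."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


sample = [
    {"cert": "101", "quarter": "2020Q1", "assets": "500"},
    {"cert": "101", "quarter": "2020Q1", "assets": "505"},  # duplicate key
    {"cert": "102", "quarter": "2020Q1", "assets": "900"},
]
panel, duplicates = split_duplicates(sample)
audit_text = to_csv_text(duplicates, ["cert", "quarter", "assets"])
```

Writing the dropped rows out rather than discarding them silently is what makes the duplicate audit reviewable.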
Distributions
Bank Asset Distribution
Log₁₀ total assets ($000s). Median bank has $214M in assets; the distribution is heavily right-skewed.
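As a quick unit check, the median can be placed on the log axis by hand. The sketch below assumes the axis is the base-10 log of assets measured in thousands of dollars, as the caption states; the variable names are illustrative.

```python
import math

median_assets_usd = 214_000_000                 # $214M, from the caption
median_assets_thousands = median_assets_usd / 1_000   # reporting unit: $000s
log10_median = math.log10(median_assets_thousands)    # about 5.33
```

So the median bank sits a little above 5.3 on the horizontal axis.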
NIM Distribution
Net interest margin (%). Winsorized at 1st/99th percentiles. Mean = 3.66%, Median = 3.62%.
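Winsorizing at the 1st/99th percentiles clips extreme values to the percentile cutoffs instead of dropping them. A minimal sketch, assuming linear-interpolation percentiles computed over the pooled sample (the report does not say which percentile definition or grouping is used):

```python
import statistics


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the [lower_pct, upper_pct] percentile range."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, lo), hi) for v in values]


# Toy NIM values (percent) with two artificial outliers; not panel data.
nim = [3.0 + 0.01 * i for i in range(99)] + [25.0, -10.0]
clipped = winsorize(nim)
```

The sample size is unchanged; only the tails are pulled in, which is why the mean and median can stay close to each other.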
Asset Growth vs. Deposit Growth
2D histogram of quarterly growth rates. Strong positive correlation along the diagonal; off-diagonal mass reflects funded vs. unfunded growth.
Darker cells indicate higher observation density. Growth rates winsorized at 1st/99th percentiles.
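The density grid behind a 2D histogram like this is a count of observations per cell. A minimal sketch with square cells of a hypothetical width; the growth values are toy numbers, not panel data:

```python
import math
from collections import Counter


def hist2d(xs, ys, bin_width):
    """Count (x, y) pairs per square cell of side bin_width.

    Cell indices are floor(value / bin_width), so cell (0, 0) covers
    [0, bin_width) on both axes.
    """
    counts = Counter()
    for x, y in zip(xs, ys):
        counts[(math.floor(x / bin_width), math.floor(y / bin_width))] += 1
    return counts


# Toy quarterly growth rates in percent (illustrative only).
asset_growth = [1.2, 1.4, 3.1, 0.2, 4.5]
deposit_growth = [1.1, 1.9, 3.3, 2.6, 0.4]
cells = hist2d(asset_growth, deposit_growth, bin_width=1.0)

# Mass on the diagonal: asset and deposit growth fall in the same bin.
on_diagonal = sum(n for (i, j), n in cells.items() if i == j)
```

Cells with `i == j` correspond to the diagonal described in the caption; everything else is off-diagonal mass.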
Size Decile Summary
| Decile | Avg NIM | Avg ROA | Avg Ln(Assets) | Banks | Obs |
|---|---|---|---|---|---|
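A decile summary of this shape can be built by ranking observations on assets and averaging within each tenth. The sketch below assumes deciles are formed over pooled observations and uses hypothetical field names `assets` and `nim`; the published table may form deciles differently (for example within each quarter).

```python
import math
from collections import defaultdict


def assign_deciles(values):
    """Map each value to a decile 1..10 by rank (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    deciles = [0] * len(values)
    for rank, idx in enumerate(order):
        deciles[idx] = rank * 10 // len(values) + 1
    return deciles


def decile_summary(obs):
    """obs: list of dicts with hypothetical keys 'assets' ($000s) and 'nim'."""
    deciles = assign_deciles([o["assets"] for o in obs])
    groups = defaultdict(list)
    for o, d in zip(obs, deciles):
        groups[d].append(o)
    return {
        d: {
            "avg_nim": sum(o["nim"] for o in rows) / len(rows),
            "avg_ln_assets": sum(math.log(o["assets"]) for o in rows) / len(rows),
            "obs": len(rows),
        }
        for d, rows in sorted(groups.items())
    }


# Toy observations (illustrative only).
toy = [{"assets": 1000 * (i + 1), "nim": 3.0 + 0.1 * (i % 5)} for i in range(20)]
summary = decile_summary(toy)
```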
Geography
State view: HQ state from the bank-quarter panel. CBSA view: branch-deposit-weighted exposure built from annual SOD and carried forward by SOD year. These are descriptive cuts of the same panel, not separate causal designs.
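The CBSA exposure can be sketched as each bank's share of branch deposits per CBSA in a given SOD year, reused for later quarters until a newer SOD year is available. The tuple layout, function names, and the exact carry-forward rule (latest SOD year not after the quarter's calendar year) are assumptions for illustration.

```python
from collections import defaultdict


def cbsa_shares(branches):
    """Deposit-weighted CBSA exposure per (bank, SOD year).

    branches: iterable of (cert, sod_year, cbsa, deposits) tuples.
    Returns {(cert, sod_year): {cbsa: share_of_bank_deposits}}.
    """
    totals = defaultdict(float)
    by_cbsa = defaultdict(lambda: defaultdict(float))
    for cert, year, cbsa, deposits in branches:
        totals[(cert, year)] += deposits
        by_cbsa[(cert, year)][cbsa] += deposits
    return {
        key: {cbsa: dep / totals[key] for cbsa, dep in cbsas.items()}
        for key, cbsas in by_cbsa.items()
        if totals[key] > 0
    }


def exposure_for_quarter(shares, cert, quarter_year):
    """Carry forward: use the latest SOD year <= quarter_year (assumed rule)."""
    years = [y for (c, y) in shares if c == cert and y <= quarter_year]
    return shares[(cert, max(years))] if years else None


# Toy branch records (illustrative only).
branches = [
    (101, 2019, "Austin", 300.0),
    (101, 2019, "Dallas", 100.0),
    (101, 2021, "Austin", 200.0),
    (101, 2021, "Dallas", 200.0),
]
shares = cbsa_shares(branches)
exposure_2020 = exposure_for_quarter(shares, 101, 2020)  # falls back to 2019
```

A bank with branches in several CBSAs contributes to each in proportion to its deposits there, which is what distinguishes this view from the HQ-state view.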
State tiles are colored by average NIM.
CBSA Ranking
Reviewer-facing CBSA view from the branch-deposit-weighted exposure export.
Default screen suppresses thin-coverage and small-sample outliers before ranking.
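A screen of this kind can be sketched as a filter applied before sorting. The thresholds, the field names, and the choice to rank by average NIM in descending order are all hypothetical; the section does not state the actual cutoffs or sort key.

```python
# Hypothetical thresholds; the published screen's cutoffs are not stated here.
MIN_BANKS = 5
MIN_MATCHED_DEP_SHARE = 0.5


def rank_cbsas(rows, min_banks=MIN_BANKS, min_share=MIN_MATCHED_DEP_SHARE):
    """Drop thin rows, then rank the remainder by average NIM, descending."""
    kept = [
        r for r in rows
        if r["banks"] >= min_banks and r["matched_dep_share"] >= min_share
    ]
    kept.sort(key=lambda r: r["avg_nim"], reverse=True)
    return [dict(r, rank=i) for i, r in enumerate(kept, start=1)]


# Toy rows (illustrative only).
cbsa_rows = [
    {"cbsa": "A", "avg_nim": 4.1, "banks": 2, "matched_dep_share": 0.9},   # too few banks
    {"cbsa": "B", "avg_nim": 3.5, "banks": 12, "matched_dep_share": 0.8},
    {"cbsa": "C", "avg_nim": 3.9, "banks": 9, "matched_dep_share": 0.3},   # thin coverage
    {"cbsa": "D", "avg_nim": 3.7, "banks": 7, "matched_dep_share": 0.6},
]
ranking = rank_cbsas(cbsa_rows)
```

Filtering first matters: without it, the two highest-NIM rows in the toy data would lead the ranking on the strength of very few banks or weakly matched deposits.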
| Rank | CBSA | Type | Avg NIM | Avg ROA | Avg Assets ($M) | Banks | Matched Dep Share |
|---|---|---|---|---|---|---|---|
Data Sources
Federal Data Pipeline
| Source | Agency | Frequency | Coverage |
|---|---|---|---|
| BankFind Financials | FDIC | Quarterly | 2010–2025 |
| Summary of Deposits (SOD) | FDIC | Annual | 2010–2025 |
| Institution History | FDIC | Event-based | Merger/acquisition events |
| FRED Rates | Federal Reserve | Daily/Quarterly | Fed funds, prime, treasury yields |
373,220 bank-quarter observations. 8,032 unique FDIC-insured commercial banks. All data are free and publicly available.
Reproducibility
Full Pipeline
The pipeline is scripted end to end, but full reruns still depend on external-data availability and the documented setup prerequisites. It covers raw FDIC data ingestion, panel construction, variable engineering, model estimation, and robustness checks.