Scope freeze — drought YEARLOCATIONDEX (county × week USDM)
Frozen 2026-06-26. This is the first YearLocationdex, following Mike's 2-layer concept
(area × time): each slot is a (place, time) cell addressable by either axis. Built after the
docs/dex-triage-surfaced.md "★ NEXT BUILD" approval.
What it is
The US Drought Monitor (USDM) is not a single-axis phenomenon; it is a place × time grid of
US counties × weekly map-dates. None of the four prior families fits cleanly:
Locationdex forces place-only, Yeardex forces year-only. So drought is the first
YearLocationdex: stored as ONE flat place-sorted spine-parquet, rows = county × week,
sorted by (fips, week_date) so a county's whole history reads contiguously, with a
by-year index alongside so the year cross-section is direct too. One source of truth,
both axes fast, no duplicated machinery. (Mike, 2026-06-26: "which one combines the
most simplicity with the fastest access to data" → single place-sorted spine + by-year
index, not two physically-nested families.)
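A minimal sketch of why one place-sorted spine can serve both axes, using plain Python lists in place of parquet. The toy rows, the helper names (county_history, year_cross_section) and the row-position lookup are all hypothetical illustrations, not the build's actual code; the real by-year index stores per-year metadata rather than row positions.

```python
from bisect import bisect_left, bisect_right

# Hypothetical toy rows: (fips, week_date, in_drought_d1). Values are made up.
rows = [
    ("01001", "2000-01-04", 0),
    ("01003", "2000-01-04", 1),
    ("01001", "2001-01-02", 1),
    ("01003", "2001-01-02", 0),
    ("01001", "2000-01-11", 1),
]

# Place-sorted spine: one flat list ordered by (fips, week_date).
spine = sorted(rows, key=lambda r: (r[0], r[1]))
fips_keys = [r[0] for r in spine]


def county_history(fips):
    """Return a county's whole history as one contiguous slice of the spine."""
    lo = bisect_left(fips_keys, fips)
    hi = bisect_right(fips_keys, fips)
    return spine[lo:hi]


# Illustrative by-year lookup: row positions per year, so the year
# cross-section is reachable without keeping a second sorted copy.
by_year = {}
for pos, (fips, week_date, flag) in enumerate(spine):
    by_year.setdefault(int(week_date[:4]), []).append(pos)


def year_cross_section(year):
    """Return every county's rows for one year, in spine order."""
    return [spine[pos] for pos in by_year.get(year, [])]


print(county_history("01001"))
print(year_cross_section(2000))
```

The place axis is a slice and the time axis is a lookup, which is the trade the quote above describes: one physical ordering plus a small side index.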
Source + bright line
- IN: USDM (US Drought Monitor), historical archive at NCEI/NIDIS CDC:
https://www.ncei.noaa.gov/pub/data/nidis/CDC/historical/. USDM is an assessment of realized conditions, admissible on the FEMA administrative-record precedent (feedback_measured_reality_only). This is the same basis that admitted fema and nuforc.
- OUT: SPEI-1M-* files in the same directory, a derived index with a modeled potential evapotranspiration (PET) term. Held out by the bright line. The grid reads USDM only.
The two values per cell (data is data)
The historical archive carries USDM at a single threshold (MOD = D1+,
moderate-or-worse drought), NOT the full D0–D4 split per week. Confirmed empirically:
each weekly cell is binary, and the annual COUNT_D1D4 summary equals the number of
weeks flagged. Two products exist:
- USDM-MOD-YYYY.csv: county × week binary D1+ flag (0/1). Coverage 2000–2024.
- USDM-MOD-YYYY-PRCNT.csv: county × week percent of county area in D1+ (0–100). Coverage 2000–2021 (no PRCNT files for 2022–2024).
Per feedback_data_is_data_partial_coverage (Mike 2026-06-26, "data is data"), the
spine carries both: every cell holds the binary flag (full span) plus area_pct_d1
where it exists; area_pct_d1 is null for 2022–2024 until a backfill source appears.
We do not drop the richer percent value because it stops three years early.
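A hedged sketch of the reshape-and-join this implies, using only the standard library. The wide CSV layout (a fips column plus one YYYYMMDD column per week), the sample values and the melt helper are assumptions made for illustration; the real NCEI column names may differ.

```python
import csv
import io
import re

# Hypothetical wide layouts with made-up values, standing in for the two products.
FLAG_CSV = """fips,20000104,20000111
1001,0,1
1003,1,1
"""
PCT_CSV = """fips,20000104,20000111
1001,0.0,42.5
"""

WEEK_COL = re.compile(r"^\d{8}$")  # assumed: week columns are named YYYYMMDD


def melt(text, cast):
    """Reshape a wide county-by-week CSV into {(fips, week_date): value}."""
    out = {}
    for row in csv.DictReader(io.StringIO(text)):
        fips = row["fips"].zfill(5)  # restore leading zeros lost in numeric CSVs
        for col, raw in row.items():
            if WEEK_COL.match(col):
                week = f"{col[:4]}-{col[4:6]}-{col[6:]}"
                out[(fips, week)] = cast(raw)
    return out


flags = melt(FLAG_CSV, int)
pcts = melt(PCT_CSV, float)

# Left join on the binary flag: every flagged cell survives, and
# area_pct_d1 is None wherever no percent value exists.
cells = [
    {"fips": f, "week_date": w, "in_drought_d1": flag, "area_pct_d1": pcts.get((f, w))}
    for (f, w), flag in sorted(flags.items())
]

for cell in cells:
    print(cell)
```

Joining from the flag side is what keeps the full span: a cell with no percent value stays in the spine with a null rather than being dropped.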
Storage
- data/yearlocation_storehouse/drought/drought_spine.parquet: the flat grid, sorted by (fips, week_date). Columns: fips (5-digit, zero-padded), county_name, state, year, week_date, in_drought_d1 (0/1), area_pct_d1 (float | null).
- data/yearlocation_storehouse/drought/drought_byyear_index.json: per-year metadata (n_cells, n_counties, n_weeks, county-weeks-in-drought, n_area_pct, week span) so the time axis is direct without a second sorted copy.
- Spine-parquet, NOT file-per-slot: ~4M cells would bloat the shared storehouse_index. Its own yearlocation_storehouse base dir keeps it out of the event index entirely.
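The per-year metadata could be derived from the spine rows roughly as follows. This is a sketch under assumptions: the sample rows are made up, build_byyear_index is a hypothetical helper, and the JSON key names county_weeks_in_drought, first_week and last_week are guesses at how "county-weeks-in-drought" and "week span" might be spelled in the real file.

```python
import json
from collections import defaultdict

# Hypothetical spine rows, already sorted by (fips, week_date); values are made up.
spine = [
    {"fips": "01001", "year": 2021, "week_date": "2021-12-28", "in_drought_d1": 1, "area_pct_d1": 37.0},
    {"fips": "01001", "year": 2022, "week_date": "2022-01-04", "in_drought_d1": 1, "area_pct_d1": None},
    {"fips": "01003", "year": 2021, "week_date": "2021-12-28", "in_drought_d1": 0, "area_pct_d1": 0.0},
    {"fips": "01003", "year": 2022, "week_date": "2022-01-04", "in_drought_d1": 0, "area_pct_d1": None},
]


def build_byyear_index(rows):
    """Summarise the spine per year without re-sorting or copying it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["year"]].append(row)

    index = {}
    for year, cells in sorted(groups.items()):
        weeks = sorted({c["week_date"] for c in cells})
        index[str(year)] = {
            "n_cells": len(cells),
            "n_counties": len({c["fips"] for c in cells}),
            "n_weeks": len(weeks),
            # Assumed key names for "county-weeks-in-drought" and "week span".
            "county_weeks_in_drought": sum(c["in_drought_d1"] for c in cells),
            "n_area_pct": sum(c["area_pct_d1"] is not None for c in cells),
            "first_week": weeks[0],
            "last_week": weeks[-1],
        }
    return index


index = build_byyear_index(spine)
print(json.dumps(index, indent=2))
```

In this toy data n_area_pct drops to 0 for 2022, which is how the index would surface the years where area_pct_d1 is null.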
Deferred (separate decisions)
- Live edge. No current-year county file exists yet (NCEI posts the annual file after the fact), and the live usdm_drought PG feed is state-level (coarser, area-percent D0–D4), which will not merge cleanly into a county grid. The live edge is its own decision, likely a county-AOI pull from the USDM data services API. Built historical first; live edge settled after. Existing usdm_drought paper-feed wiring left untouched.
- County centroids. v1 identifies the place by FIPS + name + state (a county is a polygon, not a point). A FIPS→centroid lookup for mapping is a later enrichment.
- Per-week D0–D4 category. Not in the historical archive (only the D1+ threshold + annual category summaries). The richer per-week split is available from the USDM API and could be a future column; backfillable, not a v1 blocker.
Tests
tests/test_monitor/test_drought_yearlocationdex.py — week-column detection, FIPS zero-padding,
melt/long reshape, binary+percent join, by-year index totals.