Scope freeze — World Bank Yeardex (forest_cover + energy_use)
Frozen 2026-06-19. Two new Yeardex kinds off the World Bank Indicators API
(https://api.worldbank.org/v2/). Mike's categorization ruling (2026-06-19): the
three indicators the platform already pulls are three different phenomena, so
forest cover and energy use each become their own Yeardex kind; population is
kept as a shared per-country denominator/context field, not a phenomenon kind.
Why a deep pull (not the live slice)
The live world_bank source (WorldBookFetcher) holds only 15 countries (the
G20 majors) — a curated slice for the ticker/PWA, not the World Bank universe.
Per feedback_no_spiking_global (fill a gap with the real measurement source,
never settle for the thin staged tail), Brick B does a fresh API backfill of
all ~266 country + aggregate entities for each indicator. Same precedent as
the DONKI deep backfill (5046c3f): the staged tail was thin, so we pulled the
real catalog. The live world_bank platform source is left untouched.
Measured reality (IN)
All three indicators are reported statistics-of-record, same administrative-record
bucket as fema and the landuse/dairy Yeardexes:
- Forest area (% of land area), AG.LND.FRST.ZS — FAO Forest Resources Assessment, national reported. Measurement of land that exists.
- Energy use per capita (kg of oil equivalent), EG.USE.PCAP.KG.OE — IEA / national energy balances, reported. Measurement of energy actually consumed.
- Population, total, SP.POP.TOTL — census-anchored counts-of-record (context).
OUT: nothing in these series is a projection or scenario. (World Bank does publish climate-projection products elsewhere; none are pulled here.)
Slot model (Yeardex: slot = YEAR)
Per docs/yeardex-framework.md and the landuse precedent: one slot per calendar
year; the slot accretes that year's full country × value matrix. The geography
axis is country (ISO3) instead of US state/region. Aggregates (World, regions,
income groups; World Bank region.id == "NA") are kept but flagged
is_aggregate=true so paper code can include or exclude them cleanly. A year is
not a place, so observation lat/lon/location stay NULL; country lives in extra_json.
- forest_cover kind — value = forest % of land area; years 1990–2023.
- energy_use kind — value = energy use per capita (kg oil eq); years 1990–2023.
- Population is joined onto every slot of both kinds as the per-country denominator (population_by_country), never its own kind. Years 1960–2023.
Distinct from the planned energy_stats kind (US SEDS/RECS sectoral consumption):
that is US-state energy consumption totals; this is global per-capita energy use.
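The slot model can be sketched as follows: observations are grouped by calendar year, aggregates are flagged from the country metadata (region.id == "NA"), location fields stay None, and population is attached per slot as population_by_country. The record shapes and the function names `aggregate_codes` and `build_year_slots` are assumptions for illustration; the real reader in worldbank_yeardex.py may differ.

```python
def aggregate_codes(country_meta):
    """Return the codes of entities the World Bank marks as aggregates.

    `country_meta` is the row list from the /country endpoint; aggregates
    carry region.id == "NA".
    """
    return {c["id"] for c in country_meta if c["region"]["id"] == "NA"}


def build_year_slots(observations, population, aggregates):
    """Group observations into one slot per calendar year.

    observations: iterable of {"iso3", "year", "value"} dicts (assumed shape)
    population:   {(iso3, year): persons}
    aggregates:   set of entity codes to flag is_aggregate=True
    """
    slots = {}
    for obs in observations:
        iso3, year = obs["iso3"], obs["year"]
        slot = slots.setdefault(
            year,
            {"year": year, "observations": [], "population_by_country": {}},
        )
        slot["observations"].append(
            {
                "value": obs["value"],
                # A year is not a place: location fields stay NULL.
                "lat": None,
                "lon": None,
                "location": None,
                "extra_json": {
                    "country": iso3,
                    "is_aggregate": iso3 in aggregates,
                },
            }
        )
        persons = population.get((iso3, year))
        if persons is not None:
            # Population is context on the slot, never its own kind.
            slot["population_by_country"][iso3] = persons
    return [slots[year] for year in sorted(slots)]
```

Keeping aggregates in the slot with a flag, rather than dropping them at load time, lets paper code filter on `extra_json["is_aggregate"]` without a second pull.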
Spine sources (Brick B)
Three clean spine datasources via ensure_source (active=False, no live edge —
World Bank revises annually; refresh = re-run the reload + backfill):
- worldbank_forest — metric forest_pct, unit pct_land
- worldbank_energy — metric energy_use_percap, unit kg_oil_eq
- worldbank_population — metric population, unit persons
One observation per (indicator, country, year); raw rows archived to DuckDB before the PG DELETE+INSERT; parquet roster written. The Yeardex reader groups them into year slots.
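A hedged sketch of the one-observation-per-key rule, applied while normalizing raw API rows. It assumes the usual row fields (`indicator.id`, `countryiso3code`, `country.id`, `date`, `value`); skipping null values and falling back to `country.id` when the ISO3 field is empty are assumptions about handling, not confirmed behavior of the reload script. The DuckDB archive, PG write, and parquet roster are out of scope here.

```python
def to_observations(rows, metric, unit):
    """Normalize raw API rows into unique (indicator, country, year) records."""
    seen = {}
    for row in rows:
        if row.get("value") is None:
            # Assumption: unreported years (null value) produce no observation.
            continue
        # Assumption: fall back to country.id when the ISO3 field is empty.
        country = row.get("countryiso3code") or row["country"]["id"]
        key = (row["indicator"]["id"], country, int(row["date"]))
        if key in seen:
            raise ValueError(f"duplicate observation for {key}")
        seen[key] = {
            "indicator": key[0],
            "iso3": country,
            "year": key[2],
            "metric": metric,
            "unit": unit,
            "value": float(row["value"]),
        }
    return list(seen.values())
```

For example, `to_observations(rows, "forest_pct", "pct_land")` would prepare rows for the worldbank_forest source; a duplicate key raises instead of silently overwriting, which keeps the DELETE+INSERT reload deterministic.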
Bricks
- A — this freeze.
- B — scripts/reload_worldbank_yeardex.py: deep API pull, 3 spine sources.
- C/D — src/terrapulse/monitor/worldbank_yeardex.py: two KindConfigs (catalog-only, no sweep), get_years → year slots, build_dossier, backfill_and_store. Year storehouse (data/year_storehouse/), shared with landuse/dairy.
Sanity anchors (Brick D)
- ~266 entities × 34 yr ≈ 9k forest, ≈ 9k energy; population ~266 × 64 ≈ 17k.
- Forest: Brazil ~59%, Canada ~39%, Australia ~17% (2020s).
- Energy: high for Canada/US/Gulf states, low for Sub-Saharan Africa.
- Latest year present = 2023 for all three series (forest, energy, population).
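The row-count anchors are full-matrix figures (every entity reporting every year), so they work as upper bounds. A small sketch of that arithmetic and the latest-year check, with hypothetical helper names and the same assumed observation shape ({"year": ...}) as elsewhere in this note:

```python
ENTITIES = 266  # approximate entity count from this freeze


def full_matrix_rows(first_year, last_year, entities=ENTITIES):
    """Rows in a gap-free entity x year matrix (years inclusive)."""
    return entities * (last_year - first_year + 1)


def latest_year(observations):
    """Most recent year present in a list of observations."""
    return max(obs["year"] for obs in observations)


print(full_matrix_rows(1990, 2023))  # 9044  (forest, energy: 34 years)
print(full_matrix_rows(1960, 2023))  # 17024 (population: 64 years)
```

A loaded count well below these figures is expected where entities have gaps; a count above them would point to duplicate keys.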