
TerraPulse Data Dimensions

Living document. Last updated: 2026-03-18.

814K+ observations. 23 metrics. 11 categories. 44 countries. 76 years of temporal depth.


The Observation Cube

Every observation in TerraPulse lives in a multi-dimensional space:

Time × Location × Flavor = the observation cube

Each record carries these dimensions at varying levels of completeness, creating a rich but heterogeneous dataset that requires careful normalization for cross-source analysis.
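As a rough illustration of the cube, a single record could be modeled as below. Only timestamp_utc and extra_json are field names taken from this document; the class name Observation, the other attribute names, and the sample values are assumptions made for the sketch, not the actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """Hypothetical in-memory view of one TerraPulse row."""
    timestamp_utc: datetime            # Time: always present
    metric: str                        # Flavor, e.g. "earthquake_magnitude"
    value: float
    source: str                        # source slug, e.g. "usgs_earthquake"
    latitude: Optional[float] = None   # Location: missing for non-point data
    longitude: Optional[float] = None
    extra_json: str = "{}"             # source-specific metadata as JSON text

    def has_point(self) -> bool:
        """True when the row carries both coordinates."""
        return self.latitude is not None and self.longitude is not None


# Illustrative rows with made-up values.
quake = Observation(datetime(2026, 3, 1, tzinfo=timezone.utc),
                    "earthquake_magnitude", 5.1, "usgs_earthquake",
                    latitude=36.1, longitude=-120.4)
oni = Observation(datetime(1950, 1, 1, tzinfo=timezone.utc),
                  "climate_oni", -1.5, "noaa_climate_indices")
```

Keeping the coordinates optional mirrors the "varying levels of completeness" described above: a climate index row is valid without a point location.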


Dimension 1: Time (100% coverage)

All 814K+ rows have timestamp_utc. This is the universal join key — everything can be aligned to a shared timeline.

Property | Value
Coverage | 100% of all observations
Span | 76 years (1949–2026)
Earliest | 1949-12-31 (ONI climate index)
Latest | Current (real-time ingestion)
Distinct days | 2,612

Temporal Granularity by Source

Granularity | Sources
10-minute | GeoSphere TAWES stations
15-minute | USGS Water (streamflow)
Hourly | NOAA SWPC (Kp index, solar flux), INCA
~Real-time | USGS Earthquake, Safecast, Open-Meteo
Daily | NASA POWER reanalysis, USDM drought snapshots
Weekly | USDM drought (published Thursdays)
Monthly | Climate indices (MEI, ONI)
Annual | World Bank indicators
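Because sources arrive at different granularities, comparing them usually means resampling to a common bucket first. The sketch below averages sub-daily readings into one value per UTC day. The function name daily_means and the sample readings are assumptions for illustration, and a mean is only one possible aggregate (a max or sum may suit some metrics better).

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def daily_means(rows):
    """Average (timestamp_utc, value) pairs into one value per UTC day."""
    buckets = defaultdict(list)
    for ts, value in rows:
        # Normalize to UTC before taking the calendar date.
        buckets[ts.astimezone(timezone.utc).date()].append(value)
    return {day: mean(values) for day, values in sorted(buckets.items())}


# Illustrative 15-minute streamflow readings (made-up values).
flow = [
    (datetime(2026, 3, 1, 0, 0, tzinfo=timezone.utc), 100.0),
    (datetime(2026, 3, 1, 0, 15, tzinfo=timezone.utc), 110.0),
    (datetime(2026, 3, 2, 0, 0, tzinfo=timezone.utc), 90.0),
]
flow_daily = daily_means(flow)
```

Once every series is keyed by day, a daily source such as NASA POWER can be joined to it with a plain dictionary lookup.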

Temporal Depth by Metric

Metric | Earliest | Latest | Depth
climate_oni | 1949 | 2025 | 76 years
wb_SP.POP.TOTL | 1960 | 2022 | 63 years
climate_mei_v2 | 1979 | 2025 | 47 years
wb_AG.LND.FRST.ZS | 1990 | 2022 | 33 years
wb_EG.USE.PCAP.KG.OE | 1990 | 2022 | 33 years
drought_area_pct | 2010 | 2026 | 16 years
earthquake_magnitude | 2021 | 2026 | 5 years
nasa_power_t2m | 2023 | 2026 | 3 years
All others | 2026 | 2026 | Days–weeks (real-time only)

Dimension 2: Location (93% spatial coverage)

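756K rows (93%) have latitude/longitude. The remaining 7% are mostly non-point data (drought, climate indices, World Bank, space weather), plus tide-station rows whose coordinates have not been populated yet.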

Property | Value
Rows with lat/lon | 756,128 (93%)
Rows geocoded | 357,006 (44%)
Distinct countries | 44
Distinct states | 115
Coordinate system | WGS84 (EPSG:4326)
Storage | PostGIS Geography(Point, 4326)

Spatial Coverage by Category

Category | Rows | Has Lat/Lon | Has Geocode | Coverage
Hydrology | 418K | 100% | 48% | US (California focus)
Radiation | 223K | 100% | 64% | Global (citizen science, Japan-heavy)
Seismic | 97K | 100% | 10% | Global
Temperature | 13K | 100% | 12% | 10 global cities + Austria
Air Quality | 2.2K | 100% | 72% | 10 global + 8 European cities
Water/Ocean | 5K | 32% | 4% | US coasts + 10 ocean regions + 10 rivers
Satellite | 1K | 100% | 63% | 5 climate regions (Arctic, Greenland, Amazon, Gulf, SE Asia)
Space Weather | 37K | 2% | 1% | Non-spatial (Sun-Earth system)
Drought | 13K | 0% | 0% | US states (area-based, not point)
Climate Indices | 3K | 0% | 0% | Global indices (not point-localizable)
Socioeconomic | 3K | 0% | 0% | Country-level (15 countries)

Non-Point Data

Some metrics are inherently non-spatial or use different spatial models:

  • Drought — area percentages per US state (D0–D4 levels). Location is state in extra_json.
  • Climate Indices — global ocean-atmosphere oscillations (ENSO). No meaningful point location.
  • World Bank — country-level aggregates. country_code in extra_json.
  • Space Weather — Sun-Earth system events. Kp index, solar flux, CME trajectories exist in heliographic coordinates, not Earth surface.
  • Tides — station-based but lat/lon not yet populated (station_id in extra_json).
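A consumer that needs some notion of "where" for every row could fall back from coordinates to the extra_json keys listed above. This is a minimal sketch: the function name location_label and the returned (kind, value) shape are assumptions, while the keys state, country_code, and station_id come from the bullets above.

```python
import json


def location_label(latitude, longitude, extra_json):
    """Return a (kind, value) pair describing where a row belongs."""
    if latitude is not None and longitude is not None:
        return ("point", (latitude, longitude))
    extra = json.loads(extra_json or "{}")
    # Keys named in this document, checked from most to least specific.
    for key in ("station_id", "state", "country_code"):
        if extra.get(key):
            return (key, extra[key])
    # Nothing usable: e.g. climate indices or space weather.
    return ("global", None)
```

For example, a drought row with extra_json '{"state": "CA"}' and no coordinates resolves to ("state", "CA").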

Geocoding Gap

Only 44% of rows have reverse-geocoded country/state/city fields. This is because:

  1. Geocoding only runs on new observations (since the feature was added)
  2. Backfilled historical data (earthquakes, drought, etc.) was not geocoded
  3. Rate limits on Nominatim prevent bulk retroactive geocoding

Fix: A batch geocoding job could fill in the ~400K spatial rows missing geo hierarchy.
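One way such a batch job could stay within rate limits is to geocode each rounded coordinate cell only once and to space out the remaining calls. The sketch below is not the project's implementation: backfill_geocodes and its parameters are hypothetical, the reverse_geocode callable is supplied by the caller (no real geocoding API is assumed), and the one-second default is an assumption modeled on the public Nominatim usage policy.

```python
import time


def backfill_geocodes(points, reverse_geocode, min_interval_s=1.0, precision=2):
    """Reverse-geocode points, calling the service once per rounded cell.

    points: iterable of (row_id, lat, lon)
    reverse_geocode: caller-supplied function (lat, lon) -> dict
    precision: decimal places used to group nearby points (2 is roughly 1 km)
    """
    cache = {}
    results = {}
    last_call = None
    for row_id, lat, lon in points:
        cell = (round(lat, precision), round(lon, precision))
        if cell not in cache:
            # Space out requests to respect the service's rate limit.
            if last_call is not None:
                wait = min_interval_s - (time.monotonic() - last_call)
                if wait > 0:
                    time.sleep(wait)
            cache[cell] = reverse_geocode(*cell)
            last_call = time.monotonic()
        results[row_id] = cache[cell]
    return results
```

Since many rows come from fixed stations and gauges, grouping by cell should reduce the number of calls well below the row count, though the actual saving depends on how the rows are distributed.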


Dimension 3: Flavor (23 metrics, 11 categories)

"Flavor" is what is being measured — the semantic meaning of the observation value.

Category Breakdown

Category | Metrics | Total Rows | Unit(s) | Deepest History
Hydrology | streamflow_00060 | 418K | ft³/s | 2026 (real-time only)
Radiation | radiation | 223K | CPM | 1970–2026
Seismic | earthquake_magnitude | 97K | magnitude | 2021–2026
Space Weather | space_kp_index, space_solar_flux_10cm, donki_cme_speed, donki_flare_class, donki_radiation_belt | 37K | Kp, sfu, km/s, class, event | 2026
Temperature | temperature_2m, nasa_power_t2m, geosphere_temperature | 13K | °C | 2023–2026
Drought | drought_area_pct | 13K | % | 2010–2026
Water/Ocean | water_level, wave_height, river_discharge | 5K | m, m, m³/s | 2026
Climate Indices | climate_oni, climate_mei_v2 | 3K | index | 1949–2025
Socioeconomic | wb_SP.POP.TOTL, wb_AG.LND.FRST.ZS, wb_EG.USE.PCAP.KG.OE | 3K | varies | 1960–2022
Air Quality | us_aqi, geosphere_pm25 | 2.2K | AQI, µg/m³ | 2026
Satellite | sar_scene | 1K | MB | 2026

Value Semantics

Each metric's value field means something different:

Metric | Value Meaning | Scale
earthquake_magnitude | Earthquake magnitude as reported by USGS (magnitude type varies by event) | 0–9+ (logarithmic)
streamflow_00060 | Water discharge rate | 0–100K+ ft³/s
radiation | Counts per minute | 0–1000+ CPM
temperature_2m | Air temperature | -40 to +50 °C
us_aqi | Air Quality Index | 0–500 (EPA scale)
space_kp_index | Geomagnetic activity | 0–9 Kp
drought_area_pct | Percent of state in drought | 0–100%
donki_cme_speed | CME velocity | 100–3000 km/s
donki_flare_class | Flare intensity (numeric) | 0.1–10000 (A1 = 0.1, X10 = 10000)
wave_height | Significant wave height | 0–15+ m
water_level | Tidal water level | -2 to +3 m (relative to MLLW)
river_discharge | River flow rate | 0–100K+ m³/s
climate_oni | El Niño/La Niña index | -3 to +3
climate_mei_v2 | Multivariate ENSO Index | -3 to +3
sar_scene | SAR scene file size | 100–2000 MB
geosphere_pm25 | PM2.5 concentration | 0–500 µg/m³
nasa_power_t2m | Daily mean temperature | -40 to +50 °C
wb_SP.POP.TOTL | National population | Millions
wb_AG.LND.FRST.ZS | Forest area percentage | 0–100%
wb_EG.USE.PCAP.KG.OE | Energy use per capita | 0–10000 kg oil equiv
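The numeric scale for donki_flare_class can be reproduced from a class string if each class letter is taken to be ten times the previous one, which is what the two anchor points in the table (0.1 at the bottom of class A, 10000 for X10) imply. The sketch below works under that assumption; the function name and the lookup table are illustrative, not the ingestion code.

```python
# Multipliers implied by the scale above: each class letter is 10x the last.
CLASS_BASE = {"A": 0.1, "B": 1.0, "C": 10.0, "M": 100.0, "X": 1000.0}


def flare_class_to_value(flare_class: str) -> float:
    """Convert a class string such as 'M5.2' to the numeric scale."""
    text = flare_class.strip().upper()
    letter = text[0]
    # A bare letter such as 'M' is treated as 'M1'.
    magnitude = float(text[1:]) if len(text) > 1 else 1.0
    return CLASS_BASE[letter] * magnitude
```

Under this mapping an M5.2 flare becomes roughly 520, so ordinary numeric comparisons and thresholds work across classes.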

Dimension 4: Source (16 active)

Source = provenance. Who measured this, how, and with what instrument.

Source | Slug | Type | Authority
USGS Earthquake | usgs_earthquake | Government | US Geological Survey
USGS Water Services | usgs_water | Government | US Geological Survey
Open-Meteo Weather | open_meteo | Open source | Open-Meteo GmbH
Open-Meteo Air Quality | open_meteo_aqi | Open source | Open-Meteo GmbH
Open-Meteo Flood/River | open_meteo_flood | Open source | GloFAS/Copernicus
Open-Meteo Marine | open_meteo_marine | Open source | Open-Meteo GmbH
Safecast Radiation | safecast | Citizen science | Safecast Foundation
World Bank Climate | world_bank | International org | World Bank Group
USDM Drought Monitor | usdm_drought | Government | NDMC/USDA/NOAA
NOAA Tides & Currents | noaa_tides | Government | NOAA CO-OPS
NASA POWER | nasa_power | Government | NASA Langley
NOAA Climate Indices | noaa_climate_indices | Government | NOAA PSL/CPC
NOAA Space Weather | noaa_space_weather | Government | NOAA SWPC
ASF Sentinel-1 SAR | asf_sentinel | Government | NASA ASF DAAC / ESA
NASA DONKI | nasa_donki | Government | NASA CCMC
GeoSphere Austria | geosphere | Government | GeoSphere Austria

Dimension 5: Extra (in extra_json)

Every observation carries an extra_json TEXT field with source-specific metadata not captured in the normalized columns. This is the "long tail" of dimensions.

Key Extra Fields by Category

Category | Extra Fields Available
Seismic | event_id, place, felt, cdi, mmi, alert, tsunami, sig, depth_km
Space Weather | activity_id, linked_events, instruments, predicted_kp, earth_arrival
Hydrology | site_code, site_name, variable_name
Temperature | city, humidity_pct, wind_speed, precipitation, solar_irradiance
Satellite | scene_id, orbit, path_number, polarization, browse_url, download_url
Drought | state, d0_pct, d1_pct, d2_pct, d3_pct, d4_pct
Socioeconomic | country_code, country_name, indicator_name
Air Quality | pm10, ozone, no2, so2, co

Missing Dimensions (Future)

Altitude/Depth

  • Earthquake depth_km exists in extra_json but not as a column
  • GeoSphere stations have elevation metadata
  • NASA POWER temperature (nasa_power_t2m) is reported at 2 m above ground
  • Ocean data could carry depth
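Promoting earthquake depth to a column would start with pulling depth_km out of extra_json safely. A minimal sketch, assuming depth_km is stored as a number or numeric string under that key; the function name is hypothetical.

```python
import json
from typing import Optional


def depth_km_from_extra(extra_json: str) -> Optional[float]:
    """Return depth_km as a float, or None if absent or unparseable."""
    try:
        value = json.loads(extra_json).get("depth_km")
        return None if value is None else float(value)
    except (ValueError, TypeError, AttributeError):
        # Malformed JSON, non-object JSON, or a non-numeric depth.
        return None
```

Returning None rather than raising lets a backfill pass skip rows with missing or malformed metadata instead of aborting.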

Severity/Significance

  • Earthquakes have sig composite score
  • Drought has D0–D4 categorical levels
  • AQI has EPA category thresholds
  • Space weather has G1–G5 storm scale
  • A universal "significance score" (0–1 normalized) could enable cross-category alerting
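As a starting point, a significance score could be a clamped linear rescaling of each metric's documented range. The sketch below uses ranges from the Value Semantics table; the function name and the choice of linear scaling are assumptions. Linear scaling would be a poor fit for logarithmic metrics such as earthquake magnitude, and categorical scales (D0–D4, G1–G5) would need their own mapping.

```python
# (low, high) ranges copied from the Value Semantics table.
RANGES = {
    "space_kp_index": (0.0, 9.0),
    "us_aqi": (0.0, 500.0),
    "drought_area_pct": (0.0, 100.0),
}


def significance(metric: str, value: float) -> float:
    """Linearly rescale value into [0, 1], clamping out-of-range input."""
    low, high = RANGES[metric]
    score = (value - low) / (high - low)
    return min(1.0, max(0.0, score))
```

With this mapping a Kp of 4.5 scores 0.5 and an AQI above 500 is clamped to 1.0, so a single alert threshold can be applied across categories.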

Forecast vs Observed

  • GeoSphere PM2.5 is a forecast
  • DONKI CME has predicted Kp vs SWPC observed Kp
  • Open-Meteo weather is current observations
  • Distinguishing forecast from observation is important for accuracy analysis

Data Quality

  • USGS streamflow has qualifiers (provisional, approved)
  • Safecast radiation is citizen science (variable quality)
  • USGS earthquake has status (automatic, reviewed)
  • A normalized quality dimension could weight observations in analysis
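To show how a quality dimension might weight observations, the sketch below computes a quality-weighted mean. The flag names come from the bullets above, but the numeric weights, the default for unknown flags, and the function name are hypothetical placeholders that would need calibrating per source.

```python
# Hypothetical weights; the flag names come from the list above.
QUALITY = {
    "approved": 1.0,     # USGS streamflow
    "reviewed": 1.0,     # USGS earthquake
    "provisional": 0.6,  # USGS streamflow
    "automatic": 0.6,    # USGS earthquake
}


def weighted_mean(observations, default_quality=0.5):
    """Quality-weighted mean of (value, quality_flag) pairs."""
    total = 0.0
    weight_sum = 0.0
    for value, flag in observations:
        weight = QUALITY.get(flag, default_quality)
        total += weight * value
        weight_sum += weight
    if weight_sum == 0:
        raise ValueError("no observations with non-zero weight")
    return total / weight_sum
```

Here an approved reading of 10 and a provisional reading of 20 average to 13.75 rather than 15, pulling the result toward the more trusted value.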

Cross-Dimensional Analysis Opportunities

Time × Flavor (Trend Analysis)

  • Temperature trends across cities (NASA POWER, 2023–2026)
  • Earthquake frequency by magnitude band over years
  • Drought severity progression per state (2010–2026)
  • ENSO cycle phase identification from ONI/MEI
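The earthquake-frequency idea above can be sketched as a simple count keyed by year and magnitude band. The function names and the whole-number banding are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def magnitude_band(magnitude: float) -> str:
    """Label a magnitude by its whole-number band, e.g. 5.4 -> 'M5'."""
    return f"M{int(magnitude // 1)}"


def count_by_year_and_band(quakes):
    """Count (timestamp_utc, magnitude) pairs per (year, band)."""
    return Counter((ts.year, magnitude_band(mag)) for ts, mag in quakes)


# Illustrative events with made-up values.
sample = [
    (datetime(2025, 1, 5, tzinfo=timezone.utc), 5.4),
    (datetime(2025, 7, 9, tzinfo=timezone.utc), 5.9),
    (datetime(2026, 2, 1, tzinfo=timezone.utc), 6.1),
]
counts = count_by_year_and_band(sample)
```

The resulting Counter can be read per band across years to see whether, for example, M5 events are becoming more or less frequent within the five years of seismic history.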

Location × Flavor (Spatial Correlation)

  • AQI vs temperature at co-located monitoring points
  • Streamflow vs precipitation in the same watershed
  • Earthquake clustering by geographic region
  • Radiation levels near nuclear facilities vs background
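Pairing "co-located" points requires a distance test between WGS84 coordinates. The sketch below uses the haversine great-circle formula in plain Python; the 5 km default threshold and the function names are assumptions, and inside the database a PostGIS distance query would be the more natural tool.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def co_located(a, b, max_km=5.0):
    """True when two (lat, lon) points lie within max_km of each other."""
    return haversine_km(*a, *b) <= max_km
```

One degree of longitude at the equator comes out at about 111.2 km, which is a convenient sanity check for the formula.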

Time × Location × Flavor (The Full Cube)

  • "During El Niño years, how does California drought severity change?"
  • "Do M5+ earthquake clusters correlate with any preceding seismic pattern?"
  • "When Kp exceeds 5, what happens to GPS accuracy in the following 24 hours?"
  • "Is wildfire-season AQI getting worse year over year in specific cities?"

Flavor × Flavor (Cross-Metric Correlation)

  • ENSO → Drought: ONI values vs drought_area_pct (lagged by 3–6 months)
  • CME → Kp: donki_cme_speed vs space_kp_index (lagged by 1–3 days)
  • Temperature → AQI: nasa_power_t2m vs us_aqi (same-day, same location)
  • Solar Flux → Temperature: space_solar_flux_10cm vs global temperature (11-year cycle)
  • Precipitation → Streamflow: NASA POWER precip vs streamflow_00060 (same watershed)

Normalization Gaps to Address

  1. Geocode backfill — ~400K spatial rows missing country/state/city
  2. Tides need lat/lon — station coordinates available in metadata but not populated
  3. Drought needs lat/lon — state centroid coordinates could be assigned
  4. World Bank needs lat/lon — country centroid coordinates could be assigned
  5. Depth/elevation — promote earthquake depth_km to a column
  6. Forecast flag — add is_forecast boolean to distinguish predictions from observations
  7. Quality score — normalized 0–1 quality based on source-specific flags