Granger V4: Full Environmental Network with CO2, CH4, Tidal Proxy, Pressure
Author: Claude (TerraPulse Paper Machine, Elise)
Status: Draft
Created: 2026-04-06
GitHub Issue: #44
Prior art: workspaces/granger-causality/ (V1-V3)
Motivation
V3 of the Granger-network analysis ran 56 directed pairs across 8 metrics
(earthquakes, temperature, air quality, water level, wave height, sunspots,
solar wind, solar wind Bz) over a 112-day window. One edge survived
Bonferroni correction: Tides -> Waves at lag 2 days.
Issue #44 asks: can we expand to 12+ metrics, bringing in CO2, methane,
a lunar tidal-forcing proxy, and surface pressure, and with 120+ days of
overlap, test whether the carbon cycle couples to any geophysical domain at
daily timescales?
Honesty audit
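Of the four new metrics requested, only two have enough daily data in the
analysis window (2025-09-15 to 2026-04-05, 203 days) to test. The audit also
covered two further candidates, dst_index and solar_xray_flux, which are
likewise excluded: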
| Metric | Days available | Verdict |
|---|---|---|
| `co2_daily` | 146 | Eligible (non-stationary, first-differenced) |
| `lunar_tidal_proxy` | 203 | Eligible |
| `surface_pressure_hpa` | 11 | Excluded (sensor only began reporting 26 Mar) |
| `ch4_monthly` | 23 (monthly) | Excluded (monthly cadence, not daily) |
| `dst_index` | 29 | Excluded (backfill only covers March 2026) |
| `solar_xray_flux` | 20 | Excluded (5-min cadence but only since 17 Mar) |
After this audit, we test a 13-metric daily network (V3's 8 plus
streamflow, earthquake max magnitude, solar wind speed, CO2, lunar tidal):
that is 13 × 12 = 156 directed pairs. Every one of them has >=60 days of
co-temporal overlap; every one gets the same joint F-test at lag 7.
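The pair count, the overlap requirement, and the resulting Bonferroni threshold can be checked with a few lines of standard-library Python. This is a minimal sketch, not the code in scripts/extract.py: `eligible_pairs`, `day_range`, and the `coverage` mapping are hypothetical names, and the calendar days assigned to each metric are an assumption made for illustration (only the day counts come from the audit table).

```python
from datetime import date, timedelta
from itertools import permutations

MIN_OVERLAP_DAYS = 60  # overlap threshold quoted in the text


def eligible_pairs(dates_by_metric, min_overlap=MIN_OVERLAP_DAYS):
    """Return directed (source, target) pairs whose series share enough days."""
    pairs = []
    for src, dst in permutations(sorted(dates_by_metric), 2):
        overlap = dates_by_metric[src] & dates_by_metric[dst]
        if len(overlap) >= min_overlap:
            pairs.append((src, dst))
    return pairs


def day_range(start, n_days):
    """Set of n_days consecutive calendar dates beginning at start."""
    return {start + timedelta(days=i) for i in range(n_days)}


window_start = date(2025, 9, 15)
# Toy coverage. Day counts follow the audit table; which days are covered
# is assumed (contiguous runs) purely for illustration.
coverage = {
    "lunar_tidal_proxy": day_range(window_start, 203),
    "co2_daily": day_range(window_start, 146),
    "surface_pressure_hpa": day_range(date(2026, 3, 26), 11),
}

pairs = eligible_pairs(coverage)

# 13 eligible metrics give 13 * 12 ordered pairs.
n_directed_13 = len(list(permutations(range(13), 2)))
alpha_corrected = 0.05 / n_directed_13
```

In this toy setup only the two pairs between `co2_daily` and `lunar_tidal_proxy` pass the overlap check; every pair involving the pressure series is dropped before any test is attempted, which is why such pairs do not enter the Bonferroni denominator.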
Method
- Data extraction (`scripts/extract.py`): daily aggregates from
PostgreSQL over 2025-09-15 through 2026-04-05. Long-format parquet.
- Stationarity (ADF): five metrics (co2, streamflow, temperature_2m,
us_aqi, water_level) fail ADF at alpha=0.05 and are first-differenced.
- Granger joint F-test at a fixed max lag of 7 days is the primary
inference. We also report a per-pair lag scan (1..7) for sensitivity,
but that is never the test used to claim significance.
- Bonferroni correction: alpha_corrected = 0.05 / 156 = 3.2e-4.
- Shuffle-null: permute target series, rerun joint test, count
Bonferroni survivors; expected near 0.
- AR(7) self-loop: R^2 of each series predicted from its own past.
- Strongly-connected components on the surviving edge set.
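The joint F-test and the shuffle-null check described above can be sketched as follows. This is an illustrative reimplementation with NumPy and SciPy under stated assumptions, not the code in scripts/analyze.py: the function names, the synthetic series, and the random seed are all hypothetical. Non-stationary series would be first-differenced (for example with `np.diff`) before the call; the ADF step itself is not shown.

```python
import numpy as np
from scipy import stats


def lag_matrix(series, lag):
    """Columns are the series shifted by 1..lag steps, aligned with series[lag:]."""
    n = len(series)
    return np.column_stack([series[lag - k : n - k] for k in range(1, lag + 1)])


def granger_joint_f(source, target, lag=7):
    """Joint F-test that all `lag` past values of `source` add nothing to an
    AR(lag) model of `target`. Returns (F statistic, p-value)."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    y = target[lag:]
    ones = np.ones((len(y), 1))
    restricted = np.hstack([ones, lag_matrix(target, lag)])   # own past only
    full = np.hstack([restricted, lag_matrix(source, lag)])   # plus source past

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        return float(resid @ resid)

    rss_r, rss_f = rss(restricted), rss(full)
    df_num = lag
    df_den = len(y) - full.shape[1]
    f_stat = ((rss_r - rss_f) / df_num) / (rss_f / df_den)
    return f_stat, float(stats.f.sf(f_stat, df_num, df_den))


def shuffle_null_survivors(series_by_name, alpha, lag=7, seed=0):
    """Permute each target series, rerun the joint test on every directed
    pair, and count how many p-values still fall below alpha."""
    rng = np.random.default_rng(seed)
    survivors = 0
    for src, x in series_by_name.items():
        for dst, y in series_by_name.items():
            if src == dst:
                continue
            _, p = granger_joint_f(x, rng.permutation(y), lag)
            survivors += int(p < alpha)
    return survivors


# Synthetic demonstration (assumed data): y depends on x two days earlier.
rng = np.random.default_rng(44)
n = 201
x = rng.standard_normal(n)
y = rng.standard_normal(n)
y[2:] += 0.8 * x[:-2]

f_xy, p_xy = granger_joint_f(x, y)   # should be strongly significant
f_yx, p_yx = granger_joint_f(y, x)   # no true dependence in this direction
alpha_corrected = 0.05 / 156
survivors = shuffle_null_survivors({"x": x, "y": y}, alpha_corrected)
```

Permuting the target destroys both its cross-dependence and its own autocorrelation, so a survivor count near zero indicates only that the corrected threshold is not producing false positives on structureless data.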
Results
Eligible vs excluded
- Eligible: 13 metrics, 156 directed pairs.
- Excluded: 4 metrics (`surface_pressure`, `dst_index`, `solar_xray_flux`,
`ch4_monthly`): too few days of daily data in the analysis window.
Their pairs are not counted toward the Bonferroni denominator because
we never attempted them.
Primary inference (joint test at lag 7)
- 14 nominal significant edges (p < 0.05).
- 3 survive Bonferroni correction at alpha_corr = 3.2e-4:
  - `streamflow -> earthquakes_n` (F=21.19, p≈0, N=201)
  - `earthquakes_n -> temperature_2m` (F=5.62, p=7e-6, N=201)
  - `water_level -> wave_height` (F=5.62, p=7e-6, N=201)
- Shuffle-null survivors: 0.
- Strongly-connected components with more than one node: none.
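The absence of multi-node strongly-connected components follows directly from the three surviving edges, since none of them closes a cycle. A minimal standard-library sketch using Kosaraju's algorithm is shown below; the function name and the extra "cyclic" edge are hypothetical and serve only to show what a non-trivial component would look like. The report does not state which algorithm or library scripts/analyze.py uses.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Kosaraju's algorithm. Returns a list of sets of node names."""
    graph, reverse = defaultdict(list), defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        reverse[dst].append(src)
        nodes.update((src, dst))

    # First pass: record nodes in order of depth-first completion.
    order, seen = [], set()

    def visit(node):
        seen.add(node)
        for nxt in graph[node]:
            if nxt not in seen:
                visit(nxt)
        order.append(node)

    for node in sorted(nodes):
        if node not in seen:
            visit(node)

    # Second pass: sweep the reversed graph in reverse completion order.
    components, assigned = [], set()

    def collect(node, comp):
        assigned.add(node)
        comp.add(node)
        for prev in reverse[node]:
            if prev not in assigned:
                collect(prev, comp)

    for node in reversed(order):
        if node not in assigned:
            comp = set()
            collect(node, comp)
            components.append(comp)
    return components


# The three Bonferroni survivors reported above.
surviving = [
    ("streamflow", "earthquakes_n"),
    ("earthquakes_n", "temperature_2m"),
    ("water_level", "wave_height"),
]
multi_node = [c for c in strongly_connected_components(surviving) if len(c) > 1]

# Hypothetical extra edge (not a result) that would close a feedback loop.
cyclic = surviving + [("temperature_2m", "streamflow")]
multi_node_cyclic = [c for c in strongly_connected_components(cyclic) if len(c) > 1]
```

With the reported edges `multi_node` is empty; only the hypothetical closing edge produces a three-node component.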
CO2 and lunar tidal results
- The carbon cycle (`co2_daily`, first-differenced) does not Granger-predict
any of the 12 other metrics at Bonferroni significance. The nearest approach
involving CO2 runs in the reverse direction: `earthquakes_n -> co2_daily` at
p=0.037 (does not survive correction).
- `lunar_tidal_proxy` does not Granger-predict earthquakes, water level, or
space weather indices at Bonferroni significance. Two nominal-only edges:
`lunar_tidal -> earthquake_mag_max` (p=0.013) and
`lunar_tidal -> temperature_2m` (p=0.030). Both vanish after correction.
- This is a clean null for the CO2 - geophysics coupling hypothesis
at daily time scales and a near-null for lunar tidal forcing.
Self-predictability (AR(7) R^2)
Hub-like metrics with high self-predictability:
- `lunar_tidal`: 1.00 (deterministic astronomy; sanity check)
- `sunspot_number`: 0.87
- `earthquakes_n`: 0.75
- `solar_wind_speed`: 0.55
Metrics with low self-predictability (earthquake_mag_max, solar_wind_bz,
temperature_2m after differencing) are near-white: their futures are
approximately unrelated to their own pasts at daily resolution.
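The contrast between a deterministic series and a near-white one can be reproduced with a small NumPy sketch. It fits an AR(7) model by ordinary least squares and reports the in-sample R^2; whether scripts/analyze.py uses in-sample R^2 is an assumption, and the function name, the 29.53-day sinusoid standing in for a lunar proxy, and the seed are hypothetical.

```python
import numpy as np


def ar_r2(series, order=7):
    """In-sample R^2 of an AR(order) least-squares fit with intercept."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    y = series[order:]
    lags = np.column_stack(
        [series[order - k : n - k] for k in range(1, order + 1)]
    )
    design = np.hstack([np.ones((len(y), 1)), lags])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


t = np.arange(203)
# Assumed stand-in for a deterministic astronomical signal: one sinusoid
# with a period of about one synodic month.
periodic = np.sin(2 * np.pi * t / 29.53)
# White noise has no memory, so its own past explains very little.
noise = np.random.default_rng(7).standard_normal(203)

r2_periodic = ar_r2(periodic)
r2_noise = ar_r2(noise)
```

A pure sinusoid obeys an exact two-term linear recursion, so its R^2 is 1 up to rounding, which is the same sanity check the lunar proxy provides above. The white-noise R^2 stays small but above zero because seven free coefficients always absorb some in-sample variance.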
Sensitivity to lag choice
| Lag (days) | Nominal edges (p < 0.05) | Bonferroni survivors |
|---|---|---|
| 1 | 15 | 1 |
| 3 | 11 | 2 |
| 5 | 15 | 2 |
| 7 | 14 | 3 |
The three edges surviving at lag 7 all connect ocean or terrestrial surface
metrics; none involves CO2, the lunar proxy, or space weather. The
streamflow -> earthquakes edge is the strongest and most robust, but it is
very unlikely to reflect causation. Streamflow
responds to rainfall which correlates with barometric pressure fronts; the
earthquake count is a global daily tally dominated by Pacific Rim activity.
What Granger is flagging is almost certainly a shared seasonal / storm-track
confounder.
Mechanistic interpretation of the 3 surviving edges
- water_level -> wave_height (lag 7): tidal residuals and storm surge
share a common driver (coastal wind stress and pressure), and the NOAA
tide-gauge series leaks wave-induced setup information. Physically plausible.
- streamflow -> earthquakes_n: spurious, seasonal confounder. There is no
physical mechanism by which flow on US rivers, as measured by USGS gauges,
could drive the global earthquake catalog.
- earthquakes_n -> temperature_2m: also spurious. Global daily
earthquake counts vary mildly with season because of how detection
thresholds interact with station uptime; temperature_2m (after
differencing) covaries with synoptic weather on similar time scales.
Conclusion
Adding CO2 and a lunar tidal-potential proxy to the TerraPulse Granger
network does not produce new Bonferroni-surviving cross-domain edges.
The carbon cycle at daily cadence is uncoupled (in the predictive sense)
from the geophysical and space-weather metrics in our inventory. Lunar
tidal forcing does not Granger-cause earthquake counts or max magnitudes
over this 203-day window.
Of the four new metrics issue #44 requested, two (surface_pressure,
ch4_monthly) do not yet have enough daily data to test, and two (co2_daily,
lunar_tidal_proxy) are testable. The additionally audited dst_index has only
29 days. A follow-up version (V5) should revisit after 6+ months of
surface-pressure data have accumulated.
Artifacts
- `scripts/extract.py`: SQL to parquet, honest coverage audit.
- `scripts/analyze.py`: ADF, differencing, joint and lag-scan Granger,
shuffle-null, AR(7), SCC, Bonferroni.
- `scripts/visualize.py`: network PNG, heatmap PNG, Plotly HTML.
- `data/timeseries.parquet`: long-format daily series.
- `data/coverage.json`: per-metric and pairwise overlap audit.
- `data/results.json`: full pair-test output.
- `paper/paper.tex`: revtex4-2 PRD twocolumn manuscript.
- `www/granger-network-v4.html`: interactive network.
- `www/granger-heatmap-v4.html`: interactive p-value matrix.
References
- Granger, C. W. J. (1969). Investigating causal relations by econometric
models and cross-spectral methods. Econometrica, 37(3), 424-438.
- Dickey, D. A. & Fuller, W. A. (1979). Distribution of the estimators for
autoregressive time series with a unit root. JASA, 74, 427-431.
- Prior workspace: `workspaces/granger-causality/` (V1 hourly, V2 daily
190-day, V3 daily 112-day).
Published: 2026-04-06 · Updated: 2026-04-06
Data files: coverage.json, results.json, timeseries.parquet
Scripts: analyze.py, extract.py, visualize.py