Granger V4: Full Environmental Network with CO2, CH4, Tidal Proxy, Pressure

Author: Claude (TerraPulse Paper Machine, Elise)

Status: Draft

Created: 2026-04-06

GitHub Issue: #44

Prior art: workspaces/granger-causality/ (V1-V3)

Motivation

V3 of the Granger-network analysis ran 56 directed pairs across 8 metrics

(earthquakes, temperature, air quality, water level, wave height, sunspots,

solar wind, solar wind Bz) over a 112-day window. One edge survived

Bonferroni correction: Tides -> Waves at lag 2 days.

Issue #44 asks whether we can expand to 12+ metrics by bringing in CO2,
methane, a lunar tidal-forcing proxy, and surface pressure, and then, with
120+ days of overlap, test whether the carbon cycle couples to any
geophysical domain at daily timescales.

Honesty audit

Issue #44 requested four new metrics; the audit also covered two further
candidates (dst_index, solar_xray_flux). Of these six, only two have enough
daily data in the analysis window (2025-09-15 to 2026-04-05, 203 days) to test:

Metric	Days available	Verdict
`co2_daily`	146	Eligible (non-stationary, first-differenced)
`lunar_tidal_proxy`	203	Eligible
`surface_pressure_hpa`	11	Excluded (sensor only began reporting 26 Mar)
`ch4_monthly`	23 (monthly)	Excluded (monthly cadence, not daily)
`dst_index`	29	Excluded (backfill only covers March 2026)
`solar_xray_flux`	20	Excluded (5-min cadence but only since 17 Mar)

After this audit, we test a 13-metric daily network (V3's 8 plus
streamflow, earthquake max magnitude, solar wind speed, CO2, and lunar
tidal), which gives 13 × 12 = 156 directed pairs. Every pair has >=60 days
of co-temporal overlap, and every pair gets the same joint F-test at lag 7.
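
The eligibility rule and the pair count can be sketched as follows. This is a minimal illustration, not the code in scripts/extract.py: the window dates, the 60-day threshold, and the day counts come from the text, while the placement of the 146 co2_daily days inside the window and the helper names (`day_range`, `eligible_pairs`) are assumptions made for the example.

```python
from datetime import date, timedelta
from itertools import permutations

WINDOW_START, WINDOW_END = date(2025, 9, 15), date(2026, 4, 5)
MIN_OVERLAP_DAYS = 60  # threshold quoted in the text


def day_range(start, end):
    """Inclusive set of calendar days from start to end."""
    return {start + timedelta(days=i) for i in range((end - start).days + 1)}


window = day_range(WINDOW_START, WINDOW_END)

# Day counts follow the audit table; WHICH days co2_daily covers is assumed.
coverage = {
    "lunar_tidal_proxy": window,  # 203 days
    "co2_daily": day_range(WINDOW_END - timedelta(days=145), WINDOW_END),  # 146 days
    "surface_pressure_hpa": day_range(date(2026, 3, 26), WINDOW_END),  # 11 days
}


def eligible_pairs(coverage, min_overlap=MIN_OVERLAP_DAYS):
    """Directed (source, target) pairs whose shared days meet the threshold."""
    return [
        (a, b)
        for a, b in permutations(coverage, 2)
        if len(coverage[a] & coverage[b]) >= min_overlap
    ]


pairs = eligible_pairs(coverage)

# Directed pairs among 13 metrics: 13 * 12.
n_directed_13 = len(list(permutations(range(13), 2)))
```

In this toy inventory only the two directions between co2_daily and lunar_tidal_proxy pass the overlap rule; every pair touching surface_pressure_hpa shares at most 11 days and drops out.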

Method

Data extraction (scripts/extract.py): daily aggregates from

PostgreSQL over 2025-09-15 through 2026-04-05. Long-format parquet.

Stationarity (ADF): five metrics (co2, streamflow, temperature_2m,

us_aqi, water_level) fail ADF at alpha=0.05 and are first-differenced.

Granger joint F-test at a fixed max lag of 7 days is the primary

inference. We also report a per-pair lag scan (1..7) for sensitivity,

but that is never the test used to claim significance.

Bonferroni correction: alpha_corrected = 0.05 / 156 = 3.2e-4.
Shuffle-null: permute target series, rerun joint test, count

Bonferroni survivors; expected near 0.

AR(7) self-loop: R^2 of each series predicted from its own past.
Strongly-connected components on the surviving edge set.
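
As an illustration of the joint test and the shuffle-null described above, here is a minimal NumPy sketch. It is not the code in scripts/analyze.py: the function names are hypothetical, the series are synthetic, and only the F statistic is computed. Converting it to a p-value needs the upper tail of the F distribution (for example `scipy.stats.f.sf`), which is left out to keep the example dependency-light.

```python
import numpy as np


def lagged(series, lag):
    """Columns series[t-1], ..., series[t-lag] for t = lag .. len-1."""
    s = np.asarray(series, dtype=float)
    return np.column_stack([s[lag - k:len(s) - k] for k in range(1, lag + 1)])


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def granger_joint_f(source, target, lag=7):
    """F statistic for H0: no lag of source helps beyond target's own lags."""
    y = np.asarray(target, dtype=float)[lag:]
    const = np.ones((len(y), 1))
    restricted = np.hstack([const, lagged(target, lag)])
    unrestricted = np.hstack([restricted, lagged(source, lag)])
    rss_r, rss_u = rss(restricted, y), rss(unrestricted, y)
    df_num, df_den = lag, len(y) - unrestricted.shape[1]
    f_stat = ((rss_r - rss_u) / df_num) / (rss_u / df_den)
    return f_stat, df_num, df_den


def shuffle_null_f(source, target, lag=7, n_shuffles=200, seed=0):
    """F statistics after permuting the target, which breaks temporal coupling."""
    rng = np.random.default_rng(seed)
    return np.array([
        granger_joint_f(source, rng.permutation(target), lag)[0]
        for _ in range(n_shuffles)
    ])


# Synthetic check: y depends on x two days earlier; x does not depend on y.
rng = np.random.default_rng(42)
x = rng.normal(size=203)
y = np.zeros(203)
y[2:] = 0.8 * x[:-2]
y += 0.5 * rng.normal(size=203)

f_xy, df_num, df_den = granger_joint_f(x, y)
f_yx, _, _ = granger_joint_f(y, x)
null_f = shuffle_null_f(x, y)
```

On the synthetic pair, the x -> y statistic should sit far above both the reverse direction and every shuffled replicate, which is the behaviour the shuffle-null is meant to confirm.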

Results

Eligible vs excluded

Eligible: 13 metrics, 156 directed pairs.
Excluded: 4 metrics (surface_pressure, dst_index, solar_xray_flux,

ch4_monthly): too few days of daily data in the analysis window.

Their pairs are not counted toward the Bonferroni denominator because

we never attempted them.

Primary inference (joint test at lag 7)

14 nominal significant edges (p < 0.05).
3 survive Bonferroni correction at alpha_corr = 3.2e-4:
streamflow -> earthquakes_n (F=21.19, p≈0, N=201)
earthquakes_n -> temperature_2m (F=5.62, p=7e-6, N=201)
water_level -> wave_height (F=5.62, p=7e-6, N=201)
Shuffle-null survivors: 0.
Strongly-connected components with more than one node: none.
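
The component check on the surviving edge set can be reproduced with a short, dependency-free sketch. Kosaraju's algorithm is used here for illustration; the text does not say which SCC routine scripts/analyze.py uses, and the function name is hypothetical.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Kosaraju's algorithm; edges is an iterable of (source, target) pairs."""
    graph, reverse, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in edges:
        graph[u].append(v)
        reverse[v].append(u)
        nodes.update((u, v))

    def dfs(start, adjacency, seen, out):
        # Iterative depth-first search that appends nodes in post-order.
        stack = [(start, iter(adjacency[start]))]
        seen.add(start)
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(adjacency[nxt])))
                    break
            else:
                stack.pop()
                out.append(node)

    # Pass 1: finishing order on the original graph.
    order, seen = [], set()
    for node in sorted(nodes):
        if node not in seen:
            dfs(node, graph, seen, order)

    # Pass 2: sweep the reversed graph in decreasing finishing order.
    components, seen = [], set()
    for node in reversed(order):
        if node not in seen:
            component = []
            dfs(node, reverse, seen, component)
            components.append(sorted(component))
    return components


# The three Bonferroni-surviving edges reported above.
surviving = [
    ("streamflow", "earthquakes_n"),
    ("earthquakes_n", "temperature_2m"),
    ("water_level", "wave_height"),
]
components = strongly_connected_components(surviving)
multi_node = [c for c in components if len(c) > 1]
```

With these three edges every component is a single node, because no edge closes a cycle; a feedback edge such as temperature_2m -> streamflow (hypothetical) would merge three nodes into one component.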

CO2 and lunar tidal results

The carbon cycle (co2_daily, first-differenced) does not Granger-predict

any of the 12 other metrics at Bonferroni significance. Nearest approach:

earthquakes_n -> co2_daily at p=0.037 (does not survive correction).

lunar_tidal_proxy does not Granger-predict earthquakes, water level, or
space weather indices at Bonferroni significance. Two nominal-only edges:
`lunar_tidal -> earthquake_mag_max` (p=0.013) and
`lunar_tidal -> temperature_2m` (p=0.030). Both vanish after correction.

This is a clean null for the CO2 - geophysics coupling hypothesis

at daily time scales and a near-null for lunar tidal forcing.
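
As a worked check of the correction arithmetic, the reported p-values can be compared against the Bonferroni threshold directly. The p-values below are copied from the text; the streamflow edge is omitted because its p-value is reported only as approximately 0, and the variable names are illustrative.

```python
ALPHA = 0.05
N_PAIRS = 13 * 12  # directed pairs among 13 metrics
alpha_corrected = ALPHA / N_PAIRS  # about 3.2e-4

# p-values as reported in the text.
reported = {
    "earthquakes_n -> temperature_2m": 7e-6,
    "water_level -> wave_height": 7e-6,
    "earthquakes_n -> co2_daily": 0.037,
    "lunar_tidal -> earthquake_mag_max": 0.013,
    "lunar_tidal -> temperature_2m": 0.030,
}

survivors = sorted(e for e, p in reported.items() if p < alpha_corrected)
nominal_only = sorted(
    e for e, p in reported.items() if alpha_corrected <= p < ALPHA
)
```

The CO2 and lunar tidal edges land in `nominal_only`: each clears p < 0.05 but misses the corrected threshold by roughly two orders of magnitude.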

Self-predictability (AR(7) R^2)

Hub-like metrics with high self-predictability:

lunar_tidal: 1.00 (deterministic astronomy; sanity check)
sunspot_number: 0.87
earthquakes_n: 0.75
solar_wind_speed: 0.55

Metrics with low self-predictability (earthquake_mag_max, solar_wind_bz,

temperature_2m after differencing) are near-white: their futures are

approximately unrelated to their own pasts at daily resolution.
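
A minimal sketch of the self-predictability score, assuming an in-sample least-squares AR(7) fit with an intercept (the text does not state whether scripts/analyze.py scores in-sample or out-of-sample). The sinusoid is a hypothetical stand-in for the lunar proxy, not the actual series, and `ar_r2` is an illustrative name.

```python
import numpy as np


def ar_r2(series, order=7):
    """In-sample R^2 of an AR(order) least-squares fit with an intercept."""
    s = np.asarray(series, dtype=float)
    y = s[order:]
    lags = [s[order - k:len(s) - k] for k in range(1, order + 1)]
    X = np.column_stack([np.ones(len(y))] + lags)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


t = np.arange(203)
# Deterministic periodic series with a 29.53-day period (assumed stand-in).
tide_like = np.sin(2 * np.pi * t / 29.53)
# White noise: the past carries no information about the future.
noise = np.random.default_rng(0).normal(size=203)

r2_tide = ar_r2(tide_like)
r2_noise = ar_r2(noise)
```

A pure sinusoid obeys an exact two-term linear recurrence, so its AR(7) R^2 is 1 up to rounding, matching the sanity-check role of lunar_tidal above; white noise stays close to 0.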

Sensitivity to lag choice

Lag	Nominal	Bonferroni
1	15	1
3	11	2
5	15	2
7	14	3

The Bonferroni survivor count rises from 1 at lag 1 to 3 at lag 7, while the
nominal count stays between 11 and 15. The three edges surviving at lag 7
all connect terrestrial or ocean metrics; none involves CO2, lunar tidal
forcing, or space weather. The streamflow -> earthquakes edge is the
strongest and most robust, but it is very unlikely to reflect causation.
Streamflow responds to rainfall, which correlates with barometric pressure
fronts; the earthquake count is a global daily tally dominated by Pacific
Rim activity. What Granger is flagging is almost certainly a shared
seasonal / storm-track confounder.

Mechanistic interpretation of the 3 surviving edges

water_level -> wave_height (lag 7): tidal residuals and storm surge

share a common driver (coastal wind stress and pressure), and the NOAA

tide-gauge series leaks wave-induced setup information. Physically plausible.

streamflow -> earthquakes_n: spurious, seasonal confounder. There is no
physical mechanism by which flow on US rivers, as measured by USGS gauges,
could drive the global earthquake catalog.

earthquakes_n -> temperature_2m: also spurious. Global daily

earthquake counts vary mildly with season because of how detection

thresholds interact with station uptime; temperature_2m (after

differencing) covaries with synoptic weather on similar time scales.

Conclusion

Adding CO2 and a lunar tidal-potential proxy to the TerraPulse Granger

network does not produce new Bonferroni-surviving cross-domain edges.

The carbon cycle at daily cadence is uncoupled (in the predictive sense)

from the geophysical and space-weather metrics in our inventory. Lunar

tidal forcing does not Granger-cause earthquake counts or max magnitudes
at Bonferroni significance over this 203-day window.

Of the four new metrics issue #44 requested, two (surface_pressure,
ch4_monthly) do not yet have enough daily data to test, and two (co2_daily,
lunar_tidal_proxy) are testable. The two further candidates audited,
dst_index (29 days) and solar_xray_flux (20 days), are also too short to
test. A follow-up version (V5) should revisit after 6+ months of
surface-pressure data have accumulated.

Artifacts

scripts/extract.py: SQL to parquet, honest coverage audit.
scripts/analyze.py: ADF, differencing, joint and lag-scan Granger,

shuffle-null, AR(7), SCC, Bonferroni.

scripts/visualize.py: network PNG, heatmap PNG, Plotly HTML.
data/timeseries.parquet: long-format daily series.
data/coverage.json: per-metric and pairwise overlap audit.
data/results.json: full pair-test output.
paper/paper.tex: revtex4-2 PRD twocolumn manuscript.
www/granger-network-v4.html: interactive network.
www/granger-heatmap-v4.html: interactive p-value matrix.

References

Granger, C. W. J. (1969). Investigating causal relations by econometric

models and cross-spectral methods. Econometrica, 37(3), 424-438.

Dickey, D. A. & Fuller, W. A. (1979). Distribution of the estimators for

autoregressive time series with a unit root. JASA, 74, 427-431.

Prior workspace: workspaces/granger-causality/ (V1 hourly, V2 daily

190-day, V3 daily 112-day).