Dream Papers — the legendary tier

Created 2026-05-20. Long-horizon, decisive-scale paper concepts that aim past "publishable" toward "definitive." Companion to paper-revisits.md, paper-coverage-gaps.md, and research-arcs-and-idle-projects.md.

Why this file exists

TerraPulse's day-to-day pipeline produces ~7 papers per Elise→Mike assembly-line session: tight scope, clean nulls, single-source feature pieces. That's the bread-and-butter and it's working.

This file is for the other thing — the papers whose ambition matches the platform's actual reach. We ingest 113+ metrics across 129 fetchers spanning radio propagation (back to 2008), atmospheric soundings (back to 1958), seismicity, magnetic-field positions (back to 1589 in historical reconstructions), sunspots (SILSO, back to 1817), neutron monitors (NMDB, live but currently 4 NH stations), tornado records (back to 1950), hydrology, lightning, wildfires, gravitational-wave events (GWOSC, back to 2005), and the entire NWS alert catalog. A handful of those assets in combination can address questions that have been argued in the literature for decades.

A dream paper:

Tests a contested or long-standing question at decisive scale.
Either result (positive or null) reshapes how the question is asked.
Has a 5-50 year time horizon of data, not 5-50 weeks.
Could be cited continuously for a decade.
Has natural V2/V3 trigger conditions so the contribution compounds.

We are not aiming for "good." We are aiming for awe-inspiring scope — the work that someone five years from now points at when describing what TerraPulse is for.

How a pitch graduates to the active queue

A pitch in this file is not the same as a paper queued in paper-coverage-gaps.md. To promote:

Power check. Apply the feedback_paper_queueing_skepticism memory rule. If the detection floor at full available N still doesn't beat the smallest effect the literature reports, the pitch stays a pitch.
Trigger condition met. Some pitches are infra-gated (need fetcher work), some are calendar-gated (need more data), some are ready now.
Methodology pre-registered. Methods lock before extraction code is written. If the V1 has natural V2/V3 hooks, add a paper-revisits.md entry at promotion time.
Scope fit. Confirm the pitch is one paper, not a sub-project that needs decomposition into 3.

Promoted pitches stay in this file with status promoted (GitHub issue number recorded) so the chain is visible.

Per-pitch skeleton

ID — DP-NNN, stable forever
Headline question — one sentence
Why legendary — the awe-inspiring framing in 2-3 sentences
Data inputs — sources, expected N, time horizon
Methodology sketch — broad strokes; full pre-registration happens at promotion
Power / detection floor — honest math
Feasibility tier — HEADROOM / NULL-LIKELY / STRUCTURAL / Infra-gated
Promotion trigger — what changes to move this to the queue
Risks & failure modes — most likely ways this disappoints
Companion artifacts — feature article? explainer? interactive viz? press piece?
V2/V3 hooks — pre-committed extensions for after V1 ships
Status — pitch / scoped / promoted / shipped / abandoned

Active pitches

DP-001 — The IGRA × SPC century test

Headline question: Can yesterday's atmospheric sounding predict today's tornado outbreak, and how does that predictive skill vary by region and decade?
Why legendary: This is the definitive version of a question every operational meteorologist has been asking since the 1950s. Past studies have tested fragments — one region, one decade, one parameter — but no one has put the entire IGRA archive (1958-present, twice-daily, ~1500 stations globally, ~700 active US) against the entire SPC tornado record (1950-present) with fully reproducible analysis. The result is either the largest empirical confirmation of CAPE-based forecasting ever published, or a stratification map of where and when sounding-based prediction stops working — and both become canonical references for the next decade.
Data inputs:
- IGRA soundings 1958-2026: ~50,000 sounding-days per US station × ~700 US stations = ~35M station-days (already in PG after 102M-row backfill)
- SPC tornado record 1950-2026: ~80K tornadoes
- Overlap window 1958-2026: 68 years
- Derived parameters: CAPE, LI, helicity, 0-6km shear, EHI, precipitable water
Methodology sketch: For each tornado day at a given county, identify the spatially-nearest IGRA station whose 12Z (morning) sounding falls within the prior 24 hours. Compute the panel of sounding parameters. Fit ROC curves for outbreak prediction at 6h, 12h, 24h lead times. Stratify by region (5 NWS regions), decade (7 decades), and storm-mode (supercell vs QLCS where known). Compare regional skill curves over time.
Power / detection floor: Literature CAPE→outbreak effect sizes are large (Cohen's d > 0.5 typically) on station-scale samples of ~10³. With N in the millions across decades, detection floor for regional/decadal differences is tiny (d < 0.05 detectable). The challenge is not power; it is study design and the SPC underreporting bias before ~1990.
Feasibility tier: HEADROOM.
Promotion trigger: Ready to promote now. Bounded only by an honest scope decision: one paper or a 3-paper series (V1 = national skill curve over decades; V2 = regional breakdown; V3 = storm-mode stratification)?
Risks & failure modes:
- SPC pre-1990 underreporting may force restriction to 1990-2026 (still 36 years).
- Storm-mode metadata is incomplete — mode-stratified analyses may have to restrict to post-2003.
- "Definitive" framing requires careful comparison to existing operational verification studies (NSSL, SPC's own).
Companion artifacts: /articles/-graduatable feature paper with interactive lead-time-by-region map. Press piece anchored on the regional-decadal shift. Explainer in docs/paper-explainers/ for general audience.
V2/V3 hooks:
- V2: same analysis with composite sounding parameters (SCP, STP, EHI) instead of single-variable.
- V3: replicate with ECMWF model soundings for forecast-skill rather than analysis-skill.
Status: pitch. Front-runner for first dream-paper promotion.

DP-002 — Tornado Alley is moving east

Headline question: At what velocity, in which direction, and with what variance has the centroid of US tornado activity moved decade-by-decade since 1950?
Why legendary: The eastward-shift claim is in popular press (NYT, WaPo, Atlantic) and contested in the literature (Gensini & Brooks 2018, Agee et al. 2016), but the version that gets cited everywhere is a single-decade comparison or a single-statistic summary. This paper publishes the full annual trajectory — centroid (lat, lon), variance ellipse (major axis, minor axis, rotation angle), and tornado-day-frequency map — as one continuous record with an interactive slider. Definitive enough that it becomes "the figure" everyone re-uses.
Data inputs:
- SPC tornado record 1950-2025: ~80K tornadoes with lat/lon, EF/F rating, date, path length
- US Census population density 1950-2020 (decadal) for observer-bias correction
Methodology sketch: Annual centroid as path-length-weighted geographic mean (path-length weighting upweights long-track tornadoes which are less subject to reporting bias). Annual variance ellipse via 2D Gaussian fit. Track centroid trajectory; fit linear trend (lat/yr, lon/yr) over the full record and rolling 30-year windows. Test for change-point in trajectory via BCP or similar. Repeat with raw counts (no path-length weighting) for sensitivity.
Power / detection floor: Centroid uncertainty at N=1200/year is small (~5km in each axis). Detectable trajectory velocities are well below the 10-20 km/year that popular press has reported.
Feasibility tier: HEADROOM. Data is ingested and clean.
Promotion trigger: Ready now. Could be a 2-week paper if scoped right.
Risks & failure modes:
- Observer bias before ~1990 may dominate the apparent shift. Path-length weighting helps but doesn't eliminate.
- "Moving east" framing may turn out to be partly a Texas-Oklahoma decline rather than an eastward expansion — the paper has to be honest about that decomposition.
Companion artifacts: Showcase /articles/ feature with an animated slider showing the centroid + variance ellipse year by year. The visually-arresting TerraPulse paper.
V2/V3 hooks:
- V2: parallel analysis for severe-hail and severe-wind reports.
- V3: couple to IGRA CAPE climatology to ask whether the centroid shift tracks the CAPE climatology shift.
Status: scoped (2026-06-18). Promoted; pre-registration frozen at docs/scope-tornado-alley-east.md, workspace tornado-alley-east. Headline population = EF2+ (Mike's call) to defeat the weak-tornado reporting-inflation confound; EF1+/All shown as the bias-sensitivity ladder. Cluster bootstrap by convective day for independence.

DP-003 — The cosmic-ray weather coupling

Headline question: Across 30+ years of neutron-monitor data and global atmospheric reanalysis, does cosmic-ray flux modulate low-cloud cover with the magnitude Svensmark proposed?
Why legendary: The Svensmark hypothesis (galactic cosmic rays ionize the atmosphere → seed cloud condensation nuclei → modulate low-cloud cover → influence climate) has been debated for 25 years. CLOUD experiment results (CERN) are partial. Observational replication at decisive scale, with full pre-registration of methodology, would either settle a quarter-century debate or definitively bound it. Either result is publishable in a real journal and citable for the next decade.
Data inputs:
- NMDB neutron monitors: fetcher live (4,438 rows, 2026-04-02 → 2026-05-20) but station selection is currently 4 NH-only sites (Jungfraujoch 46.55°N, Oulu 65.05°N, Apatity 67.57°N, Thule 76.5°N — polar/mid only, no equatorial, no southern). Station expansion is the binding cosmic-side constraint.
- Open-Meteo cloud cover + humidity: present, but only ~5 years of TerraPulse direct ingest
- ERA5 reanalysis: 1940-present at 0.25° resolution for global cloud cover; would need a separate ingest path
- IGRA humidity profiles 1958-present (already ingested) as secondary check
- SILSO sunspot number (live, 801,822 rows back to 1817-12-31) + DSCOVR solar wind + space_solar_flux_10cm for confounding control
Methodology sketch: Pair neutron-monitor anomaly (deviation from 11-year-cycle baseline) with low-cloud anomaly at matched latitude bands. Lagged cross-correlations at lags 0-30 days and 0-12 months. Decompose by latitude (polar/mid/tropical) since the Svensmark prediction is latitude-dependent. Bonferroni across 5 latitude bands × 3 cloud levels. Stratify by Forbush-decrease events as a within-time-series natural experiment.
Power / detection floor: Svensmark claims ~3% modulation of low-cloud cover. At ~30 years × ~10 stations × monthly resolution, detection floor at α=0.01 Bonferroni-15 should be ~0.5-1% modulation. The contested signal sits at the upper end of our detectable range — sensitive enough to confirm or to bound. Caveat: with the current 4-NH-station NMDB configuration, latitude stratification collapses to polar+mid only; V1 scope would need to either expand station selection or honestly restrict to the NH polar/mid band.
Feasibility tier: Infra-gated (station expansion + ERA5 cloud cover). Downgraded from "no NMDB infra at all" to "NMDB live, latitude coverage and cloud-cover product still gating."
Promotion trigger:
- NMDB station selection expanded to include ≥1 equatorial (|φ| < 30°) and ≥1 southern-hemisphere monitor; OR V1 scope restricted to NH polar/mid only with the coverage limit acknowledged up front
- ERA5 cloud cover ingest path decided (or scoped restriction to 2020-2026 if going with Open-Meteo alone)
Risks & failure modes:
- ERA5 may be its own multi-month ingest project — could push promotion 6-12 months out.
- Literature so contested that any result will be re-argued. Pre-registration is the answer; commit methods publicly before extraction.
- Solar-cycle confounding is brutal — F10.7 and neutron count are themselves correlated.
Companion artifacts: Feature paper. Long-form explainer (Svensmark, CLOUD, Forbush events all need plain-English treatment). Most likely paper of the lineup to attract academic citation.
V2/V3 hooks:
- V2: extend to high-clouds and total-column cloud-water.
- V3: couple to IGRA-derived condensation-nuclei proxies if proxies are defensible.
Status: pitch, infra-gated. Highest scientific stakes of the lineup.

DP-004 — Do earthquakes ring the Earth in our radio data?

Headline question: After the largest earthquakes in the WSPR archive (M≥7.5), are the Earth's free-oscillation normal modes detectable as periodic perturbations in WSPR signal-to-noise?
Why legendary: This is the question nobody has thought to ask. Earthquakes excite Earth's normal modes (0S2 at ~54 min, 0S0 at ~20 min, higher modes shorter) which ring for days-to-weeks at micrometer-scale displacement. WSPR is a global radio propagation network with millisecond-precision timing across thousands of stations. The premise — that ionospheric coupling to mechanical oscillations is detectable in S/N — is borderline science-fiction. If it works, it is a new geophysical observable derived from amateur radio. If it doesn't work, the detection floor is the paper: "the smallest mechanical oscillation TerraPulse-class radio networks can sense."
Data inputs:
- WSPR archive 2008-2026: hundreds of millions of spots, thousands of stations
- USGS earthquake catalog: ~270 M≥7.0 events during WSPR overlap; ~40 M≥7.5
- Known normal-mode periods from seismological literature (catalog of ~50 modes)
Methodology sketch: Stack WSPR S/N residuals (after removing diurnal and solar-cycle baseline) in the 48-hour window following each M≥7.5 event. Compute periodogram across the stack at 1-second resolution. Compare power at known normal-mode periods vs randomized null distributions (event-time shuffling). Pre-register the 3-5 strongest expected modes (0S2, 0T2, 1S0) as targets. Bonferroni across modes × latitude bins.
Power / detection floor: Almost certainly null. Normal-mode amplitudes are ~10⁻⁹ m displacement; even theoretical ionospheric coupling is far below WSPR's sensitivity. Expected detection floor at the largest events is something like 100-1000× above any signal actually present.
Feasibility tier: NULL-LIKELY — but the detection floor is the headline.
Promotion trigger: Ready now. Could be a 1-month paper if treated honestly as "we look, we don't find, here is what the floor tells us." Danger is over-claiming on any near-floor signal.
Risks & failure modes:
- Reviewers will object to the premise. Pre-registration and explicit detection-floor reporting are the defenses.
- Confounders: M≥7.5 events disturb the ionosphere directly (TID effects, infrasound, electromagnetic precursors are well-documented). Disentangling normal-mode signal from direct ionospheric coupling is the hard part.
- Framing has to be "we set a bound," not "we found the Earth ringing in radio data." The latter would be sensational and almost certainly wrong.
Companion artifacts: Feature article framed as "the question nobody asked." This is the memetic paper of the lineup — even the null is share-worthy.
V2/V3 hooks:
- V2: relax to M≥6.5 to increase N (~1500 events) at the cost of weaker signal per event.
- V3: targeted search at single very-large events (Tohoku 2011 — partial WSPR archive; future M≥9 events as they occur).
WSPR paper family (2026-06-09 dedup): DP-004 shares only the WSPR archive with DP-009, GH #132, and GH #200 — the hypotheses are distinct, not duplicates. DP-004 = seismic normal-modes ringing in radio; DP-009 = ionospheric TEC closure; #132 = corridor SNR vs discrete Kp storms (V1/V2); #200 = corridor SNR vs continuous Bz (V3 of #132). Cross-linked, kept separate.
Status: pitch. Most memetic, lowest scientific probability of positive result. The honest null is the whole point.

DP-005 — The RF Anthropocene

Headline question: What fraction of the long-term variance in the WSPR signal-to-noise record is anthropogenic radio-frequency interference, and how has that fraction grown over 18 years?
Why legendary: Every other group studying WSPR S/N treats RFI as noise to be filtered. This paper makes it the signal. We decompose 18 years of global WSPR spots into solar (F10.7, sunspot), ionospheric (Kp, Dst), atmospheric (IGRA pressure/humidity at receiver), and residual — and argue the residual is the human-RF footprint of the Anthropocene. Methods-heavy, but the result — a global RF-pollution time series at decadal resolution — is unique to TerraPulse and immediately useful to spectrum managers, radio astronomers, and the amateur-radio community.
Data inputs:
- WSPR archive 2008-2026: hundreds of millions of spots
- F10.7, sunspot number, Kp, Dst from NOAA/NASA archives
- IGRA atmospheric profiles for receiver-station geographies
- Geographic metadata (urban / rural / coastal) per WSPR station
Methodology sketch: Multivariate structural time-series decomposition. For each WSPR band (160m, 80m, 40m, 20m, 17m, 15m, 12m, 10m, 6m), fit S/N = f(solar) + f(ionospheric) + f(atmospheric) + ε. The ε residual is the candidate anthropogenic component. Stratify by station latitude, longitude, urban/rural class. Track residual variance over time per band.
Power / detection floor: Plenty. Challenge is methods, not N. Defensibility of "this residual is anthropogenic" rests on the rigor of confounder modeling.
Feasibility tier: HEADROOM for descriptive analysis. STRUCTURAL for the causal claim — the contribution is the time-series itself, not a hypothesis test.
Promotion trigger: Ready now. Promote when capacity allows; this is a longer-than-7-day paper (probably 3-4 weeks).
Risks & failure modes:
- "Anthropogenic" attribution is contested without independent ground-truth. Frame as "unexplained residual" + "consistent with anthropogenic patterns" rather than direct causal claim.
- Amateur-radio community will read this carefully. Methods have to be airtight.
Companion artifacts: Feature paper. The methods-citable paper — other groups doing WSPR analysis will use the decomposition framework.
V2/V3 hooks:
- V2: replicate with other citizen-science radio networks (RBN, PSKReporter) for cross-validation.
- V3: couple to satellite-observed nighttime-lights as an urbanization proxy.
Status: pitch. Most methodologically unique to TerraPulse.

DP-006 — The 400-year magnetic envelope

Headline question: Across four centuries of historical reconstruction plus modern measurement, what does the joint trajectory of the North magnetic pole, South magnetic pole, and magnetic equator say about the dynamics of Earth's core field?
Why legendary: Geomagneticists have produced these reconstructions (GUFM1 1590-1990, IGRF 1900-present, WMM 2000-present), but the synthesis is typically buried in technical papers or model-release notes. A single TerraPulse paper that publishes the full 400-year trajectory of all three loci — with annual or decadal resolution, with the modern 5× acceleration set against the four-century baseline — becomes the canonical lay-accessible reference. Visually stunning. Memorable. Citable to general audiences.
Data inputs:
- N magnetic pole positions 1589-2024 (already ingested: 8,374 positions per paper-coverage-gaps.md)
- S magnetic pole positions 1590-2024 (GUFM1 + IGRF, ingest path TBD)
- Magnetic equator latitude (modeled, longitude-dependent) 1590-2024
- IGRF model coefficients for cross-validation
Methodology sketch: Compute drift velocity (km/yr) and acceleration (km/yr²) for each locus over rolling 30-year windows. Cross-correlate N-pole and S-pole motion (are they antipodal? Out-of-phase?). Compare modern (post-1990) drift to four-century baseline. Visualize as paired animated trajectories with shared time axis.
Power / detection floor: Not a hypothesis-test paper. Descriptive at decisive scale. Novelty is the joint visualization and the four-century baseline.
Feasibility tier: STRUCTURAL / HEADROOM. N-pole data already ingested; S-pole and equator may need ingest work.
Promotion trigger: Ready now after the S-pole ingest path is decided. Open question: do we have a fresh angle, or is this a re-presentation of geomagnetic literature? The TerraPulse angle could be (a) the visualization itself, (b) coupling to other TerraPulse data (does pole acceleration correlate with anything atmospheric we measure?), (c) the lay-accessible synthesis.
Risks & failure modes:
- Geomagneticists have done this. The TerraPulse contribution has to be either visualization-first or synthesis-first, not novel-discovery-first.
- The "coupling to other TerraPulse data" angle (option b) is the riskiest — most likely null, and the lay-accessible synthesis (option c) is the safer bet.
Companion artifacts: Showcase feature article. Pair with the magnetic-drift paper from paper-coverage-gaps.md (#206) as a V1 → V2 sequence: #206 = N-pole only, modern era; DP-006 = all three loci, four centuries.
V2/V3 hooks:
- V2 (couple to TerraPulse data): test pole-acceleration vs atmospheric / weather indices.
- V3 (paleomagnetic): incorporate paleomagnetic field reconstructions back to 1000 BCE if defensible data exists.
Canonical home for the magnetic-pole idea (2026-06-09 dedup): the casual pitch paper-ideas.md #3 ("compass sprinting toward Russia") folded into this V1→V2 sequence: GH #206 = V1 (N+S pole, modern-era verification), DP-006 = V2 (all three loci, four centuries). Not duplicates; a deliberate sequence.
Status: pitch. Most aesthetically driven of the lineup.

DP-007 — Two centuries of solar activity × 68 years of atmospheric structure

Headline question: Across the 68-year overlap of the SILSO sunspot record and the IGRA atmospheric sounding archive, does decadal-scale solar activity (sunspot number, F10.7, TSI proxies) leave a measurable fingerprint on the climatology of CAPE, lifted index, precipitable water, and tropopause height — after honest control for ENSO, PDO, and QBO?
Why legendary: Solar-cycle modulation of surface climate is debated but extensively studied. Solar-cycle modulation of upper-air thermodynamic structure — the quantities that actually drive severe weather — has barely been touched at decisive scale. SILSO daily sunspot numbers back to 1817 give us 200+ years of solar context; IGRA from 1958 gives 68 years of sounding climatology; the overlap (68 years × ~700 US stations × twice-daily) is 6 full solar cycles at panel scale. Either a real signal upgrades the cosmic-climate coupling literature, or a decisive null bounds it — and the null itself is publishable because no one has run this comparison at this N before.
Data inputs:
- SILSO sunspot daily (live, 801,822 rows back to 1817-12-31); use 1958-2026 subset for IGRA overlap
- IGRA soundings 1958-2026 (live, 102M-row backfill complete) — CAPE, LI, PWAT, tropopause-pressure
- NASA POWER (live, 11K rows back to 2022) for modern TSI cross-check
- DSCOVR solar wind + space_solar_flux_10cm (modern era only, confounder control)
- Climate-mode indices (ENSO/MEI, PDO, QBO from noaa_climate_indices) as covariates
Methodology sketch: For each IGRA station with ≥40 years of coverage, build decadal climatology of CAPE/LI/PWAT/tropopause. Regress station-decadal anomalies against smoothed sunspot number (13-month running mean), partialing out ENSO+PDO+QBO. Bonferroni across stations × parameters. Stratify by latitude band and season. Pre-register a directional prediction (solar-max → lower-stratosphere warming → higher tropopause → ?-direction-on CAPE).
Power / detection floor: Per-station N=68 years × 730/year ≈ 50K soundings. Six full solar cycles. Detection floor at α=0.01 Bonferroni-2800 is roughly 1-2% modulation of CAPE per station, much smaller for stacked panels. Beats the ~5-10% modulation amplitudes in the contested literature.
Feasibility tier: HEADROOM. Data is ingested.
Promotion trigger: Ready now. Scope decision: 1 paper (CAPE only, station-stacked) vs 4-paper series (one per sounding parameter).
Risks & failure modes:
- ENSO/PDO/QBO confounding is severe and the literature gets accused of finding the cycle in the noise. Pre-registration of the multivariate decomposition is the answer.
- Pre-1990 IGRA station selection may shift the panel composition over decades; controlling for the time-varying station network is non-trivial.
- "Solar activity" as a single index hides which mechanism (UV stratospheric heating vs cosmic-ray flux vs TSI) is doing the work; this paper bounds the aggregate, not the mechanism.
Companion artifacts: Feature paper with sunspot-cycle slider overlaid on CAPE-climatology US map. Paper-revisits hook: V2 with explicit cosmic-ray-flux substitution once NMDB station network expands (couples to DP-003).
V2/V3 hooks:
- V2: re-run with cosmic-ray flux as the solar index (requires NMDB station expansion).
- V3: extend solar side to TSI reconstructions back to 1850 once an ingest path exists (Wu & Krivova or NRLTSI2 composite).
Solar-cycle family (2026-06-09 dedup; extended 2026-06-09 evening): DP-007 is the upper-air-structure chapter of the broad paper-ideas.md #11 omnibus. Sibling chapters: DP-003 (cosmic-ray × cloud cover), the shipped wspr-solar-cycle-bug (WSPR-band selection effect), and the thermospheric-density chapter — the Starlink-fleet-drag V2 seed in Future pitch seeds (V1 starlink-thermosphere-storm-pilot shipped). Each chapter is a different observable of the same cycle (atmospheric structure / cloud cover / radio propagation / thermospheric density); distinct scopes, kept separate. #11 itself is downgraded to a possible end-of-line synthesis, not net-new analysis.
Status: pitch. Highest-payoff cosmic dream paper that's runnable today.

DP-008 — Earth-Moon-Sun syzygy × global seismicity, 75-year USGS catalog

Headline question: Across 75 years of the global USGS earthquake catalog, is M≥6 earthquake frequency measurably enhanced at lunar/solar syzygy (new/full moon, perihelion, eclipses) after honest control for catalog-completeness bias by year and region?
Why legendary: Lunar/tidal triggering of earthquakes is one of the oldest unsolved questions in seismology. The literature is split — some studies find ≤1% effect at shallow continental faults, most find null at global scale, and the null-vs-positive question has been argued since the 1890s. TerraPulse has the complete USGS catalog plus an already-ingested lunar_solar_tidal_potential time series; the V1 here is "the most rigorous global test of the lunar-syzygy hypothesis ever published." Either result is publishable and citable for the next decade.
Data inputs:
- USGS earthquakes M≥6 1950-2026: ~11K events (~150/year × 75 years, M6 catalog substantially complete since ~1976)
- Lunar/Solar Tidal Potential (live, 24,846 rows back to 2025-08-31; will need backfill to match earthquake horizon — note this is the binding constraint for V1)
- For V2: JPL Horizons planetary ephemeris (NOT YET INGESTED) for planet-syzygy V2
Methodology sketch: Phase-bin each M≥6 event by lunar phase angle and tidal-potential percentile at the hypocenter at the event time. Schuster test for non-uniformity at full N and within strata. Stratify by depth (≤70km / 70-300km / >300km), tectonic regime (subduction / continental / strike-slip), and magnitude bin. Pre-register the directional prediction: shallow continental events should show stronger phase coupling than deep events, if tidal triggering is real.
Power / detection floor: ~11K M≥6 events, N≥3,000 in shallow-only stratum. Schuster detection floor for a 1% phase-locked enhancement is roughly N≈10,000 at α=0.01 Bonferroni-12 — we just clear it at the global aggregate and probably miss it at the strongest stratum. The honest framing is "decisive at global aggregate, suggestive at stratum." Confounding by Earth-tide semi-diurnal frequencies needs explicit handling.
Feasibility tier: HEADROOM for V1 (lunar only). Infra-gated for V2 (JPL Horizons planetary ephemeris).
Promotion trigger: Lunar/solar tidal potential time series backfilled to cover the M≥6 USGS catalog horizon (currently the table starts 2025-08-31; needs to extend back to at least 1976). This is a small fetcher-extension task, not a new fetcher.
Risks & failure modes:
- Catalog completeness pre-1976 biases the result and forces a restriction window; the longer-window framing has to be honest about the cutoff.
- Existing literature is dense; the contribution has to be the scale + the pre-registration + the regime stratification, not the existence of the test.
- Confounders — solid-Earth tides have a real seismic-noise signature at semi-diurnal frequency that overlaps the lunar-phase signal.
Companion artifacts: Feature article with phase-binned histogram + tidal-potential map. Memetic potential because the popular framing ("does the moon cause earthquakes?") is enduringly catchy and our answer is rigorous.
V2/V3 hooks:
- V2: planet syzygy (Jupiter-Sun-Earth alignment, etc.) once JPL Horizons full ephemeris is ingested.
- V3: triggered-event sub-study at eclipse times (rare events, careful tail-test methodology).
Status: shipped (2026-07-02). V1 built and drafted: workspace lunar-syzygy-seismicity, pre-registration docs/scope-lunar-syzygy-seismicity.md (frozen + ratified before extraction). Clean global null across 4,729 declustered M≥6 mainshocks (1973–2026): no lunar-phase, syzygy, semidiurnal, or tidal-amplitude modulation in any depth stratum; nothing survives Bonferroni-9. Declustering erased the weak raw signal (syzygy p=0.09→0.68), confirming it as aftershock clustering. Honest framing = a ~4% bound, not a refutation: blind to the ~1% fault-resolved effect of Cochran et al. (2004), which is the pre-committed V2 (GCMT focal mechanisms + resolved fault-plane shear stress on shallow thrusts). The dissolved gate: tidal potential is computed per-event from the ephemeris (measured Moon/Sun positions), so no "backfill the tidal table" project was needed.

DP-009 — Ionospheric TEC × WSPR signal-to-noise: closing the radio-propagation loop

Headline question: Across the 2008-2026 WSPR archive and global GNSS-derived ionospheric Total Electron Content (TEC) maps, how much of daily corridor SNR can be predicted from independently-measured TEC — and what does the residual reveal that WSPR sees that TEC does not?
Why legendary: WSPR is widely used as an inferential proxy for ionospheric state — the SNR field is interpreted backwards to infer absorption, propagation conditions, and (in some work) space-weather coupling. GNSS-derived TEC from the IGS network is the direct satellite-anchored measurement of the same medium. The two have never been put against each other at decisive scale: 18 years × hundreds of millions of WSPR spots × global TEC grids at ~15-min cadence. If TEC strongly predicts SNR, this confirms WSPR as a useful inversion target for amateur-radio-scale ionospheric monitoring. If TEC weakly predicts SNR, the residual is the discovery — WSPR is sensitive to something TEC alone doesn't capture (D-region absorption, lower-altitude scattering, instrument-specific effects). Either result is operationally citable.
Data inputs:
- WSPR archive 2008-2026 (live, hundreds of millions of spots)
- GNSS-derived TEC maps (IGS / CDDIS, NOT YET INGESTED — infra-gated; ~15-min cadence; global 5°×2.5° grid since 1998)
- DSCOVR solar wind + space_kp_index (live) for space-weather confounder
- IGRA tropopause for atmospheric-side covariate
Methodology sketch: For each WSPR corridor × band × day, integrate mean TEC along the great-circle propagation path from GNSS-TEC maps. Regress daily corridor SNR against integrated TEC, controlling for solar zenith angle and F10.7. Stratify by corridor latitude, band (160m through 6m), and season. Quantify the residual variance after TEC accounting — this residual is what WSPR sees that TEC alone doesn't. Pre-register a baseline expectation from IRI-2020 model coupling.
Power / detection floor: Both datasets are large; the physics is real. Effect sizes for TEC→absorption coupling are likely d > 1 at the strongest bands. Detection floor is set by methods (great-circle integration, TEC interpolation), not by N.
Feasibility tier: Infra-gated. Requires GNSS-TEC fetcher.
Promotion trigger: GNSS-TEC fetcher merged with ≥3 years of global TEC maps in observations; OR V1 scope restricted to a single GNSS station's TEC time series (a much cheaper ingest) for a station-baseline pilot.
Risks & failure modes:
- GNSS-TEC ingest is a real fetcher project — IGS data formats (IONEX) are non-trivial, hourly maps × ~28 years × global = significant data volume.
- Some literature already covers WSPR×TEC at small scale; the TerraPulse contribution must be the 18-year horizon and the residual-decomposition methodology.
- "Residual is the discovery" framing requires honest accounting — instrument drift over 18 years is a serious confounder.
Companion artifacts: Methods-citable paper, useful to amateur-radio community and HF-aviation operators. Pairs naturally with DP-005 (RF Anthropocene) as a methods doublet.
V2/V3 hooks:
- V2: couple to existing ionospheric model (IRI-2020) as the "what TEC alone explains" baseline.
- V3: targeted sub-study at solar-flare events (DONKI-anchored) as natural experiments.
WSPR paper family (2026-06-09 dedup): see DP-004's family note. DP-009 (TEC closure) is distinct from DP-004 (normal modes), GH #132 (Kp storms, V1/V2), and GH #200 (continuous Bz, V3). Shared archive, separate questions.
Status: pitch, infra-gated.

Future pitch seeds (not yet expanded)

These are concepts to develop into full pitches as time and energy allow. Expand any that we want to take seriously next session:

The COVID atmospheric signature, 2020-2026. Whether the 2020 NO₂/AQI lockdown reduction left a persistent fingerprint, controlling for sensor changes.
Wildfire-AQI dose-response curve, contiguous US. GFW fire detections × Open-Meteo AQI plume-tracking, fit dose-response per region. Health-relevant.
The 75-year NWS alerts climatology. 713K records; the system itself evolved over time. Structural finding likely is the headline.
Solar maximum 2024-2025 retrospective synthesis. Multi-source TerraPulse view of the most-instrumented solar maximum: WSPR, GLM, IGRA, magnetic indices.
Megaregion fingerprints / inverse problem. Reconstruct the US urban map from environmental sensors alone (AQI, lightning, RF noise, traffic-correlated signals).
Induced seismicity at basin scale. USGS streamflow × USGS earthquakes per major US basin. Oklahoma wastewater + Permian as positive controls.
The TerraPulse annual climate report. Synthesis paper: what did our 113 metrics say about the planet this year, with year-over-year comparisons. Becomes a recurring series.
The 50-year urban heat-island climatology. Open-Meteo + historical reanalysis at all major US metros, with concrete trend curves per city.
The longest-baseline volcano-coupling test. Global eruption catalog (Smithsonian GVP) × stratospheric aerosol proxies × WSPR S/N anomalies in eruption-plume corridors.
Solar-cycle modulation of thermospheric density via Starlink fleet drag (V2 of the shipped pilot). The dream-tier sequel to starlink-thermosphere-storm-pilot (V1, shipped 2026-06-09, ce68d6e, an honest single-storm null). V1 proved the instrument — thousands of decaying Starlinks pooled into one fleet-median orbital-decay-rate sensor — but a 75-day window holding one moderate storm has no power. V2 is the legendary-scale version: track fleet-wide d(mean_motion)/dt across a full ~11-year solar cycle, against SILSO sunspots / F10.7, and recover the slow "breathing" of the thermosphere (density rising at solar max, falling at solar min) at fleet scale. Trigger / gating: needs years of TLE history — either a Space-Track / CelesTrak historical GP-archive backfill (external, large), or simply accrued monitor time, since V1 is a standing monitor that grows into this every day. Why dream-tier: satellite drag as a density proxy is established, but never at >10⁴-object fleet scale across a cycle with full pre-registration; and it is the natural on-ramp to the decadal CO₂ thermospheric-contraction climate signal (the genuine prize, even further out). Sibling of the solar-cycle family (see DP-007 family note). Promote when the history depth (backfilled or accrued) first spans ≥1 solar maximum-to-minimum swing.

Promoted (V1 in progress or shipped)

DP-002 — "Tornado Alley is drifting SSE" — shipped 2026-06-18 (8dc8673), workspace tornado-alley-east, live at /articles/tornado-alley-east.
DP-008 — "Does the Moon trigger great earthquakes?" — shipped 2026-07-02, workspace lunar-syzygy-seismicity, article /articles/lunar-syzygy-seismicity. Global null (~4% bound) on 4,729 M≥6 mainshocks; V2 = GCMT fault-resolved shear stress on shallow thrusts.

Maintenance

Add new pitches as pitch status entries with the next available DP-NNN ID.
Update Status on promotion (pitch → scoped → promoted → shipped); never delete pitches, even if abandoned (mark abandoned with one-line reason).
When a dream paper is promoted to the active queue, add a corresponding entry to paper-coverage-gaps.md (queued section) and — if it has V2/V3 hooks — to paper-revisits.md at promotion time, not at ship time.
Seasonal review: at the start of each Elise→Mike session, reread this file and ask whether any pitch's trigger has fired.
Treat the future-seeds list as live: ideas that come up in chat should land here (or as new active pitches) within the same session.