Paper Coverage Gaps
Updated 2026-05-19. Inventory of which TerraPulse data categories have appeared in published
/articles/features vs. which remain unmined. Companion to research-arcs-and-idle-projects.md and PAPER-MACHINE-CONTEXT.md.
How to read the power column
Each recommendation is annotated with a power expectation:
- HEADROOM — overlap window is large, expected effect size exceeds the detection floor at α=0.01 Bonferroni for the available N. A positive result is plausible.
- NULL-LIKELY — overlap window or expected effect size puts the result near the detection floor. Worth running only if the structural finding is itself the headline; do not queue for "social-media reach."
- STRUCTURAL — the headline is a coverage / data-availability finding, not a statistical test. The paper bounds what TerraPulse can do; the null is the point.
See the feedback_paper_queueing_skepticism memory rule for the discipline behind this column. Power-checks happen before extraction code is written.
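One way to make the HEADROOM / NULL-LIKELY split concrete is to compute a detection floor before any extraction code exists. The sketch below is illustrative only: it assumes a two-sided one-sample z-test on a standardized effect size, 80% target power, and a 1.5× safety margin over the floor. None of these choices is taken from the memory rule, and `detection_floor`, `power_label`, and `margin` are hypothetical names.

```python
from math import sqrt
from statistics import NormalDist


def detection_floor(n, n_tests, alpha=0.01, power=0.8):
    """Smallest standardized effect detectable at Bonferroni-corrected alpha.

    Assumption: two-sided one-sample z-test, so the floor is
    (z_{1 - alpha / (2 * n_tests)} + z_{power}) / sqrt(n).
    """
    if n <= 0 or n_tests <= 0:
        raise ValueError("n and n_tests must be positive")
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / (2 * n_tests))  # Bonferroni-adjusted
    z_power = norm.inv_cdf(power)
    return (z_alpha + z_power) / sqrt(n)


def power_label(expected_effect, n, n_tests, margin=1.5, structural=False):
    """Map a candidate paper to one of the three labels used in this doc.

    `margin` is a hypothetical safety factor: an expected effect within
    margin x floor counts as "near the detection floor".
    """
    if structural:
        return "STRUCTURAL"
    floor = detection_floor(n, n_tests)
    return "HEADROOM" if expected_effect >= margin * floor else "NULL-LIKELY"


if __name__ == "__main__":
    # Hypothetical inputs, not numbers from any TerraPulse paper.
    print(power_label(expected_effect=0.5, n=10_000, n_tests=12))
    print(power_label(expected_effect=0.05, n=100, n_tests=12))
```

A different test family (two-sample, rate ratio, periodicity) would need its own floor formula; the labelling logic stays the same.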
Published coverage by data category
| Category | Papers | Role in those papers |
|---|---|---|
| WSPR radio propagation | 6 | Primary signal (V1–V4 tornado precursor, aircraft, solar-cycle bug) |
| Earthquake events (USGS) | 4 | Primary subject (Sanriku, Kermadec, Reno, Cascadia fact-check) |
| Solar / space weather | 1 | Primary subject (April 2026 solar watch) |
| Radiation (Safecast) | 1 unpublished (#202) | Bonferroni-null + structural latitude finding — see workspace safecast-forbush-radiation-modulation |
| GLM lightning (GOES) | 1 | Control variable only (V4 null) |
| IGRA soundings | 0 published as primary | Attempted as control in #181 / #200 — null due to WSPR/IGRA window non-overlap |
| Aircraft positions | 1 | Primary subject (WSPR aircraft pre-registered null) |
| UFO / fireball / Starlink | 1 | Primary subject (sky anomalies) |
Data categories with zero papers as primary subject
- USGS Water Services — California streamflow, ~1000 rows/5 min. Multi-year history.
- Open-Meteo — weather, AQI, UV, soil moisture across 10 cities, 10-min cadence.
- World Bank climate indicators — 15 countries × 5 indicators × ~6 years. Only cross-national / long-horizon source we ingest.
- Curated inactive backlog — Mike's deliberate research queue of zero-row sources (data.gov, NPS DataStore, USGS ScienceBase, PDS). See project_curated_sources memory. Completely unmined.
Combinations never tried with active data
- Earthquakes × hydrology — induced seismicity, post-EQ groundwater anomalies (USGS Water). Power: HEADROOM in localized fracking/wastewater regions; NULL-LIKELY for global aggregate.
- IGRA × tornadoes as primary subject — CAPE / LI lead time. Blocked from prior attempts (#181, #200) only because WSPR-anchored windows didn't overlap IGRA history; pure IGRA × SPC has no such constraint. Power: HEADROOM — decades of SPC × full IGRA history is a large N.
- GLM as primary subject — lightning climatology, diurnal cycle, severe-weather coupling. So far used only as a V4 null control. Power: HEADROOM — pure climatology with 9-row-per-minute data.
- AQI × anything — wildfire smoke, diurnal commuter patterns, transboundary plumes. Power: HEADROOM — large expected effect sizes (wildfire days vs baseline often 5–10×).
- Solar / cosmic-ray × Safecast — Ran as #202 (2026-05-19). NULL-LIKELY confirmed: clean Bonferroni-null + structural latitude finding. Removed from active queue; see workspace safecast-forbush-radiation-modulation and docs/paper-explainers/202-safecast-forbush.md. Future cosmic-ray work needs NMDB neutron-monitor ingest, not Safecast.
Thematic blind spots
- Hydrology — zero papers
- Air quality / pollution — zero papers
- Radiation / cosmic rays as primary signal — bounded by #202 for the Safecast-CPM channel; NMDB ingest would reopen this
- Cross-national / global-development climate — zero papers (every published paper is US- or single-event-centric)
Recommended next queue
Annotated with power expectation. Power-check before queueing OR running — surface NULL-LIKELY predictions to Mike before writing extraction code. Always grep workspaces/ for existing drafts before opening a new issue — duplicates (e.g., NEO×fireball, ENSO×drought) cost time.
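The duplicate check described above can be scripted as well as grepped. The following is a minimal sketch, assuming each workspace is a subdirectory of workspaces/ and that drafts live in Markdown files; the function name `find_existing_drafts` and the `*.md` restriction are assumptions, not part of the existing tooling.

```python
from pathlib import Path


def find_existing_drafts(root, keywords):
    """Return names of workspace directories that mention every keyword.

    A workspace matches when all keywords (case-insensitive) appear in
    its directory name or in the text of any *.md file beneath it.
    """
    wanted = [k.lower() for k in keywords]
    hits = []
    for ws in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        haystack = [ws.name.lower()]
        for md in ws.rglob("*.md"):
            haystack.append(md.read_text(encoding="utf-8", errors="ignore").lower())
        text = "\n".join(haystack)
        if all(k in text for k in wanted):
            hits.append(ws.name)
    return hits


if __name__ == "__main__":
    root = Path("workspaces")
    if root.is_dir():
        # Example query: would a NEO x fireball issue duplicate a draft?
        print(find_existing_drafts(root, ["neo", "fireball"]))
```

An empty result is not proof that no draft exists, since workspace naming is not guaranteed to contain the obvious keywords; it only narrows the manual check.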
Queued (open issues, updated 2026-05-20)
- Pure IGRA × SPC tornado lead-time — Power: HEADROOM. Issue #204. Unblocks the failed-control arc from #181 / #200 by removing the WSPR window constraint. 65-year overlap (1958–2023), thousands of paired soundings. Literature CAPE→outbreak effect sizes well above detection floor.
- North magnetic pole drift acceleration — Power: HEADROOM. Issue #206. 8,374 N-pole positions 1589–2024; documented 5× acceleration post-1990. Untouched in workspaces. Visually compelling, social-media-feature potential.
- NWS alerts climatology — Power: HEADROOM. Issue #207. 713K records across 77 alert types; diurnal + day-of-week + seasonal structure. Tests both real weather (severe-weather diurnal) and reporting bias (long-fuse day-of-week). Largest single-source catalog we haven't analysed.
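For the alerts climatology item, the first descriptive pass is a pair of counts by hour of day and by day of week. The sketch below assumes the issue times are already available as timezone-aware `datetime` objects and bins them in UTC for simplicity; a real diurnal analysis would more likely bin by local time at the alert location. `climatology_counts` is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime, timezone


def climatology_counts(timestamps):
    """Count events by UTC hour (0-23) and by weekday (0 = Monday).

    Each timestamp must be a timezone-aware datetime; naive values are
    rejected because their offset would otherwise be guessed.
    """
    by_hour = Counter()
    by_weekday = Counter()
    for ts in timestamps:
        if ts.tzinfo is None:
            raise ValueError("timestamps must be timezone-aware")
        utc = ts.astimezone(timezone.utc)
        by_hour[utc.hour] += 1
        by_weekday[utc.weekday()] += 1
    return by_hour, by_weekday


if __name__ == "__main__":
    # Hypothetical sample, not real alert records.
    sample = [
        datetime(2026, 5, 19, 14, 30, tzinfo=timezone.utc),
        datetime(2026, 5, 19, 14, 45, tzinfo=timezone.utc),
        datetime(2026, 5, 20, 3, 0, tzinfo=timezone.utc),
    ]
    hours, weekdays = climatology_counts(sample)
    print(sorted(hours.items()))
    print(sorted(weekdays.items()))
```

Running the counts separately per alert type is what would let the weather signal (diurnal) and the reporting-bias signal (day-of-week) be told apart.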
Lower priority but still open
- GLM diurnal climatology — Power: HEADROOM. Lightning flash diurnal cycle is a textbook result; we'd be replicating with our own data plus exploring TerraPulse-specific cuts (continental US vs Caribbean, summer vs winter, terrain stratification). Large N, clean physics, no overlap-window issues. Methods-y headline.
- Streamflow × earthquakes (induced seismicity) — Power: HEADROOM in specific basins (Oklahoma wastewater, Permian), NULL-LIKELY globally. Frame as a regional study from the start.
- World Bank cross-national disaster-outcome panel — Power: HEADROOM at the long-term-trend level (6-year × 15-country panel is enough for descriptive statistics and clear cross-national differences); NULL-LIKELY for short-term forcing tests.
- NMDB neutron-monitor ingest, then Safecast V2 — Power: HEADROOM for the NMDB-driven Forbush paper; reopens #202 with the right tool. Requires platform work (new fetcher) before any paper.
- AQI V2 (fall 2026) — Power: HEADROOM. Re-run the originally pre-registered 6-D KMeans clustering from #203 once 12 months of full-constituent ingest has accumulated (expected ~60-80 anomalies). Defer until ~2026-09-01.
Retired / done
- CNEOS fireball × meteor showers (#205) — Ran 2026-05-20, R0+R1 shipped + accepted (1518211→9406b37). Clean null after N correction: raw 1,394 rows → deduped N=355 unique events. Schuster p=0.22 globally, 0/12 showers survive Bonferroni at h ∈ {3,5,7,10}d. Detection floor (ratio ≥1.96) above the optical-bolide literature Geminid range (few-tens-of-percent); the null is consistent with both a genuinely sporadic kt-energy population and an undetectable optical-scale signature at our N. Post-2014 modern-era subset (N=268, p=0.069) confirms robustness. V2 path: N≈600 by 2034.
- AQI anomaly characterization (#203) — Ran 2026-05-19, R0+R1 shipped (f0efe77→e4bcb3a). Structural finding: multi-constituent ingest only since 2026-03-16. 28 ≥3σ anomalies catalogued, modest bimodality (ΔBIC=4.64), 2/4 photochemical case studies. See docs/paper-explainers/203-aqi-anomalies.md.
- Safecast × Forbush (#202) — Ran 2026-05-19, R1 revised. NULL-LIKELY in retrospect was correct call going in; recoverable structural finding (no polar/equatorial Safecast coverage) and all-five-drivers-wrong-sign pattern are the genuine outputs. Lesson banked: power-check before running.
- NEO close-approach × fireball debunk — Already published in workspace neo-flyby-fireball-correlation (status: accepted). Headline: tracked NEOs and atmospheric fireballs are statistically distinct populations (Cohen's d > 1.2). Re-opening this as a fresh paper would duplicate the existing result. Banked here so the idea doesn't resurface.
- ENSO × US drought correlation — Already in draft as workspace enso-drought-correlation. Check workspace state before queueing a V2.
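The Schuster and Bonferroni figures quoted for #205 above follow a standard pattern that can be sketched in a few lines. This is not the code used in that paper: it assumes event times are reduced to an annual phase and uses the large-N approximation p = exp(-R²/N), where R² is the squared length of the summed unit phase vectors. `annual_phase`, `schuster_p`, and `bonferroni_survivors` are hypothetical names.

```python
from math import cos, exp, pi, sin


def annual_phase(day_of_year, year_length=365.25):
    """Map a day of year to a phase angle in radians (assumed convention)."""
    return 2 * pi * (day_of_year % year_length) / year_length


def schuster_p(phases):
    """Schuster test p-value for non-uniformity of phase angles.

    Uses p = exp(-R^2 / N), which is an approximation valid for large N.
    """
    n = len(phases)
    if n == 0:
        raise ValueError("need at least one phase")
    c = sum(cos(t) for t in phases)
    s = sum(sin(t) for t in phases)
    return exp(-(c * c + s * s) / n)


def bonferroni_survivors(p_values, alpha=0.01):
    """Return keys whose p-value beats alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [name for name, p in p_values.items() if p < threshold]


if __name__ == "__main__":
    # Hypothetical event days, not the CNEOS catalogue.
    days = [12, 45, 101, 163, 220, 257, 301, 340]
    print(schuster_p([annual_phase(d) for d in days]))
    print(bonferroni_survivors({"shower_a": 0.0001, "shower_b": 0.2}))
```

The deduplication step matters here because repeated rows for one event add identical phase vectors and inflate R², which is one reason the N correction precedes the test.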