Research arcs and idle projects
A companion to weekly-cadence.md. The cadence covers the must-dos. This doc covers what to dip into during the rest of the day: slow-burn, multi-day projects that advance the long-term TerraPulse ambition without draining tokens in any single session.
Part 1 — How paper topics actually get chosen
The short version: Brad files the issues, I execute against them. But that's not the whole picture, because what Brad files isn't random. The V-series papers follow a structure that's worth making explicit so you and Brad can debate the next move with shared vocabulary.
The V-series arc
Every line of research at TerraPulse follows the same falsification ladder:
V1 — Observe. A signal shows up in the data. The paper documents it: what we saw, when, how strong. The verdict is often INCONCLUSIVE in some narrow sense, and that's acceptable: a V1 is allowed to over-claim, because V2+ exists to catch it.
V2 — Replicate. Re-run on independent data (different time window, different geography, different stations). If the effect size shrinks dramatically, the V1 was probably overfit. The TerraPulse rule: any V1 with |d| > 0.5 at N < 200 gets a mandatory V2 with a bigger N (see the sketch after this list).
V3 — Control. Introduce a known confounder and check whether the signal survives after controlling for it. The Lifted Index control (#181 → #183) was a V3 against the WSPR precursor: H1 (the signal survives the control) vs H0 (LI was the actual cause).
V4 — Mechanism. What physically produces the signal? The WSPR V4 paper tested D-layer disturbance vs cell-local QRN as competing mechanisms.
V5 — Generalize. Does it hold for adjacent classes? The V5 paper checked if the WSPR precursor was tornado-specific or just any-severe-weather. Answer: any-severe.
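The |d| in the V1→V2 trigger is Cohen's d. A minimal check, assuming the standard two-sample pooled-SD definition (the helper names here are illustrative):

```python
import numpy as np

def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
    """Two-sample Cohen's d with pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

def needs_v2(a, b) -> bool:
    """TerraPulse rule: |d| > 0.5 at N < 200 forces a bigger-N replication.

    N is read here as total sample size; per-group is an equally
    plausible reading of the rule.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return abs(cohens_d(a, b)) > 0.5 and (len(a) + len(b)) < 200
```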
Each step is a test of the most threatening alternative explanation to whatever the previous step claimed. Brad's instinct picks the threat to test next. The skill — and frankly the moat — is in knowing which threat is most credible at each step.
What I bring vs what Brad brings
- Brad brings: the science instinct, the choice of which V-step is most worth running, the framing of hypothesis vs null, the pre-registered claim threshold, the historical context for what's already been shown in related fields.
- I bring: mechanical extraction, statistical execution to spec, honest verdicts against the threshold, paper-writing to the revtex template, never upgrading INCONCLUSIVE to H1 just because a result "looks interesting."
- Mike brings: editorial review, public-voice polish, audit trail.
Candidate-threats first pass (middle ground)
To save Brad cycles without removing his role, Claude produces a candidate-threats list within ~24 hours of any V-paper shipping. Brad reads it, picks the threat (or rejects all of them in favor of his own), Mike weighs in on what would actually publish well, then a new issue gets filed for the next V-step.
The list is a first draft, not a recommendation. Claude's role is to enumerate plausible alternative explanations and assess operational feasibility. The judgment of which threat is most credible to test stays with Brad — that's the part that doesn't automate and shouldn't.
Format. Posted as a comment on the just-shipped paper's issue. Sections:
- What V1 claimed. One sentence restating the finding so the threats are anchored to it.
- Credible threats (~3-5). Each gets: name, one-paragraph mechanism for how it could fully or partially explain the V1 effect, what data would test it, whether we have that data today, rough effort estimate.
- Less-credible threats (~3-5). Brief — one line each. Listed for completeness so Brad can confirm we're not blind to them.
- Weak threats (~3-5). Even briefer. Listed so Brad can see what Claude already discounted and challenge if he disagrees.
- Claude's gut pick. One sentence: which credible threat Claude would test first if forced. Brad ignores this freely; it just makes Claude's reasoning legible.
What Brad does with it. Rank, edit, dismiss, or replace. The output of Brad's pass is one filed issue: the next V-paper, with hypothesis + method + threshold per the existing PMA issue template.
What this is NOT. It is not a paper proposal. It is not a finding. It is not Claude picking the next research direction. It is a structured first draft to make Brad's threat-selection step faster.
How new V-arcs get started
A new arc starts when one of three things happens:
- A new datasource comes online and unlocks a test that couldn't be run before. IGRA radiosondes unlocked CAPE × WSPR (#182) and LI control (#183). When NASA FIRMS gets onboarded, a wildfire-WSPR coupling arc becomes possible.
- A live event is captured that's worth a standalone analysis. Enid EF-4 in 2026-04 was the template — paper #33 written same-day from the live GLM capture.
- A prior V-series result gets challenged externally — though this hasn't happened yet, it would force a V-next.
How long-term ambitions shape the queue
The site exists to monetize climate intelligence on X. That biases the paper queue toward findings that are:
- Visual — a figure that reads in a single glance is worth 10 papers of pure tables.
- Surprising but defensible — "WSPR precursor independent of LI in the highly-unstable regime" is shareable; "we ran a t-test and p=0.04" is not.
- Repeatable — a finding we can re-run as new data arrives is a renewable content asset.
- Linked to active weather — papers shipped within hours of a real-world event get exponentially more reach than retrospectives.
This is why the V-series gets prioritized over one-off curiosity papers: each entry in a V-arc compounds the credibility of the prior entries.
Part 2 — Multi-day idle projects
These are projects to dip into when the day's cadence block is done and there's spare capacity. Rules:
- Touch one project per idle slot. Don't context-switch.
- Each session produces a visible artifact: a commit, a filed issue, a checked-off subtask.
- Time-box to 30 min per slot. If it bleeds beyond that, leave a note and pick up next slot.
- All work happens in main; no long-lived branches.
- If a project completes, the entry here gets a strike-through and a link to the closing commit/issue.
Roster
1. Knowledge graph completeness
Scope: ~0.09% of last-7d rows are still untagged. The bulk is per-station metrics (mag_field_<STATION>, future per-station sounding metrics, etc.) that aren't matched by current auto_tag_rules. Walk the tag tree, identify gaps, propose new rules.
Per-slot deliverable: add or refine one regex rule (see the sketch below); verify the next audit shows coverage tick up.
Done when: tag coverage is ≥ 99.99% for four consecutive Monday audits.
Dependencies: none.
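A minimal sketch of one such rule, assuming auto_tag_rules reduces to pattern-to-tags pairs; the rule shape and tag names are assumptions, not the real schema:

```python
import re

# Hypothetical rule list: (compiled pattern, tags-builder) pairs.
AUTO_TAG_RULES = [
    # Per-station magnetometer metrics, e.g. "mag_field_BOU"
    (re.compile(r"^mag_field_(?P<station>[A-Z0-9]{3,5})$"),
     lambda m: ["geomagnetism", f"station:{m.group('station')}"]),
]

def tag_metric(name: str) -> list[str]:
    """Return tags from the first matching rule, or [] if the metric stays untagged."""
    for pattern, make_tags in AUTO_TAG_RULES:
        m = pattern.match(name)
        if m:
            return make_tags(m)
    return []

print(tag_metric("mag_field_BOU"))  # ['geomagnetism', 'station:BOU']
```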
2. Datasource documentation pass
Scope: Many datasources.notes are null or auto-generated stubs. Each curated source deserves one human-readable paragraph: what it is, who runs it, what the data answers, known quirks, refresh cadence, license/attribution.
Per-slot deliverable: populate notes for 3-5 sources, commit.
Done when: every curated source (non-AutoSense) has a real notes field.
Dependencies: none.
3. Backfill gap matrix
Scope: Each curated source has a theoretical history (USGS earthquake → 1900; SPC tornado → 1950; IGRA → 1950s) and an actual coverage in TerraPulse. Build a markdown matrix: source × historical-start × current-start × gap-years × backfill-feasibility.
Per-slot deliverable: add one row to the matrix per session (column layout sketched below).
Done when: all curated sources are documented; Brad can pick the next backfill target from a single table.
Dependencies: none.
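A possible column layout, using the historical starts from the scope line; the TerraPulse-start, gap, and feasibility cells are placeholders to fill in per session:

| Source | Historical start | TerraPulse start | Gap (yrs) | Backfill feasibility |
|---|---|---|---|---|
| USGS earthquake | 1900 | TBD | TBD | TBD |
| SPC tornado | 1950 | TBD | TBD | TBD |
| IGRA | 1950s | TBD | TBD | TBD |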
4. Cross-source correlation explorer
Scope: Proactively compute pairwise correlations across metric pairs to surface unexpected couplings as "research candidates" Brad can turn into proper paper issues. NOT papers themselves — just signal scouting.
Per-slot deliverable: one correlation-matrix slice (e.g., all seismic × all space-weather; computation sketched below), plus a one-paragraph writeup of the strongest non-trivial pair, committed to docs/research-candidates/.
Done when: open-ended; this project produces a renewable stream of candidate ideas.
Dependencies: the new (source_id, timestamp_utc) index — already in place.
Risk: must not present candidates as findings. Frame as "this pair correlates at r=X over period Y; needs a real paper to determine if it's spurious."
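A sketch of one slice computation, assuming metrics live in a long-format frame with metric, timestamp_utc, and value columns (all names are assumptions):

```python
import pandas as pd

def correlation_slice(df: pd.DataFrame, group_a: list[str], group_b: list[str],
                      freq: str = "1D") -> pd.DataFrame:
    """Pairwise Pearson r between two metric groups on a common time grid.

    `df` is long-format with datetime `timestamp_utc`, string `metric`,
    float `value`. Resampling to `freq` aligns metrics with different cadences.
    """
    wide = df.pivot_table(index=pd.Grouper(key="timestamp_utc", freq=freq),
                          columns="metric", values="value", aggfunc="mean")
    corr = wide.corr(min_periods=30)  # require >= 30 overlapping periods per pair
    return corr.loc[group_a, group_b]

# Usage: correlation_slice(df, seismic_metrics, space_weather_metrics)
```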
5. API surface polish
Scope: Every /api/v1/ endpoint should have a docstring, an example response, and a responses schema for OpenAPI. Critical for eventual monetization — the API is the product.
Per-slot deliverable: polish 2-3 endpoints (see the sketch below).
Done when: the /docs page renders cleanly for an external developer with no internal context.
Dependencies: none.
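If the API runs on FastAPI (the /docs page suggests an OpenAPI-backed framework), a polished endpoint could look like the sketch below; the route, model, and example values are all illustrative:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class MetricPoint(BaseModel):
    """One observation of a metric."""
    timestamp_utc: str
    value: float
    # The example surfaces in the OpenAPI schema and on the /docs page.
    model_config = {"json_schema_extra": {
        "example": {"timestamp_utc": "2026-04-27T18:00:00Z", "value": 3.2}}}

@app.get(
    "/api/v1/metrics/{metric}/latest",  # illustrative route
    response_model=MetricPoint,
    responses={404: {"description": "Unknown metric"}},
    summary="Latest value for a metric",
)
def latest_metric(metric: str) -> MetricPoint:
    """Return the most recent observation for `metric`.

    Demonstrates the three required pieces: docstring, example response,
    and an explicit `responses` entry for the OpenAPI spec.
    """
    return MetricPoint(timestamp_utc="2026-04-27T18:00:00Z", value=3.2)  # placeholder
```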
6. Loader resilience survey
Scope: Audit each curated fetcher for retry/timeout/circuit-breaker patterns. Some inherit from BaseFetcher and get the standard treatment; some are older one-offs. Build a compliance matrix; migrate non-compliant fetchers to the standard.
Per-slot deliverable: audit 5 fetchers, migrate 1 (target pattern sketched below).
Done when: every curated fetcher uses the BaseFetcher retry/timeout pattern AND is wired through the LoaderRunner framework.
Dependencies: none.
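The target pattern, roughly; the real BaseFetcher interface may differ, so treat names and defaults as assumptions (circuit-breaker logic omitted for brevity):

```python
import time
import requests

class BaseFetcher:
    """Illustrative retry/timeout pattern for curated fetchers."""
    MAX_RETRIES = 3
    TIMEOUT_S = 30
    BACKOFF_S = 2.0

    def fetch(self, url: str) -> requests.Response:
        """GET with a hard timeout and bounded exponential-backoff retries."""
        for attempt in range(self.MAX_RETRIES):
            try:
                resp = requests.get(url, timeout=self.TIMEOUT_S)
                resp.raise_for_status()
                return resp
            except requests.RequestException:
                if attempt == self.MAX_RETRIES - 1:
                    raise  # out of retries: surface the real error
                time.sleep(self.BACKOFF_S * 2 ** attempt)
```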
7. Untapped public datasets survey
Scope: What major public climate datasets are NOT yet in TerraPulse? Examples: NOAA HRRR (sub-hourly CAPE — would unblock the #182 mechanism question), ERA5 reanalysis, MODIS LST, GHCN-Daily, NLDAS-2, CMIP6 outputs, GLDAS, etc. Build the inventory + onboarding effort estimate.
Per-slot deliverable: one datasource profiled (URL, format, auth, volume, refresh, relevance to active V-arcs); a profile skeleton is sketched below.
Done when: 30 candidate sources are profiled; Brad has a prioritized roadmap.
Dependencies: none.
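A possible per-source profile skeleton, mirroring the fields in the deliverable line; every value is a placeholder:

- Source: <name>
- URL: <landing page / bulk endpoint>
- Format: <GRIB2 / netCDF / CSV / ...>
- Auth: <none / API key / registration>
- Volume: <size per refresh, history depth>
- Refresh: <cadence>
- Relevance: <which active V-arc it unblocks, e.g. the #182 mechanism question>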
8. Live-event readiness drills
Scope: The Enid EF-4 paper got written the day of the storm because the GLM listener was live and the analysis pipeline already existed. We should be able to do the same for: hurricane landfall, M6+ earthquake near population, X-class solar flare, geomagnetic storm Kp≥8, major volcanic eruption. For each, draft the analysis template ahead of time so the live event data just needs to drop in.
Per-slot deliverable: one event-type template scaffolded under workspaces/templates/<event-type>/ (see the sketch below).
Done when: the five event types above each have a runnable template.
Dependencies: the relevant fetchers must be live and tested.
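One possible scaffold per event type; the file names are hypothetical, and the point is that everything except the event parameters exists before the event:

```
workspaces/templates/hurricane-landfall/
  README.md      # which fetchers must be live, what to verify on event day
  params.yaml    # event time, location, identifiers (the only day-of edits)
  analysis.py    # runs the standard analysis against the live capture
  figures.py     # produces the standard figure set for the paper/post
```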
9. Memory hygiene
Scope: The .claude/projects/.../memory/ directory accumulates entries. Some go stale, some duplicate. Weekly pass to consolidate and remove.
Per-slot deliverable: review 5 memory entries, prune or merge.
Done when: ongoing, never "done." Aim for the index staying under 20 entries.
Dependencies: none.
10. Public-site SEO + brand fundamentals
Scope: terrapulse.info needs the basics for organic discovery and credibility. Open Graph tags, structured-data markup for papers (DataCatalog / ScholarlyArticle schema.org), sitemap.xml, robots.txt, canonical URLs, social preview images.
Per-slot deliverable: one fundamental added or improved.
Done when: site passes Lighthouse audit at ≥90 SEO and a manual eyeball check on Twitter Card preview / LinkedIn / Reddit.
Dependencies: none.
Part 3 — How to pick the next idle slice
When daily cadence is done and there's capacity, pick the next slice by applying these rules in order (encoded in the sketch after the list):
- Is a filed follow-up issue waiting on any project? If yes, that one. (E.g., if an issue filed under #1 can't close until the project advances, prioritize that project.)
- Any project where a one-slot push would close it? Closing a project is more valuable than opening a new one.
- Any project visibly stale (no commit in 14 days)? If yes, that one. Stale projects feel abandoned.
- Otherwise round-robin through the roster by the order above.
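The selection logic, encoded roughly; the Project fields are illustrative, not an existing data model:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Project:
    name: str
    roster_order: int          # position in the roster above
    blocks_follow_up: bool     # rule 1: a filed follow-up issue is waiting on it
    one_slot_from_done: bool   # rule 2: a single slice would close it
    last_commit: date          # rule 3: staleness signal

def next_slice(projects: list[Project], today: date) -> Project:
    projects = sorted(projects, key=lambda p: p.roster_order)
    for p in projects:
        if p.blocks_follow_up:       # rule 1 first
            return p
    for p in projects:
        if p.one_slot_from_done:     # rule 2: closing beats opening
            return p
    for p in projects:
        if today - p.last_commit > timedelta(days=14):  # rule 3: stale
            return p
    # rule 4: round-robin, approximated as least recently touched
    return min(projects, key=lambda p: p.last_commit)
```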
No project should run > 30 minutes per slot. If something needs more, file it as its own dedicated issue with a "needs deep-work session" label and resume during a planned block.
Part 4 — Token & cadence budget
The weekly cadence (Mon-Fri blocks) totals ~2-3 hours of focused work. Multi-day project slices add another ~30 min/day where capacity allows. So a healthy week looks like:
| Day | Cadence block | Idle project slice (optional) |
|---|---|---|
| Mon | Data integrity audit (~30 min) | 1 slice (~30 min) |
| Tue | Editorial review (~20 min) | 1 slice (~30 min) |
| Wed | Public surface QA (~20 min) | 1 slice (~30 min) |
| Thu | Infrastructure & cost (~30 min) | 1 slice (~30 min) |
| Fri | Brand readout (~30 min) | — (focus on readout) |
That's ~4-5 hours/week. Anything beyond that is paper-work (i.e., actually running a PMA paper from an issue), which is a separate budget item triggered by Brad filing an issue.
If a week's events drain tokens unusually hard (a live event, a paper, a backfill), idle slices get skipped first. Cadence blocks are the floor; idle slices are the ceiling.