Surfaced while processing the triage doc
Started 2026-06-26. Counterpart to docs/dex-source-triage.md. The triage doc is the
plan (every source → its best dex). This doc is the parking lot: things that come up
while working the triage backlog that are worth doing but would derail the current item
if chased now. We note them here, keep moving, and come back when the triage pass is done.
Rule of thumb: if it's a ruling about which dex a source belongs to, it goes in the triage doc. If it's a cross-cutting issue, a "we should revisit X", a reclassification of something already shipped, or an infra/tooling snag, it goes here.
Open — revisit when the triage pass is done
Reclassifications of already-shipped kinds
- transient → CelestialObjectDex? Shipped as an Eventdex (ede746e) but it is object-shaped: one ZTF source accreting a light-curve array. That is exactly the CelestialObjectDex slot shape. Candidate to reclassify once the family is more settled. Mike's call (one phenomenon = one category). docs/celestialdex-framework.md lists it. transient is still the open one here (satcat resolved; see below).
Infra / tooling
- AutoSense obs-count mirage in the untriaged pile. Came up triaging CASTNET (castnet-outgoing-data, 2.49M obs): the obs are AutoSense catalog stubs (rows pointing at .zip filenames, metric = catalog_csv_row / catalog_file), NOT real measurements. They carry zero values/coords/measurement-times. So a high obs count in Section 3 of the backlog does NOT mean the data is ingested. Many of the 355 untriaged sources are likely the same: AutoSense crawled the directory, never parsed the payload. When triaging, check the metric: if it's catalog_*, the real build still has to fetch+parse the actual files. Don't trust row counts as "we have this data."
- Where is the file-per-slot → spine-parquet threshold for Eventdex kinds? Came up building nuforc (80,324 slots). No crisp rule exists: lightning (133k) is file-per-slot (pre-ruling); the severe-wx split (1.9M) is spine-parquet; nuforc (80k) I put on spine-parquet too, following the severe-wx precedent and the still-pending index fix. So, leaving aside lightning (which predates the ruling), the de-facto line sits somewhere between FEMA (~5k, file-per-slot) and nuforc (80k, spine-parquet). Worth Mike picking a clean cutoff once the scalable per-kind index lands, so the choice stops being ad-hoc.
- Storehouse index cost for file-per-slot kinds. storehouse_index.json is ~101 MB / 297k entries; rebuild_index_from_disk rescans every dossier across ALL kinds (~13 min, lightning's 133k slots dominate), has no atomic write, and is race-prone vs the live scheduler's rebuild. New families (Locationdex, CelestialObjectDex) sidestep it with their own storehouse + scoped index. Fix queued: atomic write + incremental/spine-parquet rebuild. Detail in memory project_storehouse_index_cost. Not blocking dex work, just a tax on the big shared event index.
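
For reference, the "atomic write" half of the queued index fix could look roughly like the sketch below. It is a generic temp-file-then-rename pattern, not the actual planned change: the function name write_index_atomically and the assumption that the index is a plain JSON-serializable dict are illustrative only, and nothing here reflects the real rebuild_index_from_disk code.

```python
import json
import os
import tempfile


def write_index_atomically(index: dict, path: str) -> None:
    """Hypothetical helper: write `index` as JSON so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".index-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(index, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the swap
        # os.replace overwrites the destination; on POSIX the rename is atomic.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This only protects readers from half-written files. Two rebuilds running at once would still be last-writer-wins, so the race with the live scheduler would need a lock or a single-writer rule on top.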
Sources that need Mike before they can be placed
- radiation_monitor finer sources. Shipped on EPA RadNet city-centroid geocodes (b841077); Mike will link more detailed radiation sources (exact monitor coords, gamma gross-count series) to refine. Revisit is a refinement, not a re-placement.
- EU JRC tail (§E), per-source exceptions. The block was ruled OUT as model/scenario, but a handful are genuinely measured research datasets. Pull one into B/D only if Mike flags it by name and it clears the bright line on inspection. Don't crawl the whole tail.
Resolved (kept for the trail)
- celestrak_satcat family question → CelestialObjectDex satcat. Mike ruled (2026-06-26): a satellite is a persistent object with an orbital history, so it is a CelestialObjectDex slot (object), not an Eventdex reentry. Built 69,352 objects, spine-parquet. A reentry is one event in the object's lifecycle. New deferred-v2: the orbital-element history array (period/altitude drift from the live celestrak GP feed, the Starlink-pilot source); v1 carries only the launch/decay lifecycle + current elements. New extension: satcat is the first CelestialObjectDex kind to use spine-parquet (neo is file-per-slot at 18.7k); the two-mode-by-scale rule is now in the framework doc.