
Scope freeze — coral_reef_station Locationdex (6th Locationdex kind)

Decided 2026-06-27 (Mike pasted the data.gov record + ruled "build small standalone Locationdex now"). Source = EPA's 2011 probabilistic coral-reef condition survey along the southern coast of Puerto Rico. The 6th Locationdex kind (docs/locationdex-framework.md) after neutron_monitor, tide_gauge, magnetic_observatory, streamgauge, radiation_monitor. The slot is a PLACE: one reef survey station.

This one broke the session's NARS water-quality pattern (it's coral-reef ecology, not water chemistry; one region; one year), so the shape was an explicit categorization call — Mike chose a standalone Locationdex over parking it.

Slot

One (reef station) slot. Slot id is namespaced by survey: <survey_id>-<station> (e.g. pr2011-1), so stations from different surveys never collide in the file-per-slot store. Region + survey year are slot fields, not the slot key.
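A minimal sketch of the slot-key rule, assuming station ids arrive from the sheets as ints, floats, or strings. The helper names normalize_station and make_slot_id are hypothetical, not taken from the build script.

```python
def normalize_station(raw):
    """Turn a spreadsheet station id (1, 1.0, "1", " 1 ") into a stable string."""
    if raw is None or isinstance(raw, bool):
        raise ValueError(f"not a station id: {raw!r}")
    if isinstance(raw, int):
        return str(raw)
    if isinstance(raw, float):
        if not raw.is_integer():
            raise ValueError(f"non-integer station id: {raw!r}")
        return str(int(raw))
    text = str(raw).strip()
    if not text:
        raise ValueError("empty station id")
    try:
        as_float = float(text)
    except ValueError:
        return text  # non-numeric label: keep as-is
    return str(int(as_float)) if as_float.is_integer() else text


def make_slot_id(survey_id, station):
    """Namespace the station by survey: <survey_id>-<station>."""
    return f"{survey_id}-{normalize_station(station)}"
```

With this, make_slot_id("pr2011", 1.0) and make_slot_id("pr2011", "1") both yield pr2011-1, so the same station cannot land in two slot files.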

This is an EXPANDING kind (Mike, 2026-06-27: "this will definitely not be the last coral reef survey"). The build is a multi-survey registry from the start (SURVEYS dict in the build script): adding a same-format survey = one registry entry; a survey whose sheets differ in shape gets its own per-sheet reader feeding the shared aggregators + dossier (written against the real file, not a guessed schema). build() clears and fully rebuilds the kind dir from source each run, so renamed/removed slots never linger. First survey = PR 2011 (64 stations).
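The registry + clear-and-rebuild behaviour can be sketched roughly as below. Only the names SURVEYS and build() come from this note; the registry entry fields, the stations_by_survey argument, and the plain JSON file-per-slot output are illustrative assumptions (the real build goes through the event_storehouse write-dossier machinery).

```python
import json
import shutil
from pathlib import Path

# Hypothetical registry shape: one entry per survey.
SURVEYS = {
    "pr2011": {"region": "Puerto Rico (southern coast)", "survey_year": 2011},
}


def build(kind_dir, stations_by_survey):
    """Wipe the kind dir, then rewrite every slot from source data.

    stations_by_survey: {survey_id: {station: {layer fields...}}} (assumed shape).
    Returns the list of slot ids written.
    """
    kind_dir = Path(kind_dir)
    if kind_dir.exists():
        shutil.rmtree(kind_dir)  # full clear, so stale slots never linger
    kind_dir.mkdir(parents=True)

    written = []
    for survey_id, meta in SURVEYS.items():
        for station, layers in stations_by_survey.get(survey_id, {}).items():
            slot_id = f"{survey_id}-{station}"
            # Region and survey year are slot fields, not part of the key.
            dossier = {"slot_id": slot_id, "survey_id": survey_id,
                       "station": station, **meta, **layers}
            (kind_dir / f"{slot_id}.json").write_text(json.dumps(dossier, indent=2))
            written.append(slot_id)
    return written
```

Under these assumptions, adding a same-format survey is one more SURVEYS entry, and a station dropped from the source disappears from disk on the next run.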

Source

Data.gov record: https://catalog.data.gov/dataset/2011-pr-survey-data. Seven per-taxon xlsx workbooks on EPA's pasteur host (10.23719/1407509). v1 carries the three headline reef-condition layers + station info; secondary taxa deferred.

Measured reality — IN (bright line feedback_measured_reality_only)

Every value carried is a direct field measurement, all IN:

  • Stony coral (per colony): % live tissue, bleached / diseased tallies, height, max diameter, colony count, taxa richness, colony density.
  • Fish (per species, belt transect): counts by size class → total individuals, species richness, density.
  • Rugosity: draped-length / linear-distance transect ratio (structural complexity index).

The survey's design-based regional condition characterization (the probabilistic population estimate) is not in these raw files and is not carried — consistent with holding out the NARS condition estimates.

Per-station aggregation / gotchas frozen here

  • Density uses the single transect-area value, not a row sum. Every colony in a station shares the one survey transect's area (25 m² for coral, 100 m² for fish); density = count / that area, not count / (rows × area). Getting this wrong would deflate density by a factor of the colony count. If a station ever carries more than one distinct transect area, the build sums the distinct values (one transect per area).
  • Fish total = sum across every size-class bin, across all species rows (counts are spread over 25 size bins, <5 cm to 90-95 cm).
  • Bleached / Diseased / Clionid are "Yes" / blank flags → percent-of-colonies tallies.
  • Station ids are integers in the sheets → normalized to a stable string slot key (1.0 → "1").
  • xlsx via openpyxl (read-only, data_only); curl cache-first (sandbox urllib hang).
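
The aggregation rules above can be sketched as follows. The function names and the in-memory row shapes are hypothetical; this is an illustration of the rules as stated here, not the build script's actual code.

```python
def station_density(count, area_values):
    """count / transect area, where the area is shared by every row.

    area_values holds the per-row area column (e.g. 25 repeated per colony).
    Distinct values are summed, assuming one transect per distinct area.
    """
    total_area = sum({a for a in area_values if a})
    if total_area <= 0:
        raise ValueError("no transect area for station")
    return count / total_area


def fish_total(species_rows):
    """Sum every size-class bin across all species rows; blanks count as 0."""
    return sum(n or 0 for row in species_rows for n in row)


def percent_flagged(flags):
    """Percent of colonies whose "Yes" / blank flag is "Yes"."""
    flags = list(flags)
    if not flags:
        return 0.0
    yes = sum(1 for f in flags if isinstance(f, str) and f.strip().lower() == "yes")
    return 100.0 * yes / len(flags)
```

For a station with 50 coral colonies on a 25 m² transect, station_density(50, [25] * 50) gives 2.0 colonies per m²; dividing by rows × area instead would give 50 / 1250 = 0.04, which is the error the first bullet warns about.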

Storage

Locationdex sibling storehouse data/location_storehouse/coral_reef_station/, file-per-slot (64 slots is negligible for the shared storehouse_index), via the event_storehouse write-dossier + disk-rebuilt-index machinery (base_dir=location_storehouse). Built by scripts/build_coral_reef_station_locationdex.py. 64 station slots, all with the coral layer (coral % live ≈ 79–90%, bleaching 0–10%, fish richness 14–22 spp, rugosity index 1.07–1.63).

Deferred (not in v1)

  • Secondary taxa layers: invertebrates, gorgonians/sponges (SpGorg), Palythoa — the other three workbooks.
  • Sibling EPA regional reef surveys (other years / jurisdictions) that would turn this into a multi-region reef-station network instead of a single 2011 PR snapshot.
  • Spatial sweep / cross-match (swept=false, cross_match="deferred-v2-locationdex"), as with the other Locationdex kinds.
  • Exact per-transect replication (the v1 density assumes one transect per area value).