diff --git a/.claude/MONDAY_CHECKLIST.md b/.claude/MONDAY_CHECKLIST.md
new file mode 100644
index 0000000..fb4948c
--- /dev/null
+++ b/.claude/MONDAY_CHECKLIST.md
@@ -0,0 +1,117 @@
+# Monday morning checklist: EIA ingest verification
+
+Written 2026-05-16 (Saturday). The user will return Monday after the
+scheduled weekly ingest runs at **Mon 03:30** via the systemd user timer
+`ingest-eia-energy-layers.timer`.
+
+## What's new since last session
+
+This was the first weekly run after wiring up a **second EIA endpoint**:
+`electricity/facility-fuel` (Form EIA-923, monthly per-plant generation in
+MWh). Previously only `electricity/operating-generator-capacity` (OGC) was
+ingested.
+
+**Critical unknown:** the facility-fuel column mapping in `build_flat_tables`
+was written from EIA's API docs without confirmation of actual JSON key
+casing. EIA-923 endpoints returned 503 or timed out all day Saturday, so we
+couldn't smoke-test. The `raw_properties` JSONB column is the safety net.
+
+The longitude-sign bug (historical lower-48 longitudes stored as positive
+values for 2008-01 → 2010-11) was fixed in `build_flat_tables` and applied
+in-place to the live table on Saturday. Monday's rebuild should produce
+identical corrected data.
+
+## Step 1: Did the timer fire? Did it succeed?
+
+```bash
+systemctl --user status ingest-eia-energy-layers.service
+journalctl --user -u ingest-eia-energy-layers.service --since "yesterday" | tail -50
+ls -lt output/ingest_*.log | head -3
+```
+
+Look for `Active: inactive (dead)` with `Main PID: ... (code=exited, status=0/SUCCESS)`.
+Anything else means the job failed; read the log.
+
+## Step 2: Check that both tables are populated
+
+```sql
+-- Expected: ~4.7M rows, 2008-01 → ~2026-03
+select count(*), min(period), max(period)
+from public.energy_eia_operating_generator_capacity_flat;
+
+-- Expected (if facility-fuel ingest succeeded): millions of rows, ~2001-01 → ~2026-02
+select count(*), min(period), max(period)
+from public.energy_eia_facility_fuel_flat;
+```
+
+## Step 3: Verify facility-fuel column mapping
+
+This is the one that needs human eyes. Run:
+
+```sql
+select plant_id, plant_name, state_id, state_name,
+       prime_mover_code, prime_mover_desc,
+       energy_source_code, energy_source_desc,
+       generation_mwh, gross_generation_mwh,
+       raw_properties
+from public.energy_eia_facility_fuel_flat
+limit 3;
+```
+
+**If typed columns are populated:** mapping is correct, ship it.
+
+**If typed columns are NULL but `raw_properties` has data:** EIA's actual JSON
+keys differ from my guesses. Inspect `raw_properties` to find the real keys
+(probably some combination of camelCase vs lowercase or hyphens), then patch
+the SELECT in [ingest_eia_energy_layers.py](../ingest_eia_energy_layers.py)
+at the `if "energy_eia_electricity_facility_fuel" in available:` block
+(~line 870).
+
+After patching, do not try to rebuild the flat table with
+`--skip-ingest --endpoint facility-fuel`. `--skip-ingest` bypasses ingest, but
+`build_flat_tables` reads from the *intermediate raw table*, which
+`keep_only_target_flat_table` prunes at end-of-run. After a successful weekly
+run the raw facility-fuel table is gone, so there is nothing to rebuild from.
+To patch the flat columns without a full re-ingest, re-fetch the raw data for
+that one endpoint:
+
+```bash
+set -a && . ~/.zsh_secrets && set +a
+python3 ingest_eia_energy_layers.py --endpoint facility-fuel
+```
+
+That re-ingests *only* facility-fuel (does not touch OGC), then rebuilds
+both flat tables with the corrected SELECT.
+
+## Step 4: Possible failure modes & responses
+
+| Symptom | Diagnosis | Action |
+|---|---|---|
+| Service status = failed, log shows `503` from facility-fuel | EIA-923 still down | Wait, then manually re-run `python3 ingest_eia_energy_layers.py --endpoint facility-fuel` once EIA recovers |
+| Service status = failed, log shows error in OGC ingest | EIA-860 down or different bug | Diagnose from the traceback; OGC has run successfully many times so likely transient |
+| Service succeeded, OGC row count looks right, facility-fuel table missing | Endpoint failed silently without propagating the error; should not happen with current code | Check the log carefully; this would indicate a bug in the new error handling |
+| Service succeeded, both tables present, facility-fuel columns NULL | JSON key casing wrong | See Step 3 patch path |
+
+## Key paths
+
+- Script: [`ingest_eia_energy_layers.py`](../ingest_eia_energy_layers.py)
+- Wrapper: `~/.local/bin/ingest-eia-energy-layers-weekly`
+- Service: `~/.config/systemd/user/ingest-eia-energy-layers.service`
+- Timer: `~/.config/systemd/user/ingest-eia-energy-layers.timer`
+- Per-run logs: `output/ingest_YYYYMMDD_HHMMSS.log` (kept by wrapper)
+- Sample/narrative output: [`output/operating_generator_capacity_sample.txt`](../output/operating_generator_capacity_sample.txt)
+- Facility-fuel narrative (pre-ingest): [`output/facility_fuel_pending_narrative.txt`](../output/facility_fuel_pending_narrative.txt)
+- Env vars sourced from: `~/.zsh_secrets`
+
+## What does NOT need doing
+
+- OGC longitude fix is already deployed (in script + applied in-place Saturday). No re-run needed.
+- systemd unit files: no changes required, the new code path uses the same wrapper.
+- The `--endpoint` flag was added Saturday for ad-hoc per-dataset runs. Useful for re-running just facility-fuel without disturbing OGC.
+
+## Open thread to close out after verification
+
+Update [`output/facility_fuel_pending_narrative.txt`](../output/facility_fuel_pending_narrative.txt)
+once facility-fuel is actually ingested. Replace the "Pending" framing with
+the real row count, period range, and any column-mapping notes from Step 3.
+Mirror the format of `operating_generator_capacity_sample.txt`.
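
If Step 3 of the checklist shows NULL typed columns, comparing the guessed keys against the real ones by eye can be slow. The following is a minimal sketch of one way to do that comparison, not code from `ingest_eia_energy_layers.py`. The function names `normalize_key` and `match_keys` are hypothetical, and the sample payload is purely illustrative, since the real EIA key names are still unconfirmed. The helper strips casing and separators so that camelCase, hyphenated, and snake_case spellings of the same name compare equal.

```python
from __future__ import annotations

import re


def normalize_key(key: str) -> str:
    """Lowercase a key and drop every non-alphanumeric character."""
    return re.sub(r"[^a-z0-9]", "", key.lower())


def match_keys(expected: list[str], raw_properties: dict) -> dict[str, str | None]:
    """Map each guessed key to the actual key that matches after normalization.

    Guessed keys with no counterpart map to None and need a manual look.
    """
    actual_by_norm = {normalize_key(k): k for k in raw_properties}
    return {guess: actual_by_norm.get(normalize_key(guess)) for guess in expected}


# Illustrative payload only; the real EIA keys are unconfirmed.
sample_raw = {"plantCode": "1", "plantName": "Example Plant", "generation": "10.0"}
# Hypothetical guesses, standing in for the keys used in the patched SELECT.
guesses = ["plant_code", "plant_name", "generation", "gross_generation"]

mapping = match_keys(guesses, sample_raw)
for guess, actual in mapping.items():
    print(f"{guess!r} -> {actual!r}")
```

Any guess that maps to `None` has no near-match, which suggests the field is named differently or is absent in `raw_properties`. To use it on real data, paste one `raw_properties` value from the Step 3 query output in place of `sample_raw`.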