Monday morning checklist — EIA ingest verification
Written 2026-05-16 (Saturday). The user will return Monday after the
scheduled weekly ingest runs at Mon 03:30 via the systemd user timer
ingest-eia-energy-layers.timer.
What's new since last session
Monday's run is the first weekly run since wiring up a second EIA endpoint:
electricity/facility-fuel (Form EIA-923, monthly per-plant generation in
MWh). Previously only electricity/operating-generator-capacity was ingested.
Critical unknown: the facility-fuel column mapping in build_flat_tables
was written from EIA's API docs without confirmation of actual JSON key
casing. The EIA-923 endpoints returned 503s or timed out all day Saturday, so we
couldn't smoke-test. The raw_properties JSONB column is the safety net: it should
still hold the payload even if the typed columns come out NULL.
The longitude-sign bug (historical lower-48 longitudes stored as positive values
for periods 2008-01 → 2010-11) was fixed in build_flat_tables and applied
in-place to the live table on Saturday. Monday's rebuild should produce
identical corrected data.
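To confirm the rebuild did not reintroduce the sign bug, a small check along these lines could be run over rows pulled from the OGC flat table. This is only a sketch: the function names are hypothetical, and it assumes rows arrive as (state_id, longitude) pairs with two-letter postal codes, which may not match the actual column names or types.

```python
# States/territories outside the contiguous US. Anything else is treated as
# lower-48. ASSUMPTION: state_id holds two-letter postal codes.
NON_CONTIGUOUS = {"AK", "HI", "PR", "GU", "VI", "AS", "MP"}


def has_longitude_sign_bug(state_id, longitude):
    """Return True when a lower-48 row carries a positive (eastern) longitude."""
    if state_id is None or longitude is None:
        return False
    return state_id.upper() not in NON_CONTIGUOUS and longitude > 0


def count_sign_bugs(rows):
    """Count offending rows; rows is an iterable of (state_id, longitude)."""
    return sum(1 for state_id, lon in rows if has_longitude_sign_bug(state_id, lon))
```

A non-zero count for periods 2008-01 through 2010-11 would suggest the fix in build_flat_tables did not take effect on the rebuild.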
Step 1 — Did the timer fire? Did it succeed?
systemctl --user status ingest-eia-energy-layers.service
journalctl --user -u ingest-eia-energy-layers.service --since "yesterday" | tail -50
ls -lt output/ingest_*.log | head -3
Look for Active: inactive (dead) with Main PID: ... (code=exited, status=0/SUCCESS).
Anything else = job failed; read the log.
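The same check can be scripted rather than read by eye. The sketch below parses the key=value output of `systemctl --user show` and applies the success criteria described above. The helper names are hypothetical, and it assumes the unit is a run-to-completion service that ends up inactive after a clean exit; the exit timestamp check is there so a unit that has never run is not mistaken for a success.

```python
import subprocess


def parse_show_output(text):
    """Turn `systemctl show` KEY=VALUE lines into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def last_run_succeeded(props):
    """True if the unit has exited cleanly at least once and is now idle."""
    return (
        props.get("ActiveState") == "inactive"
        and props.get("Result") == "success"
        and props.get("ExecMainStatus") == "0"
        and bool(props.get("ExecMainExitTimestamp"))
    )


def fetch_service_props(unit="ingest-eia-energy-layers.service"):
    """Query the user-level systemd instance for the properties checked above."""
    out = subprocess.run(
        ["systemctl", "--user", "show", unit,
         "-p", "ActiveState", "-p", "Result",
         "-p", "ExecMainStatus", "-p", "ExecMainExitTimestamp"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_show_output(out)
```

Calling `last_run_succeeded(fetch_service_props())` on the ingest host gives a yes/no answer; on False, the journal and the newest output/ingest_*.log are still the place to look.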
Step 2 — Check both tables populated
-- Expected: ~4.7M rows, 2008-01 → ~2026-03
select count(*), min(period), max(period)
from public.energy_eia_operating_generator_capacity_flat;
-- Expected (if facility-fuel ingest succeeded): millions of rows, ~2001-01 → ~2026-02
select count(*), min(period), max(period)
from public.energy_eia_facility_fuel_flat;
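Once the three values from each query are in hand, the comparison against expectations can be made mechanical. The following is a rough sketch with hypothetical names and thresholds; it assumes period comes back as zero-padded YYYY-MM text so that plain string comparison orders correctly, which should be checked against the real column type.

```python
def check_table_stats(row_count, min_period, max_period,
                      min_rows, expected_first, latest_at_least):
    """Return a list of human-readable problems; an empty list means OK.

    ASSUMPTION: periods are zero-padded 'YYYY-MM' strings, so lexicographic
    comparison matches chronological order.
    """
    if not row_count or min_period is None or max_period is None:
        return ["table is empty"]
    problems = []
    if row_count < min_rows:
        problems.append(f"row count {row_count} below expected minimum {min_rows}")
    if min_period != expected_first:
        problems.append(f"earliest period {min_period}, expected {expected_first}")
    if max_period < latest_at_least:
        problems.append(f"latest period {max_period} older than {latest_at_least}")
    return problems
```

For the OGC table, something like `check_table_stats(count, lo, hi, 4_000_000, "2008-01", "2026-01")` would flag a short or stale load; the 4,000,000 floor and the 2026-01 cutoff are illustrative tolerances around the expected values, not figures from the ingest code.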
Step 3 — Verify facility-fuel column mapping
This is the one that needs human eyes. Run:
select plant_id, plant_name, state_id, state_name,
prime_mover_code, prime_mover_desc,
energy_source_code, energy_source_desc,
generation_mwh, gross_generation_mwh,
raw_properties
from public.energy_eia_facility_fuel_flat
limit 3;
If typed columns are populated: mapping is correct, ship it.
If typed columns are NULL but raw_properties has data: EIA's actual JSON
keys differ from my guesses. Inspect raw_properties to find the real keys
(probably some combination of camelCase vs lowercase or hyphens), then patch
the SELECT in ingest_eia_energy_layers.py
at the if "energy_eia_electricity_facility_fuel" in available: block
(~line 870). After patching, rebuild just the flat table without re-ingesting:
set -a && . ~/.zsh_secrets && set +a
python3 ingest_eia_energy_layers.py --skip-ingest --endpoint facility-fuel
Caveat: --skip-ingest bypasses ingest, but build_flat_tables reads from
the intermediate raw table, which keep_only_target_flat_table prunes at the
end of each run. So after a successful weekly run the raw facility-fuel table
is gone and the --skip-ingest command has nothing to rebuild from. To patch the
flat columns without a full re-ingest of every endpoint, re-fetch just the
facility-fuel raw data:
python3 ingest_eia_energy_layers.py --endpoint facility-fuel
That re-ingests only facility-fuel (does not touch OGC), then rebuilds both flat tables with the corrected SELECT.
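When the typed columns come back NULL, lining up the guessed keys against what raw_properties actually contains can be done by comparing them with casing and separators stripped. The sketch below is one way to do that; the function names are hypothetical, and the sample keys in the usage note are invented for illustration, not EIA's real field names.

```python
import re


def normalize(key):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", key.lower())


def suggest_key_mapping(raw_properties, guessed_keys):
    """Map each guessed key to the actual key that matches once normalized.

    Returns {guessed: actual_or_None}. A None value means no key in
    raw_properties matched even after ignoring case, hyphens and underscores.
    """
    by_norm = {}
    for actual in raw_properties:
        # Keep the first actual key seen for each normalized form.
        by_norm.setdefault(normalize(actual), actual)
    return {guess: by_norm.get(normalize(guess)) for guess in guessed_keys}
```

Feeding it one raw_properties value (as a dict) and the list of keys used in the SELECT shows which guesses only differ by casing or hyphens, for example a guess of plant_name against a hypothetical actual key plantName. Guesses that map to None need a manual look at the payload.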
Step 4 — Possible failure modes & responses
| Symptom | Diagnosis | Action |
|---|---|---|
| Service status = failed, log shows 503 from facility-fuel | EIA-923 still down | Wait, then manually re-run python3 ingest_eia_energy_layers.py --endpoint facility-fuel when EIA recovers |
| Service status = failed, log shows error in OGC ingest | EIA-860 down or different bug | Diagnose from the traceback; OGC has run successfully many times so likely transient |
| Service succeeded, OGC row count looks right, facility-fuel table missing | Endpoint silently failed but didn't propagate — should not happen with current code | Check log carefully; bug in the new error handling |
| Service succeeded, both tables present, facility-fuel columns NULL | JSON key casing wrong | See Step 3 patch path |
Key paths
- Script: ingest_eia_energy_layers.py
- Wrapper: ~/.local/bin/ingest-eia-energy-layers-weekly
- Service: ~/.config/systemd/user/ingest-eia-energy-layers.service
- Timer: ~/.config/systemd/user/ingest-eia-energy-layers.timer
- Per-run logs: output/ingest_YYYYMMDD_HHMMSS.log (kept by wrapper)
- Sample/narrative output: output/operating_generator_capacity_sample.txt
- Facility-fuel narrative (pre-ingest): output/facility_fuel_pending_narrative.txt
- Env vars sourced from: ~/.zsh_secrets
What does NOT need doing
- OGC longitude fix is already deployed (in script + applied in-place Saturday). No re-run needed.
- systemd unit files: no changes required, the new code path uses the same wrapper.
- The --endpoint flag was added Saturday for ad-hoc per-dataset runs. Useful for re-running just facility-fuel without disturbing OGC.
Open thread to close out after verification
Update output/facility_fuel_pending_narrative.txt
once facility-fuel is actually ingested. Replace the "Pending" framing with
the real row count, period range, and any column-mapping notes from Step 3.
Mirror the format of operating_generator_capacity_sample.txt.