Add EIA SEDS ingestion support

2026-05-17 18:52:29 -07:00
parent 614b10b43f
commit 48f23af5b0
2 changed files with 152 additions and 26 deletions

@@ -115,3 +115,42 @@ Update [`output/facility_fuel_pending_narrative.txt`](../output/facility_fuel_pe
once facility-fuel is actually ingested. Replace the "Pending" framing with
the real row count, period range, and any column-mapping notes from Step 3.
Mirror the format of `operating_generator_capacity_sample.txt`.
## New endpoint added — SEDS (State Energy Data System)
Wired up 2026-05-17. Endpoint: `seds` (annual frequency,
`https://api.eia.gov/v2/seds/data/`). Probed live, columns verified, smoke
test of 50 rows landed in `public.energy_eia_seds_flat` with all typed
columns populated.

**Verified JSON keys** (no sector field; sector is encoded in `seriesId`):
`period` (YYYY), `seriesId`, `seriesDescription`, `stateId`,
`stateDescription`, `value`, `unit`.
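
As a minimal sketch of how the verified keys might map onto the typed columns, the function below flattens one API record into a row. The columns `period`, `year`, `series_id`, `state_id`, `value`, and `unit` appear in the spot-check query in this note; `series_description`, `state_description`, the helper names, and the handling of missing or non-numeric `value` entries are assumptions for illustration, not the ingest script's actual code.

```python
from typing import Any, Optional


def _to_float(raw: Any) -> Optional[float]:
    """Return raw as a float, or None when it is missing or non-numeric."""
    if raw is None:
        return None
    try:
        return float(raw)
    except (TypeError, ValueError):
        return None


def flatten_seds_record(record: dict) -> dict:
    """Map one SEDS JSON record to a flat row (hypothetical helper)."""
    period = str(record["period"])
    return {
        "period": period,
        "year": int(period[:4]),  # period is YYYY for this annual endpoint
        "series_id": record.get("seriesId"),
        "series_description": record.get("seriesDescription"),
        "state_id": record.get("stateId"),
        "state_description": record.get("stateDescription"),
        "value": _to_float(record.get("value")),
        "unit": record.get("unit"),
    }
```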

**Total volume:** ~2.57M rows across 65 years (1960–2024), ~40k rows/year.
Ingested year-by-year via the generalized `fetch_eia_pages_by_period` to
stay under EIA's 503 threshold (same pattern as the monthly endpoints).
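
The year-by-year pattern can be sketched roughly as below. This is not the actual `fetch_eia_pages_by_period`: the function name `iter_rows_by_year`, the injected `fetch_page` callable, and the page size of 5,000 are assumptions for illustration. The query parameter names (`frequency`, `data[0]`, `start`, `end`, `offset`, `length`) are meant to follow EIA API v2 conventions and should be checked against the API documentation.

```python
from typing import Callable, Iterator, List

PAGE_SIZE = 5000  # assumed per-request row cap; confirm against the API docs


def iter_rows_by_year(
    first_year: int,
    last_year: int,
    fetch_page: Callable[[dict], List[dict]],
    page_size: int = PAGE_SIZE,
) -> Iterator[dict]:
    """Yield rows one year at a time, paging within each year.

    fetch_page is a hypothetical callable that performs one HTTP request
    with the given query parameters and returns the list of row dicts.
    """
    for year in range(first_year, last_year + 1):
        offset = 0
        while True:
            params = {
                "frequency": "annual",
                "data[0]": "value",
                "start": str(year),
                "end": str(year),
                "offset": offset,
                "length": page_size,
            }
            page = fetch_page(params)
            yield from page
            # A short (or empty) page means this year is exhausted.
            if len(page) < page_size:
                break
            offset += page_size
```

Restricting each request window to a single year keeps every query small, and a failure partway through only requires re-fetching the affected year rather than the whole range.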

**What to verify Monday:**

```sql
-- Expected: ~2.5M+ rows, 1960 → 2024
select count(*), min(year), max(year)
from public.energy_eia_seds_flat;
-- Spot-check that typed columns landed (not all NULL)
select period, year, series_id, state_id, value, unit
from public.energy_eia_seds_flat
order by random()
limit 5;
```
If the row count is well under 2.5M, suspect a mid-run failure: check the log
for `503` errors on the `seds` endpoint and re-run with
`python3 ingest_eia_energy_layers.py --category state_energy --endpoint seds`.
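
To narrow down where a mid-run failure happened, per-year counts can be compared with the expected ~40k rows/year. The sketch below assumes the counts come from a query along the lines of `select year, count(*) from public.energy_eia_seds_flat group by year`; the function name and the 50% cutoff are illustrative assumptions, not part of the ingest script.

```python
from typing import Dict, List


def find_suspect_years(
    counts_by_year: Dict[int, int],
    first_year: int = 1960,
    last_year: int = 2024,
    expected_per_year: int = 40_000,
    min_fraction: float = 0.5,  # assumed cutoff; tune as needed
) -> List[int]:
    """Return years that are missing or hold far fewer rows than expected."""
    floor = expected_per_year * min_fraction
    return [
        year
        for year in range(first_year, last_year + 1)
        if counts_by_year.get(year, 0) < floor
    ]
```

Some years may legitimately hold fewer rows than the average, so a flagged year is a lead to check against the log rather than proof of a failed fetch.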

**Product/API docs for reference:**

- Product page: https://www.eia.gov/state/seds/
- Technical notes: https://www.eia.gov/state/seds/seds-technical-notes-complete.php
- API documentation: https://www.eia.gov/opendata/documentation.php