got the ingest for energy eia data. created txt files of their descriptions
output/operating_generator_capacity_sample.txt
@@ -0,0 +1,134 @@
================================================================================
EIA Operating Generator Capacity: Sample Rows + Narrative
Generated 2026-05-16 from public.energy_eia_operating_generator_capacity_flat
================================================================================

WHAT THIS DATA IS
-----------------
This table is a flat, queryable view of EIA's "operating-generator-capacity"
endpoint (https://api.eia.gov/v2/electricity/operating-generator-capacity/).
The underlying source is Form EIA-860, which inventories every electric
generator in the United States that is reported as operating (or recently
operating) by its owner.

Each row represents one generator's reported status in one month. A single
power plant typically has multiple generators, so a plant like Plant Barry in
Alabama appears as several rows per month, one for each generator unit
(generator_id 1, 2, 3, ...). The same generator reappears every month it
remains in the inventory, so the table is a time series of (plant × generator
× month) records.
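
The row grain described above can be checked with a short sketch. It assumes
the (period, plant_id, generator_id) triples from the FIRST 5 ROWS listing in
this file have been copied into Python tuples; the variable names are
illustrative and not part of the ingest.

```python
from collections import Counter

# (period, plant_id, generator_id) triples copied from the FIRST 5 ROWS listing.
rows = [
    ("2008-01", "2", "1"),
    ("2008-01", "3", "1"),
    ("2008-01", "3", "2"),
    ("2008-01", "3", "3"),
    ("2008-01", "3", "4"),
]

# One row per (plant, generator, month): that triple should never repeat.
key_counts = Counter((plant_id, gen_id, period) for period, plant_id, gen_id in rows)
duplicates = [key for key, n in key_counts.items() if n > 1]

# Rolling up to the plant level counts generator units per plant-month
# (within this five-row sample only).
units_per_plant_month = Counter((plant_id, period) for period, plant_id, _ in rows)

print(duplicates)                               # []
print(units_per_plant_month[("3", "2008-01")])  # 4
```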

WHAT IT TELLS US
----------------
For each generator, in each reporting month:
  - Where it is (state, balancing authority, exact latitude/longitude)
  - Who owns or operates it (entity_id, entity_name)
  - What fuel/energy source it uses (energy_source_code + descriptive name)
  - How it generates electricity (prime_mover_code, e.g. ST=steam turbine,
    HY=hydro, IC=internal combustion, WT=wind turbine)
  - Its current operating status (status code, see below)
  - What sector it serves (utility, IPP, industrial, commercial, etc.)

What it does NOT tell us is how much electricity the generator actually
produces in that month. That data comes from a separate EIA endpoint
("facility-fuel", Form EIA-923), captured in a sibling table.

STATUS CODES IN THIS TABLE
--------------------------
  OP   Operating                 4,229,083 rows
  SB   Standby / backup            339,057 rows
  OS   Out of service               99,816 rows
  OA   Out of service (annual)      28,769 rows
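
As a quick consistency check, the four status counts should add up to the
table's total row count. The sketch below only re-does that arithmetic on the
figures listed above; it does not query the database.

```python
# Row counts per status code, copied from the listing above.
status_rows = {"OP": 4_229_083, "SB": 339_057, "OS": 99_816, "OA": 28_769}

total = sum(status_rows.values())
print(total)  # 4696725

# Share of rows per status, as a percentage of the total.
for code, n in status_rows.items():
    print(f"{code}  {n:>9,}  {100 * n / total:5.2f}%")
```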

SUMMARY STATISTICS
------------------
  Total rows:                                        4,696,725
  Distinct generators (by plant_id × generator_id):  ~75k
  Distinct plants (plant_id):                        15,791
  Distinct states/territories:                       51
  Distinct months covered:                           218
  Period range:                                      2008-01 → 2026-02
  Rows with lat/lon geometry:                        4,685,500 (99.76%)
  Distinct fuel codes:                               38
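
The geometry coverage figure follows directly from the two row counts above.
A minimal sketch of that arithmetic, using only the numbers in this section:

```python
total_rows = 4_696_725
rows_with_geometry = 4_685_500

# Rows that have no usable latitude/longitude.
rows_without_geometry = total_rows - rows_with_geometry

# Share of rows that carry a geometry, as a percentage.
coverage_pct = 100 * rows_with_geometry / total_rows

print(rows_without_geometry)   # 11225
print(round(coverage_pct, 2))  # 99.76
```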

TOP 10 FUELS BY ROW COUNT
-------------------------
  Natural Gas                           1,301,782
  Water (hydro)                           908,741
  Distillate Fuel Oil*                    767,207
  Solar                                   624,113
  Landfill Gas                            317,709
  Wind                                    245,214
  Bituminous Coal                         108,352
  Subbituminous Coal                       75,587
  Electricity used for energy storage      43,833
  Geothermal                               41,066

  * EIA stores this as "Disillate Fuel Oil" (sic). The misspelling is in
    EIA's source data, not introduced by ingest. Preserved verbatim.

FIRST 5 ROWS (earliest period, ordered by plant_id)
---------------------------------------------------
 period  | plant_id | plant_name   | state | entity_name      | gen_id | status | fuel             | pm | latitude  | longitude
---------+----------+--------------+-------+------------------+--------+--------+------------------+----+-----------+-----------
 2008-01 | 2        | Bankhead Dam | AL    | Alabama Power Co | 1      | OP     | Water            | HY | 33.218889 | -87.579722
 2008-01 | 3        | Barry        | AL    | Alabama Power Co | 1      | OP     | Bituminous Coal  | ST | 31.004167 | -88.013889
 2008-01 | 3        | Barry        | AL    | Alabama Power Co | 2      | OP     | Bituminous Coal  | ST | 31.004167 | -88.013889
 2008-01 | 3        | Barry        | AL    | Alabama Power Co | 3      | OP     | Bituminous Coal  | ST | 31.004167 | -88.013889
 2008-01 | 3        | Barry        | AL    | Alabama Power Co | 4      | OP     | Bituminous Coal  | ST | 31.004167 | -88.013889

(Both plants are in Alabama. Bankhead Dam is a hydro facility on the Black
Warrior River, and Plant Barry is a coal-fired steam plant near Mobile. Both
were operating in January 2008.)

LAST 5 ROWS (latest period, ordered by plant_id)
------------------------------------------------
 period  | plant_id | plant_name | state | entity_name                | gen_id | status | fuel                | pm | latitude  | longitude
---------+----------+------------+-------+----------------------------+--------+--------+---------------------+----+-----------+-------------
 2026-02 | 1        | Sand Point | AK    | Sand Point Generating, LLC | 1      | SB     | Disillate Fuel Oil  | IC | 55.339722 | -160.497222
 2026-02 | 1        | Sand Point | AK    | Sand Point Generating, LLC | 2      | OP     | Disillate Fuel Oil  | IC | 55.339722 | -160.497222
 2026-02 | 1        | Sand Point | AK    | Sand Point Generating, LLC | 3      | OP     | Disillate Fuel Oil  | IC | 55.339722 | -160.497222
 2026-02 | 1        | Sand Point | AK    | Sand Point Generating, LLC | 5.1    | OP     | Disillate Fuel Oil  | IC | 55.339722 | -160.497222
 2026-02 | 1        | Sand Point | AK    | Sand Point Generating, LLC | WT1    | OS     | Wind                | WT | 55.339722 | -160.497222

(Sand Point is a small remote-Alaska community station with five generators:
four diesel internal-combustion units and one wind turbine. The wind turbine
is currently out of service.)

KNOWN DATA-QUALITY QUIRKS IN EIA'S SOURCE DATA
----------------------------------------------
  - Historical longitude sign bug (FIXED at ingest time, 2026-05-16).
    For reporting periods 2008-01 through 2010-11, EIA stored lower-48
    longitudes as positive numbers (Bankhead Dam was +87.579722 instead
    of -87.579722). EIA cleaned this up in their own data starting
    2010-12, but the historical periods still had the bug. The flat
    table's build step now applies:

        CASE WHEN longitude > 0 AND state_id <> 'AK'
             THEN -longitude ELSE longitude END

    and rebuilds geom from the corrected coordinates. Alaska is
    excluded because some Aleutian plants (~11k bug-era rows) are
    legitimately across the 180° meridian, with positive (east)
    longitudes. Affected non-AK rows fixed: 403,558. After the fix,
    every plant in the table is at a geographically plausible US
    location.

  - Fuel description "Disillate Fuel Oil" (missing 't', should be
    "Distillate"): EIA's spelling, preserved as-is in energy_source_desc.
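
For readers who want to apply the same longitude rule outside the database,
here is a minimal Python sketch of the CASE expression above. The function
name is hypothetical; the None handling mirrors how SQL passes a NULL
longitude through the ELSE branch unchanged.

```python
from typing import Optional


def corrected_longitude(longitude: Optional[float], state_id: str) -> Optional[float]:
    """Flip positive longitudes to negative, except for Alaska rows."""
    if longitude is None:
        # In SQL, NULL > 0 is not true, so the ELSE branch returns NULL.
        return None
    if longitude > 0 and state_id != "AK":
        return -longitude
    return longitude


# Bankhead Dam in the bug era: stored as +87.579722, corrected to the west.
print(corrected_longitude(87.579722, "AL"))   # -87.579722
# Already-correct values are left alone.
print(corrected_longitude(-87.579722, "AL"))  # -87.579722
# Alaska is excluded, so a positive longitude is kept (illustrative value).
print(corrected_longitude(174.1, "AK"))       # 174.1
```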

REFRESH CADENCE
---------------
A systemd user timer rebuilds this table every Monday at 03:30 local time
via ~/.local/bin/ingest-eia-energy-layers-weekly. The ingest fetches the
full dataset month by month (Jan 2008 → current) and rebuilds the flat table
from scratch each run.
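
The month-by-month fetch implies a loop over reporting periods. The sketch
below shows one way to enumerate them. It is not the ingest script itself:
the function name is hypothetical, and the end date is pinned to the latest
period reported in this file rather than to today's date.

```python
from datetime import date


def monthly_periods(start: date, end: date) -> list[str]:
    """Return every YYYY-MM period from start to end, inclusive."""
    periods = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        periods.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return periods


periods = monthly_periods(date(2008, 1, 1), date(2026, 2, 1))
print(periods[0], periods[-1], len(periods))  # 2008-01 2026-02 218
```

The count of 218 matches "Distinct months covered" in the summary statistics.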

JOIN KEY FOR DOWNSTREAM ANALYSIS
--------------------------------
plant_id (text) joins to the forthcoming energy_eia_facility_fuel_flat
table (Form EIA-923), which provides monthly net and gross generation in MWh
for the same plants. Together, the two tables answer:

  - WHERE energy is generated (this table, with lat/lon)
  - WHAT is generated and by whom (this table, with fuel + entity)
  - HOW MUCH is generated each month (facility_fuel_flat, in MWh)
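
A rough sketch of how such a join might look in Python follows. The
facility-fuel table does not exist yet, so its column name
(net_generation_mwh), its plant-month grain, and the generation values are
all placeholders, not EIA figures. The sketch also assumes the join uses
period alongside plant_id, and it collapses this table to one record per
plant-month first so that a plant's generation is not repeated once per
generator unit.

```python
# Rows from this table (FIRST 5 ROWS listing), trimmed to join-relevant columns.
capacity_rows = [
    {"period": "2008-01", "plant_id": "2", "plant_name": "Bankhead Dam", "fuel": "Water"},
    {"period": "2008-01", "plant_id": "3", "plant_name": "Barry", "fuel": "Bituminous Coal"},
    {"period": "2008-01", "plant_id": "3", "plant_name": "Barry", "fuel": "Bituminous Coal"},
]

# HYPOTHETICAL facility-fuel rows: column name and values are placeholders.
generation_rows = [
    {"period": "2008-01", "plant_id": "2", "net_generation_mwh": 100.0},
    {"period": "2008-01", "plant_id": "3", "net_generation_mwh": 200.0},
]

# Collapse generators to one record per (plant_id, period).
plants = {}
for row in capacity_rows:
    key = (row["plant_id"], row["period"])
    plant = plants.setdefault(key, {"plant_name": row["plant_name"], "fuels": set()})
    plant["fuels"].add(row["fuel"])

# Inner join: keep generation rows that have a matching plant-month.
joined = []
for gen in generation_rows:
    plant = plants.get((gen["plant_id"], gen["period"]))
    if plant is not None:
        joined.append({**gen, "plant_name": plant["plant_name"], "fuels": sorted(plant["fuels"])})

print(len(joined))  # 2
```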