Enhance documentation with detailed findings from analysis report

- Add clustered vs isolated facility comparison to README
- Expand infrastructure insights with hyperscaler energy strategies
- Document additional database tables (opposition cases, IM3 projections, utility rates)
- Enhance research ideas with specific watershed names and grid saturation data
- Add data quality notes about EIA longitude corrections
- Reference loaded but unused tables for future analysis
2026-05-27 11:36:50 -07:00
parent 3758dcc02a
commit 46c8c58545
3 changed files with 158 additions and 7 deletions


@@ -412,6 +412,93 @@ Tables are organized into four categories:
---
### Other Tables
#### `opposition_cases_geocoded`
**Rows**: 18
**Purpose**: Geocoded community-opposition cases against data center builds
**Key Columns**:
- `case_id` (TEXT) - Unique identifier
- `developer` (TEXT) - Proposed developer/operator
- `investment_billions` (DOUBLE PRECISION) - Investment amount in billions
- `outcome` (TEXT) - Case outcome (approved, rejected, pending)
- `governance_response` (TEXT) - Government response
- `latitude`, `longitude`, `geom`
**Source**: Compiled from news archives
**Notes**: Loaded but currently unused - see research-ideas.md for proposed analyses
#### `census_tract_huc8_link`
**Rows**: 806
**Purpose**: Tract↔HUC8 spatial overlap table
**Key Columns**:
- `geoid` (TEXT) - Census tract GEOID
- `huc8` (TEXT) - HUC8 watershed code
- `overlap_pct` (DOUBLE PRECISION) - Percentage of tract overlapping watershed
**Notes**: Useful for downstream tract-level water-stress joins; limited to tracts containing data centers
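
One way a tract-level join might use this table is to pick the single HUC8 with the largest overlap for each tract. The sketch below runs against an in-memory SQLite database, not the project database, and the sample rows are invented for illustration. Only the table and column names come from the description above.

```python
import sqlite3

# Hypothetical sample rows, invented for illustration (not real table contents).
SAMPLE_LINKS = [
    # (geoid, huc8, overlap_pct)
    ("51107611000", "02070008", 80.0),
    ("51107611000", "02070010", 20.0),
    ("48113010000", "12030105", 100.0),
]

# Keep, for each tract, the row whose overlap_pct equals that tract's maximum.
# Exact ties would return more than one row per tract.
DOMINANT_HUC8_SQL = """
SELECT l.geoid, l.huc8, l.overlap_pct
FROM census_tract_huc8_link AS l
JOIN (
    SELECT geoid, MAX(overlap_pct) AS max_overlap
    FROM census_tract_huc8_link
    GROUP BY geoid
) AS top
  ON top.geoid = l.geoid AND top.max_overlap = l.overlap_pct
ORDER BY l.geoid
"""


def dominant_huc8(rows):
    """Return one (geoid, huc8, overlap_pct) row per tract, barring ties."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE census_tract_huc8_link "
        "(geoid TEXT, huc8 TEXT, overlap_pct REAL)"
    )
    conn.executemany(
        "INSERT INTO census_tract_huc8_link VALUES (?, ?, ?)", rows
    )
    result = conn.execute(DOMINANT_HUC8_SQL).fetchall()
    conn.close()
    return result


dominant = dominant_huc8(SAMPLE_LINKS)
```

The query uses only a grouped subquery and a self-join, so it should carry over to other SQL engines with little change. A tract split evenly between two watersheds would appear twice, so a tie-break rule would be needed before treating the result as one row per tract.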
#### `im3_state_projected_moderate_50`
**Rows**: 328
**Purpose**: PNNL IM3 projected data center siting (moderate growth, gravity weight 0.50)
**Key Columns**:
- `facility_id` (TEXT)
- `state` (TEXT)
- `cost_millions` (DOUBLE PRECISION)
- `it_mw` (DOUBLE PRECISION) - IT load in megawatts
- `cooling_water_demand_gal_per_day` (DOUBLE PRECISION)
- `latitude`, `longitude`, `geom`
**Source**: PNNL Integrated Multisector Multiscale Modeling (IM3)
**Notes**: Loaded but unused - potential for forward-projection analysis
#### `im3_projected_state_demand_summary`
**Rows**: 31
**Purpose**: State-level rollup of IM3 projected facility counts, IT MW, and cooling demand
**Key Columns**:
- `state` (TEXT)
- `facility_count` (INTEGER)
- `total_it_mw` (DOUBLE PRECISION)
- `total_cooling_demand_mgd` (DOUBLE PRECISION) - Million gallons per day
**Source**: IM3 model outputs
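
The columns above suggest a per-state aggregation of facility-level rows like those in `im3_state_projected_moderate_50`. The sketch below shows what such a rollup could look like, assuming (this is not confirmed by the source) that the summary is derived from that table. It uses an in-memory SQLite database with invented sample rows. The one fixed fact is the unit conversion: gallons per day divided by 1,000,000 gives million gallons per day (MGD).

```python
import sqlite3

# Hypothetical sample rows, invented for illustration (not real IM3 outputs).
SAMPLE_FACILITIES = [
    # (facility_id, state, it_mw, cooling_water_demand_gal_per_day)
    ("f1", "VA", 100.0, 1_500_000.0),
    ("f2", "VA", 50.0, 500_000.0),
    ("f3", "TX", 200.0, 3_000_000.0),
]

# Count facilities, sum IT load, and convert summed gal/day to MGD per state.
ROLLUP_SQL = """
SELECT state,
       COUNT(*) AS facility_count,
       SUM(it_mw) AS total_it_mw,
       SUM(cooling_water_demand_gal_per_day) / 1000000.0
           AS total_cooling_demand_mgd
FROM im3_state_projected_moderate_50
GROUP BY state
ORDER BY state
"""


def rollup_by_state(rows):
    """Aggregate facility-level rows into one summary row per state."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE im3_state_projected_moderate_50 ("
        "facility_id TEXT, state TEXT, it_mw REAL, "
        "cooling_water_demand_gal_per_day REAL)"
    )
    conn.executemany(
        "INSERT INTO im3_state_projected_moderate_50 VALUES (?, ?, ?, ?)",
        rows,
    )
    result = conn.execute(ROLLUP_SQL).fetchall()
    conn.close()
    return result


summary = rollup_by_state(SAMPLE_FACILITIES)
```

Comparing a rollup like this against the stored summary would be one way to check whether the two tables agree, if the derivation assumption holds.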
#### `utility_rate_tracker_2025_2028`
**Rows**: 374
**Purpose**: Utility rate-increase tracker by provider × state × service type
**Key Columns**:
- `provider` (TEXT) - Utility provider name
- `state` (TEXT)
- `service_type` (TEXT)
- `effective_date` (DATE)
- `monthly_increase_dollars` (DOUBLE PRECISION)
- `percent_increase` (DOUBLE PRECISION)
**Source**: Utility rate tracker database
**Notes**: Loaded but unused in demographic/energy analysis
#### `energy_atlas_layers_catalog`
**Rows**: ~5
**Purpose**: Metadata catalog of EIA layers ingested
**Key Columns**:
- `table_name` (TEXT)
- `source_url` (TEXT)
- `import_timestamp` (TIMESTAMP)
- `row_count` (INTEGER)
**Notes**: Created by `ingest_eia_energy_layers.py`
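
Because the catalog stores a `row_count` per `table_name`, it could serve as a quick consistency check after ingestion. The sketch below is illustrative only: it runs on an in-memory SQLite database, the layer tables `layer_a` and `layer_b` are hypothetical, and nothing here reflects what `ingest_eia_energy_layers.py` actually does.

```python
import sqlite3


def find_row_count_mismatches(conn):
    """Return (table_name, recorded, actual) where the catalog count is stale."""
    catalog = conn.execute(
        "SELECT table_name, row_count FROM energy_atlas_layers_catalog "
        "ORDER BY table_name"
    ).fetchall()
    mismatches = []
    for table_name, recorded in catalog:
        # Identifiers cannot be bound as parameters, so quote the name instead.
        quoted = '"' + table_name.replace('"', '""') + '"'
        actual = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        if actual != recorded:
            mismatches.append((table_name, recorded, actual))
    return mismatches


# Hypothetical setup: two invented layer tables and matching catalog entries.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE energy_atlas_layers_catalog ("
    "table_name TEXT, source_url TEXT, import_timestamp TEXT, "
    "row_count INTEGER)"
)
conn.execute("CREATE TABLE layer_a (id INTEGER)")
conn.execute("CREATE TABLE layer_b (id INTEGER)")
conn.executemany("INSERT INTO layer_a VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO layer_b VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO energy_atlas_layers_catalog VALUES (?, ?, ?, ?)",
    [("layer_a", None, None, 3), ("layer_b", None, None, 5)],
)

mismatches = find_row_count_mismatches(conn)
conn.close()
```

A catalog entry that names a table which no longer exists would raise `sqlite3.OperationalError` in this sketch; a production check would probably want to report that case instead of failing.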
---
## Commonly Used Joins
### Data Center to Demographics