Document database table previews

2026-06-09 15:04:47 -07:00
parent 6db5e0fff8
commit 176f3d1eb6
4 changed files with 1106 additions and 16 deletions

@@ -167,6 +167,163 @@ ORDER BY current.pct_grid_saturated DESC;
---
## Ready-to-Run Analyses Enabled by New Context Tables
**Status**: Four one-row-per-facility context tables are now loaded and documented:
- `data_center_historical_climate`
- `data_center_usdm_drought_exposure`
- `data_center_hms_smoke_exposure`
- `data_center_election_context`
These make several publishable descriptive analyses possible without another major ingestion step.
### Climate Exposure and Cooling Burden
**Core idea**: Data centers carry energy-intensive cooling loads. The historical climate table lets us ask whether facilities are already sited in hotter, more humid (higher wet-bulb), or more cooling-intensive climates.
**Research Questions**:
- Are clustered facilities in hotter or more humid climate regimes than isolated facilities?
- Do hyperscalers choose cooler/non-metro climates more often than colocation providers?
- Are facilities with high `annual_cooling_degree_days_c_mean` or `extreme_wet_bulb_days` also near constrained grids?
- Do hotter sites overlap with lower-income or politically less powerful communities?
**Suggested Output**: `output/data_center_climate_exposure_summary.md`
**Starter Query**:
```sql
SELECT
    dc.state,
    COUNT(*) AS facilities,
    AVG(c.mean_annual_temperature_c) AS mean_temp_c,
    AVG(c.annual_cooling_degree_days_c_mean) AS annual_cdd_c,
    AVG(c.extreme_wet_bulb_days) AS extreme_wet_bulb_days
FROM master_data_centers dc
JOIN data_center_historical_climate c USING (master_id)
GROUP BY dc.state
HAVING COUNT(*) >= 10
ORDER BY annual_cdd_c DESC;
```
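
For readers less familiar with the cooling-degree-day metric behind `annual_cooling_degree_days_c_mean`, here is a minimal sketch of the usual definition: the sum, over days, of how far the daily mean temperature exceeds a base temperature. The 18 °C base, the function name, and the sample values are assumptions for illustration; the base actually used to build `data_center_historical_climate` is not stated here.

```python
def cooling_degree_days(daily_mean_temps_c, base_c=18.0):
    """Sum of daily exceedances above base_c, in degree-days (Celsius).

    base_c=18.0 is an assumed convention, not a value taken from the table docs.
    """
    return sum(max(0.0, t - base_c) for t in daily_mean_temps_c)


# Hypothetical five-day sample of daily mean temperatures in Celsius.
sample = [16.0, 18.0, 21.5, 25.0, 30.0]
cdd = cooling_degree_days(sample)
print(cdd)  # 22.5 (0 + 0 + 3.5 + 7.0 + 12.0)
```

Days at or below the base contribute nothing, so the metric tracks cooling demand rather than average warmth.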
### Drought Exposure and Water-Use Politics
**Core idea**: The USDM summary table makes drought exposure measurable at each facility, and it can be joined to HUC8 watersheds, opposition cases, and climate metrics.
**Research Questions**:
- Which major clusters have the highest share of weeks in D2+ drought?
- Are water-sensitive regions still attracting new or projected facilities?
- Are opposition cases more common where `pct_weeks_in_d2_or_worse` or `longest_d2_streak_weeks` is high?
- Do non-metro hyperscaler sites trade cheaper land/power for higher drought exposure?
**Suggested Output**: `output/data_center_drought_water_risk_summary.md`
**Starter Query**:
```sql
SELECT
    w.huc8,
    w.huc8_name,
    COUNT(*) AS facilities,
    AVG(d.pct_weeks_in_d2_or_worse) AS avg_pct_d2_or_worse,
    MAX(d.longest_d2_streak_weeks) AS max_d2_streak_weeks
FROM data_center_watershed_huc8 w
JOIN data_center_usdm_drought_exposure d USING (master_id)
GROUP BY w.huc8, w.huc8_name
HAVING COUNT(*) >= 5
ORDER BY avg_pct_d2_or_worse DESC;
```
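
To make the two drought metrics in the query concrete, here is a hedged sketch of how a share of weeks in D2+ and a longest D2+ streak could be derived from a weekly series. It does not reproduce the project's ingestion code. The integer encoding (-1 for no drought, 0 to 4 for D0 to D4), the 0-100 percentage scale, and the function name `drought_summary` are all assumptions.

```python
def drought_summary(weekly_categories, threshold=2):
    """Return (pct_weeks_at_or_above_threshold, longest_streak_weeks).

    weekly_categories: one int per week, -1 for no drought and 0..4 for
    D0..D4 (an encoding assumed for this sketch).
    """
    if not weekly_categories:
        return 0.0, 0
    hits = 0
    longest = current = 0
    for cat in weekly_categories:
        if cat >= threshold:
            hits += 1
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # any week below the threshold breaks the streak
    return 100.0 * hits / len(weekly_categories), longest


# Hypothetical ten-week series for one facility.
weeks = [-1, 0, 2, 3, 3, 1, 2, 2, 4, 4]
pct, streak = drought_summary(weeks)
print(pct, streak)  # 70.0 4
```

The two numbers answer different questions: the percentage captures how often a site is in severe drought, while the streak captures how long a single episode can last.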
### Wildfire Smoke, Operational Resilience, and Worker Exposure
**Core idea**: Smoke exposure is a climate-adaptation issue for facility operations and for workers who build, maintain, and secure these sites.
**Research Questions**:
- Are facilities in the West and Mountain West systematically more smoke-exposed?
- Do major clusters create regional redundancy risk because many facilities share the same smoke exposure profile?
- Are smoke-exposed data centers in communities already facing higher FEMA NRI risk or lower resilience scores?
- Do smoke exposure patterns differ by operator strategy?
**Suggested Output**: `output/data_center_smoke_resilience_summary.md`
**Starter Query**:
```sql
SELECT
    dc.state,
    COUNT(*) AS facilities,
    -- Source columns are percentages of days, so the aliases say "pct".
    AVG(s.pct_days_with_any_smoke) AS avg_pct_days_any_smoke,
    AVG(s.pct_days_with_heavy_smoke) AS avg_pct_days_heavy_smoke,
    MAX(s.longest_heavy_smoke_streak_days) AS max_heavy_smoke_streak
FROM master_data_centers dc
JOIN data_center_hms_smoke_exposure s USING (master_id)
GROUP BY dc.state
HAVING COUNT(*) >= 10
ORDER BY avg_pct_days_heavy_smoke DESC;
```
### Political Geography of Host Communities
**Core idea**: `data_center_election_context` provides a rough but reusable local political context for each facility. It is not a causal measure of support/opposition, but it can help frame siting politics and legislative outcomes.
**Research Questions**:
- Are data centers more common in precincts with stronger Democratic or Republican vote shares?
- Do clustered and isolated facilities sit in different local political environments?
- Are opposition cases associated with precinct partisanship, turnout, or close elections?
- Do state-level data center bills emerge from states where host precincts differ from statewide political averages?
**Suggested Output**: `output/data_center_political_geography_summary.md`
**Starter Query**:
```sql
SELECT
    dc.state,
    COUNT(*) AS facilities,
    -- 1.0 * forces decimal division where integer / integer truncates (e.g. PostgreSQL).
    AVG(1.0 * ec.democratic_votes / NULLIF(ec.total_votes, 0)) AS avg_dem_vote_share,
    AVG(1.0 * ec.republican_votes / NULLIF(ec.total_votes, 0)) AS avg_rep_vote_share
FROM master_data_centers dc
JOIN data_center_election_context ec USING (master_id)
WHERE ec.total_votes > 0
GROUP BY dc.state
HAVING COUNT(*) >= 10
ORDER BY facilities DESC;
```
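
The question about host precincts versus statewide averages can be sketched as a simple gap calculation. This is an illustrative outline, not an established method of the project: `host_vs_statewide_gap` and the `statewide_dem_share` lookup are hypothetical, the sample rows are invented, and only `democratic_votes` and `total_votes` come from the query above.

```python
from collections import defaultdict


def host_vs_statewide_gap(facilities, statewide_dem_share):
    """Mean host-precinct Democratic share minus the statewide share, per state.

    facilities: iterable of dicts with "state", "democratic_votes", "total_votes".
    statewide_dem_share: dict of state -> share in [0, 1] (hypothetical input).
    """
    shares = defaultdict(list)
    for f in facilities:
        if f["total_votes"] > 0:  # skip precincts with no recorded votes
            shares[f["state"]].append(f["democratic_votes"] / f["total_votes"])
    return {
        state: sum(vals) / len(vals) - statewide_dem_share[state]
        for state, vals in shares.items()
        if state in statewide_dem_share
    }


# Invented example rows; real values would come from data_center_election_context.
rows = [
    {"state": "VA", "democratic_votes": 750, "total_votes": 1000},
    {"state": "VA", "democratic_votes": 500, "total_votes": 1000},
    {"state": "VA", "democratic_votes": 0, "total_votes": 0},
]
gaps = host_vs_statewide_gap(rows, {"VA": 0.5})
print(gaps)  # {'VA': 0.125}
```

A positive gap means host precincts lean more Democratic than the state as a whole. As with the table itself, this is descriptive context, not a causal measure.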
### Compound Exposure Index
**Core idea**: Combine NRI, historical climate, drought, smoke, watershed concentration, and demographics into a transparent screening index for cumulative exposure.
**Research Questions**:
- Which facilities or clusters have high climate, drought, smoke, and FEMA risk simultaneously?
- Are compound-exposure sites demographically different from lower-exposure sites?
- Do projected IM3 facilities fall into lower- or higher-risk exposure profiles than current facilities?
**Implementation Notes**:
- Standardize each indicator as a percentile rank before combining.
- Keep the index descriptive and auditable; avoid black-box weighting.
- Report sensitivity using equal weights, environment-only weights, and infrastructure-weighted variants.
**Suggested Output**: `output/data_center_compound_exposure_index.csv`
**Starter Query**:
```sql
-- First-look listing only: the ORDER BY sorts lexicographically (cooling
-- degree days first), so it is not yet the percentile-rank index described
-- in the Implementation Notes.
WITH joined AS (
    SELECT
        dc.master_id,
        dc.name,
        dc.state,
        c.annual_cooling_degree_days_c_mean,
        d.pct_weeks_in_d2_or_worse,
        s.pct_days_with_heavy_smoke,
        n."RISK_SCORE"
    FROM master_data_centers dc
    LEFT JOIN data_center_historical_climate c USING (master_id)
    LEFT JOIN data_center_usdm_drought_exposure d USING (master_id)
    LEFT JOIN data_center_hms_smoke_exposure s USING (master_id)
    LEFT JOIN data_center_nri_exposure n USING (master_id)
)
SELECT *
FROM joined
ORDER BY
    annual_cooling_degree_days_c_mean DESC NULLS LAST,
    pct_weeks_in_d2_or_worse DESC NULLS LAST,
    pct_days_with_heavy_smoke DESC NULLS LAST
LIMIT 50;
```
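
The implementation notes for this index (percentile-rank each indicator, then combine with explicit weights) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's method: `percent_rank` and `compound_index` are hypothetical helpers, the sample values are invented, and renormalizing weights over the indicators that are present for a facility is one possible way to handle missing data. The indicator names are taken from the starter query.

```python
def percent_rank(values):
    """Percentile rank in [0, 1] per value, mirroring SQL PERCENT_RANK().

    Missing values (None) stay None and are excluded from the ranking.
    """
    present = [v for v in values if v is not None]
    n = len(present)
    out = []
    for v in values:
        if v is None:
            out.append(None)
        elif n == 1:
            out.append(0.0)
        else:
            out.append(sum(x < v for x in present) / (n - 1))
    return out


def compound_index(indicators, weights=None):
    """Weighted mean of percentile ranks per facility.

    indicators: dict of name -> list of values, all lists in the same
    facility order. weights: dict of name -> weight; names left out get
    weight 0. Weights are renormalized over non-missing indicators
    (an assumption of this sketch).
    """
    names = list(indicators)
    weights = weights or {name: 1.0 for name in names}
    ranks = {name: percent_rank(indicators[name]) for name in names}
    n_rows = len(next(iter(indicators.values())))
    scores = []
    for i in range(n_rows):
        num = den = 0.0
        for name in names:
            r = ranks[name][i]
            w = weights.get(name, 0.0)
            if r is not None and w > 0:
                num += w * r
                den += w
        scores.append(num / den if den else None)
    return scores


# Invented values for three facilities.
indicators = {
    "annual_cooling_degree_days_c_mean": [500.0, 1500.0, 1000.0],
    "pct_weeks_in_d2_or_worse": [10.0, 30.0, None],
    "pct_days_with_heavy_smoke": [4.0, 0.0, 2.0],
}
equal = compound_index(indicators)  # roughly [0.33, 0.67, 0.5]
smoke_only = compound_index(indicators, {"pct_days_with_heavy_smoke": 1.0})
print(smoke_only)  # [1.0, 0.0, 0.5]
```

Keeping the weights in a plain dictionary makes each sensitivity variant a one-line change that can be reported next to the equal-weight result.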
---
## Methodological Extensions
### 6. Time-Series Analysis of Cluster Growth
@@ -538,9 +695,10 @@ If you're interested in collaborating on any of these research directions, pleas
**Priorities for external collaboration**:
1. Power capacity data acquisition
2. Climate, drought, smoke, and compound-exposure analysis
3. Opposition cases database compilation
4. Water stress/drought overlay
5. International comparative analysis
---