Cross-tabulates the normalized data-center operator (owner) against the
leading ACS 2024 workforce industry of each enrichment geography (ZCTA and
census tract). Emits raw-count and row-percentage CSVs for both geographies.
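For reference, the raw-count and row-percentage tables relate as in the rough sketch below. It uses only the standard library and made-up sample pairs; the function name `crosstab` and the operator and industry labels are hypothetical, not taken from the script in this commit.

```python
from collections import Counter


def crosstab(pairs):
    """Build raw counts and row percentages from (row_label, col_label) pairs.

    Returns two nested dicts keyed by row label, then column label.
    """
    cell = Counter(pairs)
    rows = sorted({r for r, _ in cell})
    cols = sorted({c for _, c in cell})
    # Counter returns 0 for combinations that never occur.
    counts = {r: {c: cell[(r, c)] for c in cols} for r in rows}
    row_pct = {}
    for r in rows:
        total = sum(counts[r].values())  # never 0: every row has at least one pair
        row_pct[r] = {c: 100.0 * counts[r][c] / total for c in cols}
    return counts, row_pct


# Hypothetical sample: (operator, leading workforce industry of the geography).
pairs = [
    ("Acme", "Retail"),
    ("Acme", "Retail"),
    ("Acme", "Manufacturing"),
    ("Borealis", "Retail"),
]
counts, row_pct = crosstab(pairs)
```

Each row of `row_pct` sums to 100, so operators with very different site counts can be compared on the same scale.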
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move all Python scripts to scripts/, documentation to docs/, raw input
data to data/, and generated HTML/CSV outputs to output/. Update path
references in 8 scripts to use Path(__file__).parent.parent as project
root so they work correctly from the new location. Update README links
and quick-start commands accordingly. Notebooks remain at root.
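A minimal sketch of the root-relative pattern, for illustration only. The helper name `project_root`, the example path, and the `DATA_DIR`/`OUTPUT_DIR` names are hypothetical; the `.resolve()` call is an addition here (not stated in this commit) so that relative invocations still yield an absolute root.

```python
from pathlib import Path


def project_root(script_path):
    """Return the repository root for a script stored in <root>/scripts/."""
    # parent -> <root>/scripts, parent.parent -> <root>
    return Path(script_path).resolve().parent.parent


# Inside a real script this would be: ROOT = project_root(__file__)
ROOT = project_root("/repo/scripts/example.py")  # hypothetical location
DATA_DIR = ROOT / "data"
OUTPUT_DIR = ROOT / "output"
```

Anchoring paths to the script file rather than the current working directory means the scripts behave the same whether launched from the root or from scripts/.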
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extends the demographic/RUCA/energy summary with two new sections:
- §7 quantifies each top-DC state's "share of state capacity within
50 km of a DC," surfacing NJ (83%), NV (75%), TN (70%), and OR (68%)
as the most DC-saturated grids. This reframes the canonical VA-centric
story around structural entanglement rather than raw count.
- §9 inventories every table in the data_centers schema with a
one-line description, flagging cleanup candidates and unused layers
for downstream work.
Also renumbers watershed analysis to §8, adds the SEDS row to the
dataset coverage table, and narrows next-step #4 to the IM3 projection
overlay (now that the SEDS join is complete).
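The §7 metric can be illustrated with the sketch below. It assumes great-circle (haversine) distance and plain tuples of coordinates; the actual analysis may compute distance differently (for example in the database), and the function names and sample coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def share_near_dc(generators, data_centers, radius_km=50.0):
    """Fraction of total MW located within radius_km of any data center.

    generators: list of (lat, lon, mw); data_centers: list of (lat, lon).
    """
    total = near = 0.0
    for lat, lon, mw in generators:
        total += mw
        if any(haversine_km(lat, lon, dlat, dlon) <= radius_km
               for dlat, dlon in data_centers):
            near += mw
    return near / total if total else 0.0


# Made-up state: two plants about 0 km and 11 km from the DC, one about 222 km away.
generators = [(40.0, -75.0, 300.0), (40.1, -75.0, 100.0), (42.0, -75.0, 100.0)]
data_centers = [(40.0, -75.0)]
share = share_near_dc(generators, data_centers)  # 400 MW of 500 MW
```

Because each generator is counted once no matter how many data centers are nearby, the share cannot exceed 100%.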
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds three coordinated changes:
- Request nameplate, summer, and winter capacity from the EIA
operating-generator-capacity endpoint and project them as typed columns
on energy_eia_operating_generator_capacity_flat. The original ingest
pulled only latitude and longitude, leaving the flat table with no MW
values despite its name.
- New cluster_analysis.ipynb joins master_data_centers to ACS-2024
demographics, USDA RUCA-2020 codes (loaded from new/), and EIA
generation capacity within 50 km of each site.
- Summary doc consolidates the headline findings: DC tracts skew
higher-income, more educated, and more racially diverse than the US
average; the metro over-index is only 1.11x; the non-metro tail is
dominated by hyperscalers in the Columbia River corridor (OR+WA = 66% of
non-metro DCs); and Microsoft co-locates with Palo Verde Nuclear in
Goodyear, AZ.
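A rough sketch of projecting the capacity fields as typed columns, as described in the first item. The source field names and flat-table column names below are assumptions for illustration and should be checked against the actual EIA response and table schema; `to_float` and `flatten` are hypothetical helpers.

```python
def to_float(value):
    """Convert an API value to float, mapping blanks and bad values to None."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


# Assumed API field name -> assumed flat-table column name.
CAPACITY_FIELDS = {
    "nameplate-capacity-mw": "nameplate_capacity_mw",
    "net-summer-capacity-mw": "net_summer_capacity_mw",
    "net-winter-capacity-mw": "net_winter_capacity_mw",
}


def flatten(record):
    """Project one raw generator record into a row of typed columns."""
    row = {
        "latitude": to_float(record.get("latitude")),
        "longitude": to_float(record.get("longitude")),
    }
    for src, dst in CAPACITY_FIELDS.items():
        row[dst] = to_float(record.get(src))
    return row


# Made-up record: one capacity value blank, one missing entirely.
raw = {
    "latitude": "33.39",
    "longitude": "-112.86",
    "nameplate-capacity-mw": "1403.2",
    "net-summer-capacity-mw": "",
}
row = flatten(raw)
```

Mapping blank or missing values to None rather than 0 keeps later capacity sums from silently treating unknown capacity as zero.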
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>