Commit Graph

39 Commits

Author SHA1 Message Date
00653f1627 enhanced 2026-06-29 06:59:20 -07:00
2c50c969bf Add operator x dominant workforce industry crosstab script and CSVs
Cross-tabs normalized data-center operator (owner) against the leading
ACS 2024 workforce industry of each enrichment geography (ZCTA and census
tract). Emits raw-count and row-percentage CSVs for both geographies.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 20:57:58 -07:00
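
The raw-count and row-percentage outputs described in commit 2c50c969bf can be sketched with the standard library alone. This is a minimal illustration, not the script's actual code: the function names and the sample operator/industry pairs are hypothetical, and the real script may well use pandas or SQL instead.

```python
from collections import defaultdict

def crosstab(pairs):
    """Count (row_key, col_key) pairs into a nested dict of raw counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for row_key, col_key in pairs:
        counts[row_key][col_key] += 1
    return {row: dict(cols) for row, cols in counts.items()}

def row_percentages(counts):
    """Convert each row of raw counts into percentages that sum to 100."""
    result = {}
    for row_key, row in counts.items():
        total = sum(row.values())
        result[row_key] = {col: 100.0 * n / total for col, n in row.items()}
    return result

# Hypothetical (operator, dominant workforce industry) pairs, one per facility.
facilities = [
    ("Operator A", "Professional services"),
    ("Operator A", "Professional services"),
    ("Operator A", "Manufacturing"),
    ("Operator B", "Manufacturing"),
]

raw = crosstab(facilities)
pct = row_percentages(raw)
```

Row percentages make operators of very different sizes comparable, which is presumably why the commit emits them alongside the raw counts.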
6cb7c0334c Add ZCTA-level enrichment table build script
Mirrors create_data_center_census_tract_table.py but at ZIP Code
Tabulation Area geography (2020 boundary vintage, since ZCTAs are only
redrawn each decennial census). Builds data_center_zcta_2024 (607 ZCTAs
hosting >=1 facility, joined to ACS 2024 5-year demographics) and adds
master_data_centers.zcta_geoid, parallel to the existing tract geoid
column. Used to verify the income/education premium for DC host
communities holds at ZIP-code resolution, not just census-tract
resolution, for the dc-siting-politics paper.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-20 20:59:19 -07:00
f29755faba Update README with corrected numbers from master dataset
- Per-capita burden: 115× → 35× (master dataset adds 104 larger suburban tracts)
- Host pop share: 0.86% → 2.9% of host-state residents
- Non-metro: 11% → ~10% (RUCA 2020)
- Add: 59.3% Biden 2020 in host communities; income gradient by urbanicity
- Add: top host tract (Loudoun CT 6110.20, 69 DCs, MHI $141K)
- Correct hyperscaler shares to exact figures from live DB

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-13 17:44:22 -07:00
176f3d1eb6 Document database table previews 2026-06-09 15:04:47 -07:00
6db5e0fff8 Fix path references in scripts after reorganization
Update 8 scripts to use Path(__file__).parent.parent as PROJECT_ROOT
so they resolve data/, output/, and internet_cables/ relative to the
project root rather than the caller's working directory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 21:57:47 -07:00
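
The path fix in commit 6db5e0fff8 relies on `Path(__file__).parent.parent` pointing at the project root when the script lives in scripts/. A rough sketch of that pattern follows; the helper `project_root`, the `/repo` location, and the script name are illustrative assumptions, and only the directory names data/, output/, and internet_cables/ come from the commit message.

```python
from pathlib import Path

def project_root(script_file):
    """Return the directory two levels above a script file.

    For a script at <root>/scripts/foo.py this is <root>.
    """
    return Path(script_file).parent.parent

# Inside a real script this would be: PROJECT_ROOT = project_root(__file__)
# A hypothetical absolute path is used here so the example runs anywhere.
PROJECT_ROOT = project_root("/repo/scripts/example_script.py")

DATA_DIR = PROJECT_ROOT / "data"
OUTPUT_DIR = PROJECT_ROOT / "output"
CABLES_DIR = PROJECT_ROOT / "internet_cables"
```

Because every path is derived from the script's own location, the directories resolve the same way regardless of the caller's working directory.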
ee5856661a Reorganize project into scripts/, docs/, data/, output/ directories
Move all Python scripts to scripts/, documentation to docs/, raw input
data to data/, and generated HTML/CSV outputs to output/. Update path
references in 8 scripts to use Path(__file__).parent.parent as project
root so they work correctly from the new location. Update README links
and quick-start commands accordingly. Notebooks remain at root.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 21:57:47 -07:00
a2e295d95b Update database-tables.md
update database port
2026-05-28 04:44:43 +00:00
4525ea3f97 Add LegiScan legislation ingestion and analysis queries
Adds ingest_legiscan.py to pull all US state + federal bills (2016-2026)
from the LegiScan API into legiscan_sessions and legiscan_bills tables.
Bills are keyword-tagged across 8 research categories (data_center,
ratepayer_protection, large_load, grid_impact, tax_incentive, etc.).
Loads ~1.3M bills; ~60K tagged relevant. Adds query_legiscan_bills.sql
with pre-built analysis queries including state/DC joins. Updates
database-tables.md, README.md, and research-ideas.md accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 21:30:31 -07:00
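
The keyword tagging mentioned in commit 4525ea3f97 might look roughly like the sketch below. The category names are taken from the commit message, but the keyword lists, the function `tag_bill`, and the substring-matching rule are assumptions; the log does not show how ingest_legiscan.py actually matches bills.

```python
# Hypothetical keyword lists for three of the eight categories.
# The real lists in ingest_legiscan.py are not shown in the commit log.
CATEGORY_KEYWORDS = {
    "data_center": ["data center", "data centre"],
    "large_load": ["large load"],
    "tax_incentive": ["sales tax exemption", "tax incentive"],
}

def tag_bill(title, description=""):
    """Return the sorted list of categories whose keywords appear in the bill text."""
    text = f"{title} {description}".lower()
    return sorted(
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )

tags = tag_bill("Data Center Sales Tax Exemption Act")
untagged = tag_bill("Highway Funding Act")
```

A bill can carry several tags at once, and a bill with an empty tag list would fall outside the tagged subset.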
46c8c58545 Enhance documentation with detailed findings from analysis report
- Add clustered vs isolated facility comparison to README
- Expand infrastructure insights with hyperscaler energy strategies
- Document additional database tables (opposition cases, IM3 projections, utility rates)
- Enhance research ideas with specific watershed names and grid saturation data
- Add data quality notes about EIA longitude corrections
- Reference loaded but unused tables for future analysis
2026-05-27 11:36:50 -07:00
3758dcc02a Add documentation links to README 2026-05-27 11:33:05 -07:00
423e11083d Add comprehensive documentation: README, database tables, and research ideas 2026-05-27 11:28:14 -07:00
98f6e6e237 Add EIA and utility rate map layers 2026-05-22 21:32:15 -07:00
c81dba025b extended broadband, fema table, updated map 2026-05-22 20:14:11 -07:00
6afa97e0ba updated map 2026-05-22 15:12:34 -07:00
03239ad007 Standardize notebook table-relationship documentation cells 2026-05-22 14:21:51 -07:00
c95f22fcdb expanded voter data 2026-05-22 14:18:01 -07:00
dc8755cde0 Add FCC broadband build workflow and refresh enhanced cluster map 2026-05-22 12:51:36 -07:00
4f3dbfc7f9 smoke and drought tables 2026-05-22 10:49:22 -07:00
e48e1aef93 climate and voter 2026-05-22 06:33:45 -07:00
3c4726a6b4 climate data from US gov. voter roll data 2026-05-22 06:33:35 -07:00
8a1a0b9aff Open-Meteo data 2026-05-19 22:28:23 -07:00
a7121d601b Add state grid context and database inventory to DC summary
Extends the demographic/RUCA/energy summary with two new sections:
- §7 quantifies each top-DC state's "share of state capacity within
  50 km of a DC," surfacing NJ (83%), NV (75%), TN (70%), and OR (68%)
  as the most DC-saturated grids — reframing the canonical VA-centric
  story by structural entanglement rather than raw count.
- §9 inventories every table in the data_centers schema with a
  one-line description, flagging cleanup candidates and unused layers
  for downstream work.

Also renumbers watershed analysis to §8, adds the SEDS row to the
dataset coverage table, and narrows next-step #4 to the IM3 projection
overlay (now that the SEDS join is complete).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 09:04:56 -07:00
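
The "share of state capacity within 50 km of a DC" metric from commit a7121d601b can be approximated as below. This is a sketch under stated assumptions: it uses a haversine great-circle distance on a spherical Earth, whereas the actual analysis may use a spatial database or projected coordinates, and the function names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def share_near_dc(generators, data_centers, radius_km=50.0):
    """Fraction of total MW from generators within radius_km of any data center.

    generators: list of (lat, lon, capacity_mw); data_centers: list of (lat, lon).
    """
    total_mw = sum(mw for _, _, mw in generators)
    if total_mw == 0:
        return 0.0
    near_mw = sum(
        mw
        for lat, lon, mw in generators
        if any(haversine_km(lat, lon, dc_lat, dc_lon) <= radius_km
               for dc_lat, dc_lon in data_centers)
    )
    return near_mw / total_mw

# Hypothetical state with two generators and one data center.
generators = [(39.00, -77.50, 300.0), (37.00, -80.00, 100.0)]
data_centers = [(39.03, -77.48)]
share = share_near_dc(generators, data_centers)
```

Each generator is counted once even if several data centers fall within the radius, so the share cannot exceed 1.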
dda6490023 updated map and cluster analysis 2026-05-18 08:50:22 -07:00
aef9d99e18 Add enhanced data center cluster map 2026-05-18 08:37:21 -07:00
a005b2eab0 removed huc8.zip 2026-05-18 08:15:42 -07:00
eccfbdbad9 Add data center demographic, RUCA, and energy capacity analysis
Adds three coordinated changes:

- Request nameplate, summer, and winter capacity from the EIA
  operating-generator-capacity endpoint and project them as typed columns
  on energy_eia_operating_generator_capacity_flat. The original ingest
  only pulled latitude and longitude, leaving the flat table with no MW
  values despite its name.
- New cluster_analysis.ipynb joins master_data_centers to ACS-2024
  demographics, USDA RUCA-2020 codes (loaded from new/), and EIA
  generation capacity within 50 km of each site.
- Summary doc consolidates the headline findings: DC tracts skew higher
  income / more educated / more racially diverse than US average, the
  metro over-index is only 1.11x, the non-metro tail is dominated by
  hyperscalers in the Columbia River corridor (OR+WA = 66% of non-metro
  DCs), and Microsoft co-locates with Palo Verde Nuclear in Goodyear AZ.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 08:14:57 -07:00
e7f23a87b2 created folium map html page 2026-05-17 18:57:15 -07:00
7e46dd8578 Add spatial clustering analysis outputs 2026-05-17 18:53:38 -07:00
90e8b21423 Add master data center merge workflow 2026-05-17 18:53:16 -07:00
8fcbb18e37 Add utility rate tracker loader 2026-05-17 18:52:55 -07:00
48f23af5b0 Add EIA SEDS ingestion support 2026-05-17 18:52:29 -07:00
614b10b43f created table loader. IM3 tables. opposition tables 2026-05-17 16:08:46 -07:00
eecfa49779 cables and maps 2026-05-17 15:32:51 -07:00
3f7875084d claude checklist 2026-05-16 17:06:13 -07:00
75d17f8e95 got the ingest for energy eia data. created txt files of their descriptions 2026-05-16 17:05:59 -07:00
b442998eb5 update ingest_eia 2026-05-15 20:53:42 -07:00
4e6a564b6c gitignore, energy_layer fix 2026-05-15 20:49:51 -07:00
f57969c9ee first commit 2026-05-15 20:48:41 -07:00