
Data Centers, Submarine Cables, and the Concentrated-Costs / Dispersed-Benefits Frame

Author: David Adams · Date: 2026-05-17

Data (PostGIS data_centers DB): us_dc_sample_geocoded (1,489 DCs), data_center_census_tracts_2024 (611 tracts, ACS 2024 5-yr enriched), internet_cables (693 cables), internet_city_dominance (4,552 cities), census_tract_acs_2024_selected_states.csv (83,811 tracts, 46 states).


1. Are US data centers spatially tied to submarine cables?

Distance from each point to the nearest submarine cable line (km):

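| Group | n | Mean | p10 | p25 | p50 | p75 | p90 | ≤10 km | ≤50 km | ≤100 km | ≤250 km |
|---|---|---|---|---|---|---|---|---|---|---|---|
| US data centers | 1,489 | 358.7 | 21.6 | 163.1 | 276.1 | 477.4 | 867.4 | 5.2% | 16.8% | 21.4% | 32.2% |
| US population cities | 1,291 | 339.7 | 18.7 | 61.2 | 256.1 | 528.0 | 811.0 | 6.8% | 22.5% | 31.8% | 49.5% |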

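Mann-Whitney U, two-sided: z = 2.66, p ≈ 0.008. The difference is significant, but it runs opposite to the cable-siting hypothesis: DCs are not systematically closer to cables than ordinary US cities and, if anything, sit farther away (median 276.1 km vs. 256.1 km).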
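As a rough illustration of the test reported above, the following is a minimal standard-library sketch of the Mann-Whitney z statistic under the normal approximation (midranks for ties, tie-corrected variance, no continuity correction). The function names and the sign convention (positive z means the first sample tends to be larger) are assumptions of this sketch; the report's own script may use a library routine with different options.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard-normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def mann_whitney_z(x, y):
    """Return (z, p) for samples x and y using the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, tagging each value with its group (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this tie group
        midrank = (i + 1 + j) / 2      # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    return z, two_sided_p(z)

# Toy distances (km), invented for illustration only.
z_toy, p_toy = mann_whitney_z([300.0, 410.0, 650.0], [20.0, 60.0, 250.0])
print(f"toy z = {z_toy:.3f}, p = {p_toy:.3f}")

# Sanity check on the reported figures: z = 2.66 gives p of about 0.008.
print(f"p for z = 2.66: {two_sided_p(2.66):.4f}")
```

With the real data, `x` would hold the 1,489 DC distances and `y` the 1,291 city distances.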

Interpretation. At the national level the "cables drive DC siting" story fails. The largest clusters — Loudoun County VA (Ashburn), central Washington, Hillsboro OR, Columbus OH, Iowa — are inland, anchored to terrestrial fiber, cheap power, and tax incentives rather than submarine landings. Only 21.4% of DCs sit within 100 km of any cable.
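The threshold columns in the table above, including the 21.4% figure, are shares of points whose nearest-cable distance falls at or below a cutoff. A minimal sketch of that summary, assuming the distances are already available as a plain list of kilometres (the function name and the toy values are hypothetical, not taken from the report's scripts):

```python
def share_within(distances_km, thresholds_km=(10, 50, 100, 250)):
    """Return {threshold: fraction of distances at or below that threshold}."""
    n = len(distances_km)
    if n == 0:
        raise ValueError("need at least one distance")
    return {t: sum(d <= t for d in distances_km) / n for t in thresholds_km}

# Toy distances (km), invented for illustration only.
toy_distances = [4.0, 35.0, 80.0, 120.0, 300.0, 410.0, 650.0, 900.0]
shares = share_within(toy_distances)
for threshold, share in shares.items():
    print(f"<= {threshold} km: {share:.1%}")
```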


2. Cost concentration at the state level

| Measure | Value |
|---|---|
| States covered | 46 |
| Gini of DC counts across states | 0.648 |
| HHI of state shares | 0.080 |
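For readers unfamiliar with the two measures, here is a minimal sketch of how a Gini coefficient and a Herfindahl-Hirschman index (HHI) can be computed from a list of counts. The Gini uses the common sorted-rank formula without a small-sample correction; the report does not say which variant it uses, so this is an assumption and results may differ slightly.

```python
def hhi(counts):
    """Herfindahl-Hirschman index: sum of squared shares (1/n .. 1)."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)

def gini(counts):
    """Gini coefficient via the sorted-rank formula (0 = perfectly equal)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n, i = 1..n ascending
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

# Toy counts, invented for illustration only.
even = [5, 5, 5, 5]
skewed = [0, 0, 0, 10]
print(gini(even), hhi(even))      # equal counts
print(gini(skewed), hhi(skewed))  # everything in one unit
```

Under this formula the maximum Gini for n units is (n - 1) / n, so values from small and large n are not strictly comparable.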

Top states by share of US data centers:

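| State | DCs | Share | Cumulative |
|---|---|---|---|
| VA | 319 | 21.4% | 21.4% |
| CA | 129 | 8.7% | 30.1% |
| TX | 120 | 8.1% | 38.1% |
| OR | 102 | 6.9% | 45.0% |
| WA | 90 | 6.0% | 51.0% |
| OH | 69 | 4.6% | 55.7% |
| AZ | 60 | 4.0% | 59.7% |
| IA | 58 | 3.9% | 63.6% |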

Five states hold half of all US data centers.
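The share and cumulative columns can be re-derived from the counts in the table and the 1,489-DC total. The sketch below does that arithmetic; the variable and function names are illustrative only.

```python
TOTAL_DCS = 1489  # all DCs in the sample, from the report header

# (state, DC count) rows copied from the table above.
TOP_STATES = [
    ("VA", 319), ("CA", 129), ("TX", 120), ("OR", 102),
    ("WA", 90), ("OH", 69), ("AZ", 60), ("IA", 58),
]

def cumulative_shares(rows, total):
    """Return (state, share, cumulative share) tuples in the given row order."""
    running = 0
    out = []
    for state, count in rows:
        running += count
        out.append((state, count / total, running / total))
    return out

table = cumulative_shares(TOP_STATES, TOTAL_DCS)
for state, share, cum in table:
    print(f"{state}: {share:.1%} (cumulative {cum:.1%})")
```

The fifth row (WA) is where the cumulative share first passes 50%, which is the basis for the "five states" statement.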


3. Cost concentration at the tract level

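Much sharper than at the state level once population is taken into account. The Gini and HHI below are computed across the 611 DC-hosting tracts only, so they are not directly comparable with the state-level values in section 2: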

| Measure | Value |
|---|---|
| DC-hosting tracts | 611 |
| DCs in those tracts | 1,489 |
| Gini of DC counts across DC-hosting tracts | 0.499 |
| HHI of DC shares across DC-hosting tracts | 0.0069 |
| Top 1% of host tracts (6 tracts) | hold 14.6% of all DCs |
| Top 5% of host tracts (30 tracts) | hold 33.3% of all DCs |
| Top 20% of host tracts (122 tracts) | hold 60.6% of all DCs |
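The "top x% of host tracts" rows can be sketched as follows. Flooring the tract count is an assumption of this sketch; it happens to reproduce the 6, 30, and 122 tract counts quoted above for 611 host tracts, but the report's script may define the cutoff differently. The per-tract counts here are toy values.

```python
def top_share(counts, frac):
    """Return (k, share): the top floor(frac * n) units and their share of the total."""
    xs = sorted(counts, reverse=True)
    k = max(1, int(len(xs) * frac))  # floor, but always at least one unit
    return k, sum(xs[:k]) / sum(xs)

# Toy per-tract DC counts, invented for illustration only.
toy_counts = [10] + [1] * 9
k, share = top_share(toy_counts, 0.10)
print(f"top {k} tract(s) hold {share:.1%} of DCs")

# Tract counts implied by flooring with 611 host tracts.
for frac in (0.01, 0.05, 0.20):
    print(frac, top_share([1] * 611, frac)[0])
```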

Population scaling:

| Metric | Value |
|---|---|
| Population living in a DC-hosting tract | 2,868,863 |
| Total population (DC-state ACS universe) | 332,343,349 |
| % of DC-host-state residents in a DC-hosting tract | 0.86% |
| DCs per resident, DC-hosting tracts | 1 per 1,927 |
| DCs per resident, DC-state average | 1 per 223,199 |
| Per-capita DC burden, host vs. average | ~115× |
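The population-scaling rows follow from three inputs in the table: the DC count and the two population totals. A short sketch of the arithmetic (variable names are illustrative):

```python
dcs = 1489                    # DCs in the sample
host_pop = 2_868_863          # residents of DC-hosting tracts
universe_pop = 332_343_349    # residents of the DC-state ACS universe

residents_per_dc_host = host_pop / dcs
residents_per_dc_all = universe_pop / dcs

# The DC count cancels, so the burden ratio equals universe_pop / host_pop.
burden_ratio = residents_per_dc_all / residents_per_dc_host
host_share = host_pop / universe_pop

print(f"1 DC per {residents_per_dc_host:,.0f} residents in host tracts")
print(f"1 DC per {residents_per_dc_all:,.0f} residents on average")
print(f"burden ratio: {burden_ratio:.1f}x, host share: {host_share:.2%}")
```

Because the ratio is simply the inverse of the host-tract population share, the 0.86% and ~115× figures are two views of the same quantity.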

4. Who bears the costs? (ACS profile of DC tracts vs. peer tracts in same states)

| Field | DC tracts (median) | Non-DC peers (median) | Δ (DC − peer) |
|---|---|---|---|
| Median household income ($) | 91,082 | 76,637 | +14,446 |
| Per-capita income ($) | 48,111 | 38,546 | +9,565 |
| Broadband subscription (%) | 94.2 | 92.0 | +2.2 |
| Poverty rate (%) | 8.8 | 10.8 | −2.0 |
| Non-Hispanic White (%) | 52.4 | 64.7 | −12.3 |
| Non-Hispanic Black (%) | 6.7 | 3.9 | +2.8 |
| Hispanic/Latino (%) | 11.9 | 9.8 | +2.1 |
| Non-Hispanic Asian (%) | 5.2 | 1.5 | +3.7 |

Population-weighted means in DC tracts: median household income $109,145, broadband 93.2%, poverty 11.1%. The actual residents of host communities are concentrated in affluent tech corridors (Loudoun, Silicon Valley, Seattle eastside, Hillsboro OR).
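The table reports tract medians while the paragraph above reports population-weighted means, and the two can diverge when large tracts differ from typical ones (for example, poverty of 8.8% at the median vs. 11.1% weighted). A minimal sketch of the difference, using toy tracts invented for illustration:

```python
from statistics import median

def weighted_mean(values, weights):
    """Mean of values weighted by the matching weights (e.g. tract population)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Toy tracts: (poverty rate in %, population). Not real data.
toy_tracts = [(5.0, 1000), (8.0, 2000), (30.0, 7000)]
rates = [rate for rate, _ in toy_tracts]
pops = [pop for _, pop in toy_tracts]

tract_median = median(rates)            # each tract counts once
pop_weighted = weighted_mean(rates, pops)  # each resident counts once
print(tract_median, pop_weighted)
```

The median treats every tract equally, whereas the weighted mean describes the experience of the average resident.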

Primary-industry mix of host tracts (count of tracts):

| Tracts | Primary industry |
|---|---|
| 351 | Educational services, and health care and social assistance |
| 133 | Professional, scientific, management, administrative, and waste management services |
| 35 | Manufacturing |
| 26 | Arts, entertainment, recreation, accommodation, and food services |
| 22 | Retail trade |
| 14 | Agriculture, forestry, fishing and hunting, and mining |
| 10 | Finance and insurance, and real estate and rental and leasing |
| 9 | Construction |
| 4 | Transportation and warehousing, and utilities |
| 3 | Public administration |

5. Cable-adjacent vs. inland DC tracts

| | ≤100 km from a cable | >100 km from a cable |
|---|---|---|
| Tracts | 159 | 452 |
| Data centers | 319 | 1,170 |
| Median household income ($) | 106,406 | 86,289 |
| Median broadband (%) | 95.2 | 93.9 |
| Median DC count | 1 | 1 |

Inland DCs are roughly 3.7× the cable-adjacent count. Coastal/cable tracts skew even wealthier than inland DC tracts.


6. Benefit dispersion (broadband subscribers as a benefit proxy)

| Measure | Value |
|---|---|
| Estimated broadband subscribers (DC states) | 119,719,313 |
| Tracts with subscriber data | 81,839 |
| Gini of subscribers across tracts | 0.253 |
| HHI of subscribers across tracts | 0.00001 |

Side-by-side concentration:

| Series | HHI |
|---|---|
| DCs across DC-hosting tracts | 0.0069 |
| Broadband subscribers across DC-state tracts | 0.00001 |
| Concentration ratio | ~464× more concentrated for DCs |
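When reading the HHI comparison, it helps to remember that an HHI over n units cannot fall below 1/n, so the two series have different floors (611 vs. 81,839 units). The sketch below computes those floors, the subscriber HHI implied by the quoted ratio, and a normalized HHI for the DC series. The normalization is a standard rescaling added here for illustration, not something the report computes, and the note that the ~464× ratio presumably comes from unrounded values is an inference from the arithmetic.

```python
def hhi_floor(n):
    """Smallest possible HHI for n units (all shares equal)."""
    return 1 / n

def normalized_hhi(h, n):
    """Rescale HHI to 0..1 so that series with different n are comparable."""
    return (h - 1 / n) / (1 - 1 / n)

dc_hhi, dc_units = 0.0069, 611       # from the table above
sub_units = 81_839                   # tracts with subscriber data
reported_ratio = 464                 # from the table above

dc_floor = hhi_floor(dc_units)
sub_floor = hhi_floor(sub_units)

# The displayed subscriber HHI (0.00001) sits below its floor, so it must be
# rounded; the value implied by the quoted ratio is slightly above the floor.
implied_sub_hhi = dc_hhi / reported_ratio

print(f"DC floor: {dc_floor:.6f}, subscriber floor: {sub_floor:.7f}")
print(f"implied subscriber HHI: {implied_sub_hhi:.7f}")
print(f"normalized DC HHI: {normalized_hhi(dc_hhi, dc_units):.5f}")
```

The implied subscriber HHI is only about 20% above its floor, which is consistent with the report's reading that subscribers are spread almost evenly across tracts.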

7. Verdict

| Element of the frame | Holds? |
|---|---|
| Costs concentrated geographically | Yes. The top 6 tracts carry 15% of DCs; <1% of host-state population lives in a DC tract; per-capita burden is ~115× the average. |
| Driven by submarine cable infrastructure | No, broadly. The proximity test fails nationally; submarine cables matter for a coastal subset only. Terrestrial fiber, power, water, land, and tax incentives dominate. |
| Benefits dispersed among users | Yes. Broadband subscribers are ~464× more dispersed (by HHI) than DCs. |
| Classic political failure mode (weak losers vs. diffuse winners) | No. Host tracts skew higher-income, lower-poverty, and higher-broadband than peers. The cost-bearing communities are affluent tech corridors with strong bargaining capacity; they tend to convert concentrated costs into concentrated rents (tax base, jobs, infrastructure concessions). |

Bottom line. The structural asymmetry that defines "concentrated costs / dispersed benefits" is unambiguous in the data — DC siting is hyper-local while benefits are continental. But the predicted political dynamic doesn't fit cleanly, because the loser side here is not weak. A more targeted test would split host tracts into power-stressed exurban tracts (parts of Loudoun's edges, central Oregon, Iowa) and urban-suburban tech-corridor tracts, and look at whether the exurban subset shows the weak-loser pattern (lower income, slower broadband, higher poverty than its neighbors).


Caveats

  • The ACS universe is the 46 DC-host states (already DC-heavy); it excludes states with no DCs in the sample.
  • data_center_census_tracts_2024 only contains tracts that host at least one DC, by construction.
  • Broadband-subscription rate is a coarse benefit proxy; cloud services benefit any internet user globally, not just local subscribers.
  • 45 of 1,489 DCs use city-precision fallback coordinates, so a small share of tract assignments are approximate.
  • The logical_dominance_ips field in internet_city_dominance measures IP blocks routed/hosted at each city — a supply-side measure that duplicates the DC signal, not a demand-side user-location measure. It was excluded from the benefit-dispersion calculation for that reason.

Reproducible scripts

  • load_postgis_internet_cables.py — ingest cables/landings/cities
  • make_internet_cables_map.py — render the combined Leaflet map
  • analyze_cables_concentration.py — state-level + cable-proximity analysis
  • analyze_dc_tract_concentration.py — tract-level analysis used here