# Data Centers, Submarine Cables, and the Concentrated-Costs / Dispersed-Benefits Frame

**Author:** David Adams · **Date:** 2026-05-17
**Data:** PostGIS `data_centers` DB, using `us_dc_sample_geocoded` (1,489 DCs),
`data_center_census_tracts_2024` (611 tracts, ACS 2024 5-yr enriched),
`internet_cables` (693 cables), `internet_city_dominance` (4,552 cities), and
`census_tract_acs_2024_selected_states.csv` (83,811 tracts, 46 states).

---

## 1. Are US data centers spatially tied to submarine cables?

Distance from each point to the nearest submarine cable line (km):

| Group | n | Mean | p10 | p25 | **p50** | p75 | p90 | ≤10 km | ≤50 km | ≤100 km | ≤250 km |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| US data centers | 1,489 | 358.7 | 21.6 | 163.1 | **276.1** | 477.4 | 867.4 | 5.2% | 16.8% | 21.4% | 32.2% |
| US population cities | 1,291 | 339.7 | 18.7 | 61.2 | **256.1** | 528.0 | 811.0 | 6.8% | 22.5% | 31.8% | 49.5% |

Mann-Whitney U, two-sided: **z = 2.66, p ≈ 0.008**. The difference is
significant, but in the *opposite* direction from the cable-siting hypothesis:
DCs sit somewhat farther from cables than ordinary US cities (median 276.1 km
vs. 256.1 km), so they are **not** systematically closer.

**Interpretation.** At the national level the "cables drive DC siting" story
fails. The largest clusters, namely Loudoun County VA (Ashburn), central
Washington, Hillsboro OR, Columbus OH, and Iowa, are inland, anchored to
terrestrial fiber, cheap power, and tax incentives rather than submarine
landings. Only 21.4% of DCs sit within 100 km of any cable.

---

## 2. Cost concentration at the state level

| Measure | Value |
|---|---:|
| States covered | 46 |
| Gini of DC counts across states | 0.648 |
| HHI of state shares | 0.080 |

Top states by share of US data centers:

| State | DCs | Share | Cumulative |
|---|---:|---:|---:|
| VA | 319 | 21.4% | 21.4% |
| CA | 129 | 8.7% | 30.1% |
| TX | 120 | 8.1% | 38.1% |
| OR | 102 | 6.9% | 45.0% |
| WA | 90 | 6.0% | 51.0% |
| OH | 69 | 4.6% | 55.7% |
| AZ | 60 | 4.0% | 59.7% |
| IA | 58 | 3.9% | 63.6% |

Five states (VA, CA, TX, OR, WA) hold just over **half** (51.0%) of all US
data centers.
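
For readers who want to check the concentration measures, here is a minimal sketch of the standard Gini and Herfindahl-Hirschman (HHI) formulas, applied to the eight states listed in the table. The function names are hypothetical, and only the eight listed counts are available here, so the HHI computed below is the partial contribution of those states (a lower bound on the 46-state figure), not a reproduction of it.

```python
def gini(counts):
    """Gini coefficient of non-negative counts (0 = equal, (n-1)/n = maximal)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    rank_weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * rank_weighted / (n * total) - (n + 1) / n

def hhi(counts):
    """Herfindahl-Hirschman index: sum of squared shares (1/n = equal, 1 = monopoly)."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)

# Counts taken from the table above; the remaining 38 states are not listed.
top_states = {"VA": 319, "CA": 129, "TX": 120, "OR": 102,
              "WA": 90, "OH": 69, "AZ": 60, "IA": 58}
total_dcs = 1489

# Cumulative share of the five largest states.
cum_top5 = sum(sorted(top_states.values(), reverse=True)[:5]) / total_dcs

# Contribution of the eight listed states to the national HHI.
partial_hhi = sum((c / total_dcs) ** 2 for c in top_states.values())
```

The eight listed states alone contribute roughly 0.0735 of the reported HHI of 0.080, so most of the index is driven by the largest few states, Virginia above all.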

---

## 3. Cost concentration at the tract level

Concentration is much sharper at the tract level. This shows up in the
top-share and population figures rather than in Gini or HHI, which are
computed across only the 611 host tracts and so are not directly comparable
with the state-level values:

| Measure | Value |
|---|---:|
| DC-hosting tracts | 611 |
| DCs in those tracts | 1,489 |
| Gini of DC counts across DC-hosting tracts | 0.499 |
| HHI of DC shares across DC-hosting tracts | 0.0069 |
| **Top 1% of host tracts (6 tracts) hold** | **14.6% of all DCs** |
| Top 5% of host tracts (30 tracts) hold | 33.3% of all DCs |
| Top 20% of host tracts (122 tracts) hold | 60.6% of all DCs |
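
The top-share rows can be sketched as follows. This is a minimal illustration with a hypothetical helper, `top_share`, and it assumes the tract count is rounded down (1% of 611 gives 6 tracts, matching the table); the toy counts are invented for the example.

```python
def top_share(counts, fraction):
    """Return (k, share): the k largest units and their share of the total.

    k is floor(len(counts) * fraction), an assumption consistent with
    the 6 / 30 / 122 tract counts in the table.
    """
    ranked = sorted(counts, reverse=True)
    k = int(len(ranked) * fraction)
    return k, sum(ranked[:k]) / sum(ranked)

# Toy per-tract DC counts, illustrative only.
toy_counts = [10, 5, 3, 1, 1]
k_toy, share_toy = top_share(toy_counts, 0.20)
```

In the toy data the single largest tract (the top 20% of five tracts) holds half of all DCs.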

Population scaling:

| Metric | Value |
|---|---:|
| Population living in a DC-hosting tract | 2,868,863 |
| Total population (DC-state ACS universe) | 332,343,349 |
| **% of DC-host-state residents in a DC-hosting tract** | **0.86%** |
| DCs per resident, DC-hosting tracts | 1 DC per 1,927 residents |
| DCs per resident, DC-state average | 1 DC per 223,199 residents |
| **Per-capita DC burden, host vs. average** | **~116×** |
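
The population-scaling rows follow from three inputs in the table (host-tract population, universe population, and the DC count). A short worked calculation, with variable names chosen here for illustration:

```python
host_pop = 2_868_863        # residents of DC-hosting tracts
universe_pop = 332_343_349  # residents of the DC-state ACS universe
dcs = 1_489                 # data centers in the sample

residents_per_dc_host = host_pop / dcs      # residents per DC inside host tracts
residents_per_dc_avg = universe_pop / dcs   # residents per DC across the universe

# Per-capita burden ratio: how many times denser DCs are, per resident,
# in host tracts than on average. Algebraically this is universe_pop / host_pop.
burden_ratio = residents_per_dc_avg / residents_per_dc_host

host_share_pct = 100 * host_pop / universe_pop
```

The ratio works out to about 115.8, which is simply the inverse of the 0.86% host-tract population share.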

---

## 4. Who bears the costs? (ACS profile of DC tracts vs. peer tracts in same states)

| Field | DC tracts (median) | Non-DC peers (median) | Δ (DC − peer) |
|---|---:|---:|---:|
| Median household income ($) | 91,082 | 76,637 | **+14,446** |
| Per-capita income ($) | 48,111 | 38,546 | +9,565 |
| Broadband subscription (%) | 94.2 | 92.0 | +2.2 |
| Poverty rate (%) | 8.8 | 10.8 | −2.0 |
| Non-Hispanic White (%) | 52.4 | 64.7 | −12.3 |
| Non-Hispanic Black (%) | 6.7 | 3.9 | +2.8 |
| Hispanic/Latino (%) | 11.9 | 9.8 | +2.1 |
| Non-Hispanic Asian (%) | 5.2 | 1.5 | +3.7 |

Population-weighted means in DC tracts: median household income (MHI)
**$109,145**, broadband **93.2%**, poverty 11.1%. Weighting by population
raises the income figure well above the tract median ($91,082), which
indicates that the actual residents of host communities are concentrated in
affluent tech corridors (Loudoun, Silicon Valley, Seattle eastside,
Hillsboro OR).
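
The gap between a tract median and a population-weighted mean can be seen in a small sketch. The helper name `weighted_mean` and the toy tract values are hypothetical and chosen only to show the mechanism: when the most populous tracts are also the richest, the weighted mean sits above the unweighted median.

```python
from statistics import median

def weighted_mean(values, weights):
    """Mean of `values` with each entry weighted by the matching weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Toy tracts (illustrative only): household income and resident population.
toy_income = [60_000, 90_000, 150_000]
toy_population = [1_000, 2_000, 7_000]

tract_median = median(toy_income)                           # treats every tract equally
resident_mean = weighted_mean(toy_income, toy_population)   # treats every resident equally
```

Here the tract median is 90,000 while the population-weighted mean is 129,000, the same direction of gap as between the $91,082 median and the $109,145 weighted mean.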

Primary-industry mix of host tracts (count of tracts):

| Tracts | Primary industry |
|---:|---|
| 351 | Educational services, and health care and social assistance |
| 133 | Professional, scientific, management, administrative, and waste management services |
| 35 | Manufacturing |
| 26 | Arts, entertainment, recreation, accommodation, and food services |
| 22 | Retail trade |
| 14 | Agriculture, forestry, fishing and hunting, and mining |
| 10 | Finance and insurance, and real estate and rental and leasing |
| 9 | Construction |
| 4 | Transportation and warehousing, and utilities |
| 3 | Public administration |

---

## 5. Cable-adjacent vs. inland DC tracts

| Measure | ≤100 km from a cable | >100 km from a cable |
|---|---:|---:|
| Tracts | 159 | 452 |
| Data centers | 319 | 1,170 |
| Median household income ($) | 106,406 | 86,289 |
| Median broadband (%) | 95.2 | 93.9 |
| Median DC count | 1 | 1 |

Inland DCs number roughly **3.7×** the cable-adjacent count (1,170 vs. 319).
Cable-adjacent tracts skew even wealthier than inland DC tracts (median
household income $106,406 vs. $86,289).

---

## 6. Benefit dispersion (broadband subscribers as a benefit proxy)

| Measure | Value |
|---|---:|
| Estimated broadband subscribers (DC states) | 119,719,313 |
| Tracts with subscriber data | 81,839 |
| Gini of subscribers across tracts | 0.253 |
| HHI of subscribers across tracts | 0.00001 |

Side-by-side concentration:

| Series | HHI |
|---|---:|
| DCs across DC-hosting tracts | 0.0069 |
| Broadband subscribers across DC-state tracts | 0.00001 |
| **Concentration ratio** | **~464× more concentrated for DCs** |

The subscriber HHI is displayed rounded to five decimals. A ratio of ~464×
corresponds to an unrounded value near 0.000015 (0.0069 / 464); the rounded
figures alone would give ~690×.
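
One caution when reading the HHI comparison: HHI has a floor of 1/N for N units, and the two series have very different N (611 host tracts vs. 81,839 subscriber tracts). The sketch below computes those floors and a commonly used normalized HHI. The normalization is offered here as a possible robustness check and is not part of the original analysis; `normalized_hhi` is a hypothetical helper name.

```python
hhi_dc = 0.0069          # DCs across DC-hosting tracts (from the table)
n_dc_tracts = 611
n_sub_tracts = 81_839
ratio_reported = 464

# Lowest possible HHI for each series (all units holding equal shares).
floor_dc = 1 / n_dc_tracts
floor_sub = 1 / n_sub_tracts

# Subscriber HHI implied by the reported ratio, since the table shows it rounded.
implied_hhi_sub = hhi_dc / ratio_reported

def normalized_hhi(h, n):
    """Rescale HHI to [0, 1] so that equal shares give 0 regardless of n."""
    return (h - 1 / n) / (1 - 1 / n)

norm_dc = normalized_hhi(hhi_dc, n_dc_tracts)
```

The subscriber series sits close to its floor (about 0.000012), so part of the raw ratio reflects the larger number of tracts rather than dispersion alone. A normalized comparison would need the unrounded subscriber HHI.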

---

## 7. Verdict

| Element of the frame | Holds? |
|---|---|
| Costs concentrated geographically | **Yes.** The top 6 tracts carry 14.6% of DCs; under 1% of host-state population lives in a DC tract; per-capita burden is ~116× the average. |
| Driven by submarine cable infrastructure | **No, broadly.** The proximity test fails nationally; submarine cables matter for a coastal subset only. Terrestrial fiber, power, water, land, and tax incentives dominate. |
| Benefits dispersed among users | **Yes.** DCs are ~464× more concentrated (by HHI) than broadband subscribers. |
| Classic political failure mode (weak losers vs. diffuse winners) | **No.** Host tracts have higher incomes, lower poverty, and higher broadband subscription than peers. The cost-bearing communities are affluent tech corridors with strong bargaining capacity; they tend to convert concentrated costs into concentrated *rents* (tax base, jobs, infrastructure concessions). |

**Bottom line.** The structural asymmetry that defines "concentrated costs /
dispersed benefits" is unambiguous in the data: DC siting is hyper-local
while benefits are continental. But the predicted political dynamic doesn't
fit cleanly, because the loser side here is not weak. A more targeted test
would split host tracts into power-stressed exurban tracts (parts of
Loudoun's edges, central Oregon, Iowa) and urban-suburban tech-corridor
tracts, and check whether the *exurban* subset shows the weak-loser
pattern (lower income, lower broadband subscription, higher poverty than its
neighbors).
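
The proposed follow-up test could be sketched along these lines. Everything here is hypothetical: the field names (`mhi`, `broadband_pct`, `poverty_pct`), the toy records, and the rule of comparing medians are assumptions for illustration, and how tracts would be classified as exurban or matched to neighbors is left open.

```python
from statistics import median

# Hypothetical tract records; values are invented for illustration only.
exurban_hosts = [
    {"mhi": 58_000, "broadband_pct": 84.0, "poverty_pct": 14.5},
    {"mhi": 62_500, "broadband_pct": 86.5, "poverty_pct": 12.0},
    {"mhi": 71_000, "broadband_pct": 88.0, "poverty_pct": 11.0},
]
neighbor_tracts = [
    {"mhi": 66_000, "broadband_pct": 89.0, "poverty_pct": 10.5},
    {"mhi": 74_000, "broadband_pct": 91.0, "poverty_pct": 9.0},
    {"mhi": 69_500, "broadband_pct": 90.0, "poverty_pct": 11.5},
]

def median_gap(hosts, peers, field):
    """Median of `field` among hosts minus the median among peers."""
    return median(t[field] for t in hosts) - median(t[field] for t in peers)

def shows_weak_loser_pattern(hosts, peers):
    """True when hosts have lower income, lower broadband, and higher poverty."""
    return (
        median_gap(hosts, peers, "mhi") < 0
        and median_gap(hosts, peers, "broadband_pct") < 0
        and median_gap(hosts, peers, "poverty_pct") > 0
    )

pattern_in_toy_data = shows_weak_loser_pattern(exurban_hosts, neighbor_tracts)
```

Requiring all three gaps to point the same way is a deliberately strict reading of the weak-loser pattern; a real test would also want significance checks on each gap.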

---

## Caveats

- The ACS universe is the 46 DC-host states (already DC-heavy); it excludes
  states with no DCs in the sample.
- `data_center_census_tracts_2024` only contains tracts that host at least
  one DC, by construction.
- Broadband-subscription rate is a coarse benefit proxy; cloud services
  benefit any internet user globally, not just local subscribers.
- 45 of 1,489 DCs use city-precision fallback coordinates, so a small share
  of tract assignments are approximate.
- The `logical_dominance_ips` field in `internet_city_dominance` measures
  IP blocks routed/hosted at each city, a supply-side measure that
  duplicates the DC signal, not a demand-side user-location measure. It
  was excluded from the benefit-dispersion calculation for that reason.

## Reproducible scripts

- `load_postgis_internet_cables.py`: ingest cables/landings/cities
- `make_internet_cables_map.py`: render the combined Leaflet map
- `analyze_cables_concentration.py`: state-level + cable-proximity analysis
- `analyze_dc_tract_concentration.py`: tract-level analysis used here