Enhance documentation with detailed findings from analysis report
- Add clustered vs isolated facility comparison to README
- Expand infrastructure insights with hyperscaler energy strategies
- Document additional database tables (opposition cases, IM3 projections, utility rates)
- Enhance research ideas with specific watershed names and grid saturation data
- Add data quality notes about EIA longitude corrections
- Reference loaded but unused tables for future analysis
25
README.md
@@ -40,12 +40,23 @@ Compared to the US average, data center host communities are:
 - **Better connected**: 94.9% broadband (vs. 89%)
 
 ### Infrastructure Insights
-- **89% of data centers are in metropolitan tracts** (vs. 80% of all US tracts)
+- **89% of data centers are in metropolitan tracts** (vs. 80% of all US tracts) - only 1.11× over-index
 - **Non-metro data centers (11%)** are dominated by hyperscalers:
   - AWS (67), Meta (22), Microsoft (10), Google (4) = 55% of non-metro facilities
   - 66% are in Oregon + Washington (Columbia River hydro corridor)
-- **Energy infrastructure**: 4 states have >2/3 of generation within 50 km of a data center:
+- **Grid saturation**: 4 states have >2/3 of generation within 50 km of a data center:
   - New Jersey: 83%, Nevada: 75%, Tennessee: 70%, Oregon: 68%
+- **Hyperscaler energy strategies** (non-metro sites):
+  - AWS: 114 GW wind + 66 GW hydro
+  - Microsoft: 13 GW nuclear (Palo Verde co-location)
+  - Meta: 16 GW solar
+
+### Clustered vs. Isolated Facilities
+Facilities in DBSCAN clusters differ significantly from isolated sites:
+- **$35K income gap**: Clustered sites in tracts with median income $108K vs. $73K for isolated
+- **+18 pp education**: 51% bachelor's+ vs. 33%
+- **More diverse**: 25 pp less non-Hispanic white
+- **2× energy infrastructure**: 89 vs. 40 generators within 50 km
 
 ### Submarine Cables
 - **Data centers are NOT systematically closer to cables** than ordinary US cities
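The headline figures in this hunk are plain ratios and differences, so they can be re-derived from the numbers quoted in the README text. The following is a minimal sketch that assumes only those quoted values; the helper name `over_index` is hypothetical and does not come from the repository.

```python
def over_index(group_share: float, baseline_share: float) -> float:
    """Ratio of a group's value to a baseline value (1.0 means no over-representation)."""
    return group_share / baseline_share


# Values quoted in the README hunk; nothing here is read from the database.
metro_ratio = over_index(89, 80)      # % of data centers vs. % of all US tracts that are metro
income_gap_k = 108 - 73               # clustered vs. isolated median tract income, in $K
education_gap_pp = 51 - 33            # bachelor's+ share, in percentage points
generator_ratio = over_index(89, 40)  # generators within 50 km, clustered vs. isolated

print(f"metro over-index: {metro_ratio:.2f}x")
print(f"income gap: ${income_gap_k}K")
print(f"education gap: +{education_gap_pp} pp")
print(f"generator ratio: {generator_ratio:.1f}x")
```

The generator ratio comes out slightly above 2, which the README rounds to "2×".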
@@ -194,10 +205,18 @@ python3 make_internet_cables_map.py
 ## Data Quality Notes
 
 1. **Incomplete power ratings**: Only 5.9% of data centers have power ratings (108/1,833)
-2. **Operator fragmentation**: String variations ("Meta" vs. "Meta, Inc.") inflate distinct-operator counts
+2. **Operator fragmentation**: String variations ("Meta" vs. "Meta, Inc.", AWS variants) inflate distinct-operator counts
 3. **45 facilities** use city-precision fallback coordinates (approximate tract assignment)
 4. **7 facilities** failed RUCA join (Puerto Rico / non-US)
 5. **Broadband subscribers** are a coarse benefit proxy (actual cloud users are global)
+6. **EIA longitude correction**: 2008-2010 generator coordinates had sign errors, corrected in flat-table build
+
+## Known Limitations
+
+- **Power capacity**: Only 5.9% populated - nearby EIA generator capacity used as proxy
+- **Operator strings**: Need deduplication (50 of 190 non-metro facilities have null operator)
+- **Benefit measurement**: Broadband subscribers are an imperfect proxy for cloud computing benefits
+- **Universe**: Limited to 46 DC-host states (excludes DC-free states from ACS comparison)
 
 ## Research Ideas & Future Work
 
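Data quality notes 2 and 6 both describe mechanical clean-ups. The sketch below shows one way each could be approached; it is not the repository's actual code. The alias table, both function names, and the assumption that every affected generator lies in the western hemisphere (so that a positive longitude signals a sign error) are illustrative assumptions only.

```python
import re

# Hypothetical alias table: the README names "Meta" vs. "Meta, Inc." and
# unspecified AWS variants; the "amazon web services" entry is a guess.
OPERATOR_ALIASES = {
    "meta": "Meta",
    "meta inc": "Meta",
    "aws": "AWS",
    "amazon web services": "AWS",
}


def normalize_operator(name):
    """Map operator string variants to one canonical label; None stays None."""
    if name is None:
        return None
    key = re.sub(r"[^a-z0-9 ]", "", name.lower())  # drop punctuation
    key = re.sub(r"\s+", " ", key).strip()         # collapse whitespace
    return OPERATOR_ALIASES.get(key, name.strip())


def fix_longitude(year, lon):
    """Flip positive longitudes for 2008-2010 records, leave all others as-is.

    Assumes affected generators are in the western hemisphere, which would
    not hold for territories with genuinely positive longitudes.
    """
    if 2008 <= year <= 2010 and lon > 0:
        return -lon
    return lon


print(normalize_operator("Meta, Inc."))
print(fix_longitude(2009, 77.5))
```

Unknown operator strings pass through unchanged rather than being dropped, so null operators such as the 50 non-metro cases noted under Known Limitations stay visible for later review.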