Reorganize project into scripts/, docs/, data/, output/ directories

Move all Python scripts to scripts/, documentation to docs/, raw input
data to data/, and generated HTML/CSV outputs to output/. Update path
references in 8 scripts to use Path(__file__).parent.parent as project
root so they work correctly from the new location. Update README links
and quick-start commands accordingly. Notebooks remain at root.
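
As an illustration of the path change described above, here is a minimal sketch of how a script inside scripts/ might derive the project root. The helper name `project_root`, the example path `/repo/...`, and the `DATA_DIR`/`OUTPUT_DIR` constants are hypothetical and not taken from the repository. The `.resolve()` call is an added assumption that guards against a relative `__file__`; the commit itself only mentions `Path(__file__).parent.parent`.

```python
from pathlib import Path


def project_root(script_file: str) -> Path:
    """Return the directory two levels above a script file.

    For a file at <root>/scripts/foo.py this is <root>.
    """
    # resolve() makes the path absolute before walking up (an assumption,
    # not something the commit message states).
    return Path(script_file).resolve().parent.parent


# Inside a real script this would be: PROJECT_ROOT = project_root(__file__)
# A hypothetical path is used here so the sketch runs on its own.
root = project_root("/repo/scripts/make_data_center_map.py")

# Hypothetical constants pointing at the new top-level directories.
DATA_DIR = root / "data"
OUTPUT_DIR = root / "output"
```

Anchoring paths to the script's location rather than the current working directory means the quick-start commands should behave the same whether they are run from the project root or from inside scripts/.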

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 21:57:22 -07:00
parent a2e295d95b
commit ee5856661a
40 changed files with 31 additions and 30 deletions

@@ -4,8 +4,9 @@ A comprehensive geospatial research project investigating the spatial concentrat
## Documentation
- **[Database Tables](database-tables.md)** - Complete database schema with table descriptions, column definitions, and SQL examples
- **[Research Ideas](research-ideas.md)** - Future research directions, data improvements, and potential collaborations
- **[Database Tables](docs/database-tables.md)** - Complete database schema with table descriptions, column definitions, and SQL examples
- **[Research Ideas](docs/research-ideas.md)** - Future research directions, data improvements, and potential collaborations
- **[SQL Queries](docs/query_legiscan_bills.sql)** - Pre-built legislative analysis queries
## Project Overview
@@ -93,25 +94,25 @@ Facilities in DBSCAN clusters differ significantly from isolated sites:
### Core Python Scripts
**Data Ingestion**
- `load_postgis_data_centers.py` - Load curated data center CSV into PostGIS
- `load_postgis_osm_data_centers.py` - Fetch OSM data centers via Overpass API
- `build_master_data_centers.py` - Deduplicate & merge curated + OSM sources
- `load_postgis_internet_cables.py` - Load submarine cables and landing points
- `ingest_eia_energy_layers.py` - Ingest EIA energy data via API
- `build_watershed_huc8_tables.py` - Load USGS HUC8 watersheds
- `ingest_legiscan.py` - Download all US state/federal bills 2016-2026 via LegiScan API, tag for data center research topics
**Data Ingestion** (`scripts/`)
- `scripts/load_postgis_data_centers.py` - Load curated data center CSV into PostGIS
- `scripts/load_postgis_osm_data_centers.py` - Fetch OSM data centers via Overpass API
- `scripts/build_master_data_centers.py` - Deduplicate & merge curated + OSM sources
- `scripts/load_postgis_internet_cables.py` - Load submarine cables and landing points
- `scripts/ingest_eia_energy_layers.py` - Ingest EIA energy data via API
- `scripts/build_watershed_huc8_tables.py` - Load USGS HUC8 watersheds
- `scripts/ingest_legiscan.py` - Download all US state/federal bills 2016-2026 via LegiScan API, tag for data center research topics
**Enrichment**
- `create_data_center_census_tract_table.py` - Join data centers to Census tracts with ACS demographics
- `build_fcc_bdc_broadband_connection_table.py` - Build per-facility broadband provider table
- `build_fcc_bdc_location_provider_aggregates.py` - Aggregate FCC BDC data by county/tract
- `scripts/create_data_center_census_tract_table.py` - Join data centers to Census tracts with ACS demographics
- `scripts/build_fcc_bdc_broadband_connection_table.py` - Build per-facility broadband provider table
- `scripts/build_fcc_bdc_location_provider_aggregates.py` - Aggregate FCC BDC data by county/tract
**Analysis**
- `analyze_dc_tract_concentration.py` - Tract-level cost concentration analysis (Gini, HHI, demographic deltas)
- `analyze_cables_concentration.py` - Test if data centers cluster near submarine cables
- `make_data_center_map.py` - Generate Leaflet map of data centers
- `make_internet_cables_map.py` - Generate Leaflet map of data centers + cables
- `scripts/analyze_dc_tract_concentration.py` - Tract-level cost concentration analysis (Gini, HHI, demographic deltas)
- `scripts/analyze_cables_concentration.py` - Test if data centers cluster near submarine cables
- `scripts/make_data_center_map.py` - Generate Leaflet map of data centers
- `scripts/make_internet_cables_map.py` - Generate Leaflet map of data centers + cables
### Key Jupyter Notebooks
- `spatial_clustering_master_data_centers.ipynb` - DBSCAN clustering of data centers
@@ -161,34 +162,34 @@ Credentials stored in `~/.zsh_secrets`, loaded via environment variables:
```bash
# 1. Load base data center data
python3 load_postgis_data_centers.py
python3 load_postgis_osm_data_centers.py
python3 build_master_data_centers.py
python3 scripts/load_postgis_data_centers.py
python3 scripts/load_postgis_osm_data_centers.py
python3 scripts/build_master_data_centers.py
# 2. Enrich with context layers
python3 create_data_center_census_tract_table.py --replace-final
python3 load_postgis_internet_cables.py
python3 ingest_eia_energy_layers.py --category power
python3 build_watershed_huc8_tables.py
python3 scripts/create_data_center_census_tract_table.py --replace-final
python3 scripts/load_postgis_internet_cables.py
python3 scripts/ingest_eia_energy_layers.py --category power
python3 scripts/build_watershed_huc8_tables.py
# 3. Run analyses
python3 analyze_dc_tract_concentration.py > output/tract_analysis.txt
python3 analyze_cables_concentration.py > output/cables_analysis.txt
python3 scripts/analyze_dc_tract_concentration.py > output/tract_analysis.txt
python3 scripts/analyze_cables_concentration.py > output/cables_analysis.txt
# 4. Execute notebooks
jupyter notebook cluster_analysis.ipynb
# 5. Load legislation (all states, 2016-2026)
python3 ingest_legiscan.py --all
python3 scripts/ingest_legiscan.py --all
# Weekly refresh (skips unchanged sessions):
python3 ingest_legiscan.py --fetch --load
python3 scripts/ingest_legiscan.py --fetch --load
```
### Generate Maps
```bash
python3 make_data_center_map.py
python3 make_internet_cables_map.py
python3 scripts/make_data_center_map.py
python3 scripts/make_internet_cables_map.py
```
## Key Outputs
