Document database table previews
@@ -13,12 +13,13 @@
## Table Organization

Tables are organized into five categories:
Tables are organized into six categories:
1. **Core Data Center Tables** - Master inventories and source data
2. **Enrichment Tables** - Data centers joined with contextual data
3. **Base Layer Tables** - Geographic and demographic reference layers
4. **Infrastructure Tables** - Energy and connectivity infrastructure
5. **Legislation Tables** - LegiScan state and federal bill data (2016-2026)
3. **Environmental and Election Source Tables** - Long-form climate, drought, fire/smoke, and precinct-election source layers
4. **Base Layer Tables** - Geographic and demographic reference layers
5. **Infrastructure Tables** - Energy and connectivity infrastructure
6. **Legislation Tables** - LegiScan state and federal bill data (2016-2026)

---
@@ -147,20 +148,224 @@ Tables are organized into five categories:
**Source**: FEMA National Risk Index (December 2025 release)

### `data_center_rdh_precinct_vote_matches`
**Rows**: Varies
**Purpose**: Per-facility precinct-level election results
### `data_center_historical_climate`
**Rows**: 1,833
**Purpose**: One-row-per-facility historical climate summary for data center locations

**Key Columns**:
- Data center identifiers
- `precinct_name`, `precinct_id`
- `master_id` (TEXT) - FK to `master_data_centers`
- `source`, `name`, `operator`, `city`, `state`, `country`
- `latitude`, `longitude`, `geom`
- `daymet_dataset_version`, `gridmet_dataset_version`
- `climate_period_start`, `climate_period_end` - Current period: 1991-01-01 to 2020-12-31
- **Temperature**: `mean_annual_temperature_c`, `mean_summer_temperature_c`, `max_daily_temperature_c`, `min_daily_temperature_c`
- **Humidity / wet bulb**: `mean_relative_humidity_pct`, `mean_wet_bulb_temperature_c`, `max_wet_bulb_temperature_c`, `extreme_wet_bulb_days`
- **Cooling / heat**: `cooling_degree_days_c`, `annual_cooling_degree_days_c_mean`, `extreme_heat_days`, `annual_extreme_heat_days_mean`
- **Precipitation**: `precipitation_total_mm`, `annual_precipitation_mm_mean`, `annual_precipitation_cv`, `wet_day_precipitation_p95_mm`
- **Wind**: `mean_wind_speed_ms`, `max_daily_mean_wind_speed_ms`, `sustained_wind_days`, `annual_sustained_wind_days_mean`

**Source**: Daymet + gridMET historical climate data

**Notes**: Built by `historical_climate_data_centers.ipynb` / `open_meteo_historical_data_centers.ipynb`

### `data_center_usdm_drought_exposure`
**Rows**: 1,833
**Purpose**: Per-facility drought exposure summary from weekly U.S. Drought Monitor polygons

**Key Columns**:
- `master_id` (TEXT) - FK to `master_data_centers`
- `source`, `name`, `operator`, `city`, `state`, `country`
- `latitude`, `longitude`, `geom`
- `usdm_status` - `covered` or `no_coverage`
- `drought_period_start`, `drought_period_end` - Current period: 2000-01-04 to 2025-12-30
- `weeks_observed`
- `weeks_in_d0_or_worse`, `weeks_in_d1_or_worse`, `weeks_in_d2_or_worse`, `weeks_in_d3_or_worse`, `weeks_in_d4`
- `pct_weeks_in_d0_or_worse`, `pct_weeks_in_d1_or_worse`, `pct_weeks_in_d2_or_worse`, `pct_weeks_in_d3_or_worse`, `pct_weeks_in_d4`
- `worst_dm_category`, `mean_dm_category`
- `longest_d0_streak_weeks`, `longest_d2_streak_weeks`, `longest_d3_streak_weeks`

**Source**: U.S. Drought Monitor weekly spatial data

**Notes**:
- Summary table is rolled up from `data_center_usdm_drought_dc_week`
- `dm_category` scale: D0-D4, stored as 0-4
- 1,830 facilities have `usdm_status = covered`; 3 have `no_coverage`

### `data_center_hms_smoke_exposure`
**Rows**: 1,833
**Purpose**: Per-facility wildfire-smoke exposure summary from NOAA HMS smoke polygons

**Key Columns**:
- `master_id` (TEXT) - FK to `master_data_centers`
- `source`, `name`, `operator`, `city`, `state`, `country`
- `latitude`, `longitude`, `geom`
- `hms_status`
- `smoke_period_start`, `smoke_period_end` - Current period: 2005-08-05 to 2026-05-22
- `days_observed`
- `days_with_any_smoke`, `days_with_light_or_worse`, `days_with_medium_or_worse`, `days_with_heavy_smoke`
- `pct_days_with_any_smoke`, `pct_days_with_light_or_worse`, `pct_days_with_medium_or_worse`, `pct_days_with_heavy_smoke`
- `worst_density_rank`, `worst_density`, `mean_density_rank`
- `longest_any_smoke_streak_days`, `longest_medium_or_heavy_streak_days`, `longest_heavy_smoke_streak_days`

**Source**: NOAA Hazard Mapping System (HMS) smoke polygons

**Notes**:
- Summary table is rolled up from `data_center_hms_smoke_dc_day`
- Density rank: 0 = observed day with no smoke, 1 = Light, 2 = Medium, 3 = Heavy
- HMS product path uses NOAA's `/FIRE/web/HMS/Smoke_Polygons/` archive

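The smoke summary columns can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the project's rollup code: it assumes one facility's daily `max_density_rank` values (0-3) from `data_center_hms_smoke_dc_day` are already loaded as a list, it uses the number of observed days as the percentage denominator, and `summarize_smoke` and `DENSITY_LABELS` are hypothetical names. `days_with_light_or_worse` is omitted because the document does not say how it differs from `days_with_any_smoke`.

```python
# Rank-to-label mapping taken from the Notes above; None for rank 0 is an assumption.
DENSITY_LABELS = {0: None, 1: "Light", 2: "Medium", 3: "Heavy"}


def summarize_smoke(daily_ranks):
    """Summarize one facility's daily max density ranks (one entry per observed HMS day)."""
    days_observed = len(daily_ranks)
    if days_observed == 0:
        return {"days_observed": 0}

    # Count days at or above each density threshold.
    counts = {
        "days_with_any_smoke": sum(1 for r in daily_ranks if r >= 1),
        "days_with_medium_or_worse": sum(1 for r in daily_ranks if r >= 2),
        "days_with_heavy_smoke": sum(1 for r in daily_ranks if r >= 3),
    }

    summary = {"days_observed": days_observed, **counts}
    for name, count in counts.items():
        # Percentages use observed days as the denominator.
        summary[f"pct_{name}"] = round(100.0 * count / days_observed, 2)

    worst = max(daily_ranks)
    summary["worst_density_rank"] = worst
    summary["worst_density"] = DENSITY_LABELS[worst]
    summary["mean_density_rank"] = round(sum(daily_ranks) / days_observed, 3)
    return summary


example = summarize_smoke([0, 0, 1, 2, 3, 0, 1, 0])
print(example["pct_days_with_any_smoke"], example["worst_density"])
```

Averaging the rank over all observed days, including rank-0 days, is one possible reading of `mean_density_rank`; the document does not define it.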
### `data_center_election_context`
**Rows**: 1,833
**Purpose**: Standardized one-row-per-facility election context derived from RDH precinct matches

**Key Columns**:
- `master_id` (TEXT) - FK to `master_data_centers`
- `name`, `city`, `state`
- `rdh_layer_title`
- `precinct_identifier_name`
- `election_year`, `office`
- `candidate`, `party`, `votes`
- `vote_share_pct`
- `democratic_votes`, `republican_votes`, `total_votes`
- `turnout_or_vote_share`
- `updated_at`

**Source**: Redistricting Data Hub precinct election shapefiles

**Notes**:
- Built from `data_center_rdh_precinct_vote_matches` plus RDH feature properties
- Current rows cover 2020-2024 election layers; 1,829 facilities have non-null election year context

### `data_center_rdh_precinct_vote_matches`
**Rows**: 3,330
**Purpose**: Spatial join bridge between data centers and RDH precinct vote features

**Key Columns**:
- `master_id` (TEXT) - FK to `master_data_centers`
- `feature_id` (TEXT) - FK to `rdh_precinct_vote_features`
- `layer_id` (TEXT) - FK to `rdh_precinct_vote_layers`
- `state_code`
- `join_method`
- `match_distance_m`
- `matched_at`

**Source**: Redistricting Data Hub precinct shapefiles

**Notes**: Spatial join to voting precincts (point-in-polygon)
**Notes**: Spatial join to voting precincts (point-in-polygon, with nearest/fallback logic where needed)

---

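The point-in-polygon join with a nearest fallback, as described for `data_center_rdh_precinct_vote_matches`, might look roughly like the pure-Python sketch below. Everything beyond the column names is an assumption made for illustration: the `join_method` values (`"point_in_polygon"`, `"nearest"`), the 500 m fallback threshold, the helper names, and the use of nearest-vertex distance with an equirectangular approximation. A spatial database would normally compute containment and distance on the stored `geom` columns instead.

```python
import math


def point_in_polygon(lon, lat, ring):
    """Ray-casting test; ring is a list of (lon, lat) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def approx_distance_m(lon1, lat1, lon2, lat2):
    """Equirectangular approximation; adequate only for short distances."""
    earth_radius_m = 6_371_000.0
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return earth_radius_m * math.hypot(x, y)


def match_precinct(lon, lat, precincts, max_fallback_m=500.0):
    """Return (feature_id, join_method, match_distance_m), or None if nothing is close enough.

    precincts maps feature_id -> ring of (lon, lat) vertices.
    """
    # First pass: exact containment.
    for feature_id, ring in precincts.items():
        if point_in_polygon(lon, lat, ring):
            return feature_id, "point_in_polygon", 0.0

    # Fallback: nearest precinct by distance to its closest vertex (a simplification).
    best = None
    for feature_id, ring in precincts.items():
        d = min(approx_distance_m(lon, lat, vx, vy) for vx, vy in ring)
        if best is None or d < best[2]:
            best = (feature_id, "nearest", d)
    if best is not None and best[2] <= max_fallback_m:
        return best
    return None


square = {"A": [(0.0, 0.0), (0.01, 0.0), (0.01, 0.01), (0.0, 0.01)]}
print(match_precinct(0.005, 0.005, square))
print(match_precinct(0.0105, 0.0105, square))
```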
## Environmental and Election Source Tables

### `usdm_drought_weekly`
**Rows**: 12,080
**Purpose**: Raw weekly U.S. Drought Monitor polygons by drought category

**Key Columns**:
- `id` (BIGINT) - Primary key
- `week_date` (DATE)
- `dm_category` (SMALLINT) - Drought Monitor category D0-D4 stored as 0-4
- `objectid`, `shape_leng`, `shape_area`
- `geom` (GEOMETRY) - Drought polygon geometry

**Source**: U.S. Drought Monitor spatial archive

**Notes**: Source table for `data_center_usdm_drought_dc_week`

### `data_center_usdm_drought_dc_week`
**Rows**: ~2.48 million
**Purpose**: Long-form weekly drought exposure for each covered data center

**Key Columns**:
- `master_id` (TEXT) - FK to `master_data_centers`
- `week_date` (DATE)
- `worst_dm` (SMALLINT) - Worst drought category covering the facility that week

**Source**: Spatial join of `master_data_centers` to `usdm_drought_weekly`

**Notes**:
- Primary key: (`master_id`, `week_date`)
- `worst_dm = -1` indicates an observed week with no drought polygon covering the facility

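The rollup from these long-form weekly rows to the per-facility columns of `data_center_usdm_drought_exposure` can be sketched in Python. This is an illustrative sketch, not the project's pipeline: it assumes rows arrive as `(master_id, week_date, worst_dm)` tuples with ISO-formatted date strings, that weeks are contiguous so consecutive rows form streaks, and that `-1` simply carries through as `worst_dm_category` for facilities never in drought. The function names are hypothetical.

```python
from collections import defaultdict


def longest_streak(values, threshold):
    """Longest run of consecutive entries with value >= threshold."""
    best = run = 0
    for v in values:
        run = run + 1 if v >= threshold else 0
        best = max(best, run)
    return best


def summarize_drought(rows):
    """Roll (master_id, week_date, worst_dm) rows up to one summary dict per facility."""
    by_facility = defaultdict(list)
    for master_id, week_date, worst_dm in rows:
        by_facility[master_id].append((week_date, worst_dm))

    summary = {}
    for master_id, weeks in by_facility.items():
        weeks.sort()  # ISO date strings sort chronologically
        values = [dm for _, dm in weeks]
        n = len(values)
        out = {"weeks_observed": n, "worst_dm_category": max(values)}
        for level in range(5):
            # worst_dm = -1 (no drought polygon) never meets any threshold.
            count = sum(1 for v in values if v >= level)
            suffix = "d4" if level == 4 else f"d{level}_or_worse"
            out[f"weeks_in_{suffix}"] = count
            out[f"pct_weeks_in_{suffix}"] = round(100.0 * count / n, 2)
        for level in (0, 2, 3):
            out[f"longest_d{level}_streak_weeks"] = longest_streak(values, level)
        summary[master_id] = out
    return summary


rows = [
    ("dc-1", "2024-01-02", -1),
    ("dc-1", "2024-01-09", 0),
    ("dc-1", "2024-01-16", 2),
    ("dc-1", "2024-01-23", 3),
    ("dc-1", "2024-01-30", -1),
    ("dc-1", "2024-02-06", 1),
]
print(summarize_drought(rows)["dc-1"]["longest_d0_streak_weeks"])
```

`mean_dm_category` is left out because the document does not say whether weeks with `worst_dm = -1` count toward the mean.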
### `hms_smoke_days`
**Rows**: 7,075
**Purpose**: One row per observed NOAA HMS smoke product day, including zero-polygon days

**Key Columns**:
- `smoke_date` (DATE) - Primary key
- `source`, `source_file`, `source_url`
- `feature_count` (INTEGER) - Number of smoke polygons for the day
- `fetched_at`, `updated_at`

**Source**: NOAA HMS smoke polygon archive

**Notes**: Denominator table for daily smoke-exposure percentages

### `hms_smoke_daily`
**Rows**: 536,286
**Purpose**: Raw daily NOAA HMS smoke polygons with density categories

**Key Columns**:
- `id` (BIGINT) - Primary key
- `smoke_date` (DATE) - FK to `hms_smoke_days`
- `satellite`
- `start_raw`, `end_raw`, `start_utc`, `end_utc`
- `density`, `density_rank`
- `source`, `source_file`, `source_url`
- `geom` (GEOMETRY) - Smoke polygon geometry

**Source**: NOAA Hazard Mapping System (HMS) smoke polygons

**Notes**: Density rank 1-3 corresponds to Light, Medium, Heavy

### `data_center_hms_smoke_dc_day`
**Rows**: ~13.9 million
**Purpose**: Long-form daily smoke exposure for each data center and observed HMS product day

**Key Columns**:
- `master_id` (TEXT) - FK to `master_data_centers`
- `smoke_date` (DATE) - FK to `hms_smoke_days`
- `max_density_rank` (SMALLINT) - Maximum smoke density covering the facility on that date
- `polygon_hits` (INTEGER)

**Source**: Spatial join of `master_data_centers` to `hms_smoke_daily`

**Notes**:
- Primary key: (`master_id`, `smoke_date`)
- `max_density_rank = 0` indicates an observed HMS day with no smoke polygon covering the facility

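The shape of this long-form table, with one row per facility and observed day and rank 0 where no polygon matched, can be illustrated with a small Python sketch. It assumes the spatial join has already produced, for each `(master_id, smoke_date)` pair, the list of `density_rank` values of the covering polygons. Reading `polygon_hits` as the number of covering polygons is an assumption, since the document does not define the column, and the function names are hypothetical.

```python
def reduce_facility_day(covering_ranks):
    """Collapse the density ranks (1-3) of all polygons covering a facility on one day."""
    return {
        # No covering polygon on an observed day yields rank 0.
        "max_density_rank": max(covering_ranks, default=0),
        "polygon_hits": len(covering_ranks),
    }


def build_dc_day(master_ids, observed_days, hits):
    """Build one row per (master_id, smoke_date) over every observed HMS day.

    hits maps (master_id, smoke_date) -> list of density ranks; a missing key
    means no smoke polygon covered the facility that day.
    """
    table = {}
    for master_id in master_ids:
        for smoke_date in observed_days:
            key = (master_id, smoke_date)
            table[key] = reduce_facility_day(hits.get(key, []))
    return table


table = build_dc_day(
    master_ids=["a", "b"],
    observed_days=["2024-06-01", "2024-06-02"],
    hits={("a", "2024-06-01"): [1, 3, 2]},
)
print(len(table), table[("a", "2024-06-01")])
```

Emitting explicit rank-0 rows keeps the facility-by-day grid complete, so the summary percentages can use observed days as their denominator.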
### `rdh_precinct_vote_layers`
**Rows**: 69
**Purpose**: Metadata for downloaded RDH precinct election layers

**Key Columns**:
- `layer_id` (TEXT) - Primary key
- `state_code`
- `title`
- `format`
- `datasetid`
- `source_url`
- `filename`, `local_path`, `spatial_path`
- `metadata` (JSONB)
- `loaded_at`

**Source**: Redistricting Data Hub precinct election datasets

**Notes**: Current loaded layers cover 45 distinct state codes

### `rdh_precinct_vote_features`
**Rows**: 260,953
**Purpose**: Staged RDH precinct polygons and source attributes

**Key Columns**:
- `feature_id` (TEXT) - Primary key
- `layer_id` (TEXT) - FK to `rdh_precinct_vote_layers`
- `state_code`
- `source_row`
- `properties` (JSONB) - Raw RDH election attributes
- `geom` (GEOMETRY) - Precinct polygon geometry

**Source**: Redistricting Data Hub precinct election shapefiles

**Notes**: Source feature table for `data_center_rdh_precinct_vote_matches`

---
@@ -293,7 +498,7 @@ Tables are organized into five categories:
- Use for proximity analysis (e.g., "all generators within 50 km of data center")

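A proximity query like the 50 km example above can be sketched in Python with the haversine great-circle distance. This is only an illustration: the record keys (`plant_id`, `latitude`, `longitude`) and the function names are hypothetical, and a spatial database would typically answer the same question with a distance predicate on the geometry columns.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def generators_within(data_center, generators, radius_km=50.0):
    """Return (plant_id, distance_km) pairs within radius_km, nearest first."""
    nearby = []
    for gen in generators:
        d = haversine_km(
            data_center["latitude"], data_center["longitude"],
            gen["latitude"], gen["longitude"],
        )
        if d <= radius_km:
            nearby.append((gen["plant_id"], round(d, 2)))
    return sorted(nearby, key=lambda pair: pair[1])


dc = {"latitude": 39.0, "longitude": -77.5}
gens = [
    {"plant_id": "near", "latitude": 39.1, "longitude": -77.5},
    {"plant_id": "far", "latitude": 40.0, "longitude": -77.5},
]
print(generators_within(dc, gens))
```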
#### `energy_eia_facility_fuel_flat`
**Rows**: Varies
**Rows**: Not loaded yet
**Purpose**: Monthly generation by plant/fuel

**Key Columns**:
@@ -305,6 +510,8 @@ Tables are organized into five categories:
**Source**: EIA Form 923 via API

**Notes**: Target table defined in `ingest_eia_energy_layers.py`; current database does not yet have `public.energy_eia_facility_fuel_flat`.

#### `energy_eia_seds_flat`
**Rows**: 2.57 million
**Purpose**: Annual state energy consumption/production (1960-2024)