diff --git a/build_fcc_bdc_broadband_connection_table.ipynb b/build_fcc_bdc_broadband_connection_table.ipynb
index e165bd4..635a9ce 100644
--- a/build_fcc_bdc_broadband_connection_table.ipynb
+++ b/build_fcc_bdc_broadband_connection_table.ipynb
@@ -1486,29 +1486,23 @@
    "source": [
     "## Tables Created by This Notebook and Their Relationships\n",
     "\n",
-    "This notebook creates and/or maintains five PostgreSQL tables in the `public` schema:\n",
-    "\n",
+    "### Tables Created / Maintained\n",
     "1. `public.fcc_bdc_as_of`\n",
-    "- One row per FCC BDC release date and data type.\n",
-    "- Primary metadata table used to track versioning (`as_of_date`) for downstream loads.\n",
+    "- Release/version metadata by `as_of_date`.\n",
     "\n",
     "2. `public.fcc_bdc_files`\n",
-    "- One row per file discovered/downloaded for a release.\n",
-    "- Linked to releases via `as_of_date` and used as file-level lineage/provenance.\n",
+    "- File-level lineage records for each FCC BDC release.\n",
     "\n",
     "3. `public.fcc_bdc_broadband_by_datacenter`\n",
-    "- Fact table keyed by `(master_id, as_of_date)` for per-data-center broadband availability metrics.\n",
-    "- Includes scalar broadband fields and summary JSON payloads.\n",
-    "- `master_id` aligns with `public.master_data_centers.master_id`.\n",
+    "- Per-data-center broadband fact table keyed by `(master_id, as_of_date)`.\n",
     "\n",
     "4. `public.fcc_bdc_broadband_summary`\n",
-    "- Aggregated summary metrics by release (`as_of_date`) used for QA and reporting.\n",
+    "- Release-level aggregate summary metrics.\n",
     "\n",
     "5. `public.fcc_bdc_provider_summary`\n",
-    "- Provider catalog/aggregation table by release (`as_of_date`) with provider class rollups.\n",
-    "\n",
-    "### Relationship Summary\n",
+    "- Release-level provider catalog and provider-class summary metrics.\n",
     "\n",
+    "### Key Relationships\n",
     "- `public.fcc_bdc_as_of (as_of_date)`\n",
     " - 1-to-many -> `public.fcc_bdc_files (as_of_date)`\n",
     " - 1-to-many -> `public.fcc_bdc_broadband_by_datacenter (as_of_date)`\n",
@@ -1518,7 +1512,9 @@
     "- `public.master_data_centers (master_id)`\n",
     " - 1-to-many over time -> `public.fcc_bdc_broadband_by_datacenter (master_id, as_of_date)`\n",
     "\n",
-    "In short: **release metadata (`as_of` + `files`) supports reproducible loads, while per-DC broadband facts and release-level/provider-level summaries support analysis.**"
+    "### Rerun Notes\n",
+    "- The notebook is designed for repeat refreshes as new FCC releases arrive.\n",
+    "- Use `as_of_date` as the version key when comparing snapshots over time."
    ]
   }
  ],
diff --git a/cluster_analysis.ipynb b/cluster_analysis.ipynb
index 320808a..2572022 100644
--- a/cluster_analysis.ipynb
+++ b/cluster_analysis.ipynb
@@ -916,6 +916,28 @@
     "print('Top non-metro watersheds (RUCA 4-10):')\n",
     "nm_ws.head(15).reset_index(drop=True)\n"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "25",
+   "metadata": {},
+   "source": [
+    "## Tables Created by This Notebook and Their Relationships\n",
+    "\n",
+    "### Tables Created / Maintained\n",
+    "1. `public.ruca_codes_2020_tract`\n",
+    "- Tract-level RUCA lookup loaded from `new/RUCA-codes-2020-tract.csv`.\n",
+    "- Rebuilt with drop + recreate during load.\n",
+    "- Primary key: `tract_fips_20`.\n",
+    "\n",
+    "### Key Relationships\n",
+    "- `public.master_data_centers (geoid)`\n",
+    " - many-to-1 -> `public.ruca_codes_2020_tract (tract_fips_20)`\n",
+    "\n",
+    "### Rerun Notes\n",
+    "- Rerunning refreshes the RUCA lookup table from the latest CSV.\n",
+    "- Downstream joins in this notebook read from this table but do not create additional persistent analysis tables."
+   ]
   }
  ],
  "metadata": {
diff --git a/historical_climate_data_centers.ipynb b/historical_climate_data_centers.ipynb
index e5368ed..2feb858 100644
--- a/historical_climate_data_centers.ipynb
+++ b/historical_climate_data_centers.ipynb
@@ -895,21 +895,18 @@
    "source": [
     "## Tables Created by This Notebook and Their Relationships\n",
     "\n",
-    "This notebook creates and/or maintains one primary PostGIS table:\n",
-    "\n",
+    "### Tables Created / Maintained\n",
     "1. `public.data_center_historical_climate`\n",
-    "- One row per data center (`master_id`).\n",
-    "- Stores climate summary metrics (temperature, humidity, wet-bulb, precipitation variability, cooling-degree-days, wind fields/status), geometry, and lineage timestamps.\n",
-    "- Upserted incrementally so reruns refresh changed rows without duplicating records.\n",
-    "\n",
-    "### Relationship Summary\n",
+    "- One row per `master_id` with climate summary fields and geometry.\n",
+    "- Populated by incremental upsert so reruns refresh existing sites and add new sites.\n",
     "\n",
+    "### Key Relationships\n",
     "- `public.master_data_centers (master_id)`\n",
     " - 1-to-1 (effective) -> `public.data_center_historical_climate (master_id)`\n",
     "\n",
-    "`public.data_center_historical_climate.master_id` is a foreign key to `public.master_data_centers.master_id` (with cascade delete), so climate rows track the master data-center record set.\n",
-    "\n",
-    "In short: **`master_data_centers` is the entity table, and `data_center_historical_climate` is its one-row-per-site climate feature extension.**"
+    "### Rerun Notes\n",
+    "- Safe to rerun when the master data-center set changes.\n",
+    "- Existing rows are updated in place; no duplicate-per-site history table is created by this notebook."
    ]
   }
  ],
diff --git a/hms_smoke_data_centers.ipynb b/hms_smoke_data_centers.ipynb
index 1fb485e..83440d3 100644
--- a/hms_smoke_data_centers.ipynb
+++ b/hms_smoke_data_centers.ipynb
@@ -1184,23 +1184,35 @@
    "id": "22",
    "metadata": {},
    "source": [
-    "## Tables Created\n",
+    "## Tables Created by This Notebook and Their Relationships\n",
     "\n",
-    "This notebook creates four PostGIS tables for NOAA HMS smoke exposure analysis. The tables are designed to separate source observations, raw geometries, long-form data-center exposure, and the final per-site summary.\n",
+    "### Tables Created / Maintained\n",
+    "1. `public.hms_smoke_days`\n",
+    "- One row per observed HMS product day (daily denominator table).\n",
     "\n",
-    "| Table | Grain | Purpose |\n",
-    "|---|---|---|\n",
-    "| `public.hms_smoke_days` | One row per observed HMS product day | Denominator table for daily percentages, including days with zero smoke polygons. Stores `smoke_date`, source metadata, and `feature_count`. |\n",
-    "| `public.hms_smoke_daily` | One row per HMS smoke polygon | Raw smoke plume geometry table. Stores `smoke_date`, satellite/time fields, normalized `density`, `density_rank`, source metadata, and `geom`. |\n",
-    "| `public.data_center_hms_smoke_dc_day` | One row per `(master_id, smoke_date)` | Long-form daily exposure table for every data center on every observed HMS day. `max_density_rank = 0` means observed no smoke; `1`, `2`, and `3` represent light/unspecified, medium, and heavy smoke exposure. |\n",
-    "| `public.data_center_hms_smoke_exposure` | One row per `master_id` | Final per-data-center summary table joinable to `public.master_data_centers`. Includes location fields, observation status, smoke-period dates, exposure-day counts, percentage metrics, worst/mean density, and longest streak metrics. |\n",
+    "2. `public.hms_smoke_daily`\n",
+    "- One row per smoke polygon geometry from HMS source products.\n",
     "\n",
-    "Recommended use:\n",
+    "3. `public.data_center_hms_smoke_dc_day`\n",
+    "- One row per `(master_id, smoke_date)` with daily smoke exposure classification.\n",
     "\n",
-    "- Use `public.data_center_hms_smoke_exposure` for most site-level analysis and ranking.\n",
-    "- Use `public.data_center_hms_smoke_dc_day` for time-series analysis, seasonal summaries, or custom thresholds.\n",
-    "- Use `public.hms_smoke_daily` when you need the original smoke plume geometries for mapping or spatial QA.\n",
-    "- Use `public.hms_smoke_days` whenever calculating percentages so no-smoke observed days remain in the denominator."
+    "4. `public.data_center_hms_smoke_exposure`\n",
+    "- One row per `master_id` with summary smoke-exposure metrics.\n",
+    "\n",
+    "### Key Relationships\n",
+    "- `public.hms_smoke_days (smoke_date)`\n",
+    " - 1-to-many -> `public.hms_smoke_daily (smoke_date)`\n",
+    "\n",
+    "- `public.master_data_centers (master_id)`\n",
+    " - 1-to-many -> `public.data_center_hms_smoke_dc_day (master_id, smoke_date)`\n",
+    " - 1-to-1 (effective) -> `public.data_center_hms_smoke_exposure (master_id)`\n",
+    "\n",
+    "- `public.data_center_hms_smoke_dc_day`\n",
+    " - many-to-1 summary rollup -> `public.data_center_hms_smoke_exposure`\n",
+    "\n",
+    "### Rerun Notes\n",
+    "- Designed for repeat refreshes as additional HMS days become available.\n",
+    "- Summary exposure table is recomputed from daily source/bridge tables so results stay consistent after reloads."
    ]
   }
  ],
diff --git a/open_meteo_historical_data_centers.ipynb b/open_meteo_historical_data_centers.ipynb
index c5c89c9..1d67ac6 100644
--- a/open_meteo_historical_data_centers.ipynb
+++ b/open_meteo_historical_data_centers.ipynb
@@ -844,21 +844,18 @@
    "source": [
     "## Tables Created by This Notebook and Their Relationships\n",
     "\n",
-    "This notebook creates and/or maintains one primary PostGIS table:\n",
-    "\n",
+    "### Tables Created / Maintained\n",
     "1. `public.data_center_historical_climate`\n",
-    "- One row per data center (`master_id`).\n",
-    "- Stores climate summary metrics (temperature, humidity, wet-bulb, precipitation variability, cooling-degree-days, wind fields/status), geometry, and lineage timestamps.\n",
-    "- Upserted incrementally so reruns refresh changed rows without duplicating records.\n",
-    "\n",
-    "### Relationship Summary\n",
+    "- One row per `master_id` with climate summary fields and geometry.\n",
+    "- Populated by incremental upsert so reruns refresh existing sites and add new sites.\n",
     "\n",
+    "### Key Relationships\n",
     "- `public.master_data_centers (master_id)`\n",
     " - 1-to-1 (effective) -> `public.data_center_historical_climate (master_id)`\n",
     "\n",
-    "`public.data_center_historical_climate.master_id` is a foreign key to `public.master_data_centers.master_id` (with cascade delete), so climate rows track the master data-center record set.\n",
-    "\n",
-    "In short: **`master_data_centers` is the entity table, and `data_center_historical_climate` is its one-row-per-site climate feature extension.**"
+    "### Rerun Notes\n",
+    "- Safe to rerun when the master data-center set changes.\n",
+    "- Existing rows are updated in place; no duplicate-per-site history table is created by this notebook."
    ]
   }
  ],
diff --git a/postgis_table_loader.ipynb b/postgis_table_loader.ipynb
index 5ef7f99..62f0f54 100644
--- a/postgis_table_loader.ipynb
+++ b/postgis_table_loader.ipynb
@@ -538,6 +538,29 @@
     " for row in cur.fetchall():\n",
     " print(f'{row[0]}.{row[1]}')"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "11",
+   "metadata": {},
+   "source": [
+    "## Tables Created by This Notebook and Their Relationships\n",
+    "\n",
+    "### Tables Created / Maintained\n",
+    "1. `TARGET_TABLE` (configured at runtime)\n",
+    "- Generic loader output table built from the current dataframe schema.\n",
+    "- Replaced/appended according to `if_exists` behavior.\n",
+    "- Optional point geometry can be added in helper cells.\n",
+    "\n",
+    "### Key Relationships\n",
+    "- This notebook is table-agnostic: relationships depend on the selected `TARGET_TABLE` and source columns.\n",
+    "- When key columns (for example `master_id`, `geoid`, IDs, dates) are present, the loaded table can be joined to domain tables.\n",
+    "- When geometry is present, the loaded table can participate in spatial joins.\n",
+    "\n",
+    "### Rerun Notes\n",
+    "- Safe to rerun for recurring refreshes of different source files.\n",
+    "- Always confirm `TARGET_TABLE` and `if_exists` before execution to avoid unintended replacement of existing tables."
+   ]
   }
  ],
  "metadata": {
diff --git a/rdh_precinct_vote_data_centers.ipynb b/rdh_precinct_vote_data_centers.ipynb
index dd789fb..bdb4b09 100644
--- a/rdh_precinct_vote_data_centers.ipynb
+++ b/rdh_precinct_vote_data_centers.ipynb
@@ -1676,36 +1676,20 @@
    "source": [
     "## Tables Created by This Notebook and Their Relationships\n",
     "\n",
-    "This notebook creates and/or maintains the following PostGIS/PostgreSQL tables:\n",
-    "\n",
+    "### Tables Created / Maintained\n",
     "1. `public.rdh_precinct_vote_layers`\n",
-    "- One row per RDH precinct-election layer ingested.\n",
-    "- Key columns: `layer_id` (PK), `state_code`, `title`, `format`, file/source metadata, `loaded_at`.\n",
+    "- One row per ingested precinct-election layer.\n",
     "\n",
     "2. `public.rdh_precinct_vote_features`\n",
-    "- One row per precinct polygon feature from a loaded layer.\n",
-    "- Key columns: `feature_id` (PK), `layer_id` (FK), `state_code`, `source_row`, `properties` (JSONB), `geom` (MultiPolygon).\n",
-    "- Relationship: many features belong to one layer.\n",
+    "- One row per precinct geometry feature with source properties JSON.\n",
     "\n",
     "3. `public.data_center_rdh_precinct_vote_matches`\n",
-    "- Spatial match table linking data centers to precinct features.\n",
-    "- Key columns: `master_id` (FK), `feature_id` (FK), `layer_id` (FK), `state_code`, `join_method`, `match_distance_m`, `matched_at`.\n",
-    "- Primary key: (`master_id`, `feature_id`).\n",
-    "- Relationship: many-to-many bridge between data centers and precinct features (with match metadata).\n",
+    "- Bridge table linking data centers to matched precinct features.\n",
     "\n",
     "4. `public.data_center_election_context`\n",
-    "- Final standardized, one-row-per-data-center election context used by downstream mapping/analysis.\n",
-    "- Key columns: `master_id` (PK, FK), `name`, `city`, `state`, `rdh_layer_title`,\n",
-    " `precinct_identifier_name`, `election_year`, `office`, `democratic_votes`, `republican_votes`,\n",
-    " `total_votes`, `turnout_or_vote_share`, `updated_at`.\n",
-    "- Relationship: one row per `master_id` in `public.master_data_centers` (left-joined so all master rows can be retained, even if election fields are null).\n",
-    "\n",
-    "### Relationship Summary\n",
-    "\n",
-    "- `public.master_data_centers (master_id)`\n",
-    " - 1-to-many -> `public.data_center_rdh_precinct_vote_matches (master_id)`\n",
-    " - 1-to-1 (effective in this notebook) -> `public.data_center_election_context (master_id)`\n",
+    "- Standardized, one-row-per-data-center election context for downstream analysis/mapping.\n",
     "\n",
+    "### Key Relationships\n",
     "- `public.rdh_precinct_vote_layers (layer_id)`\n",
     " - 1-to-many -> `public.rdh_precinct_vote_features (layer_id)`\n",
     " - 1-to-many -> `public.data_center_rdh_precinct_vote_matches (layer_id)`\n",
@@ -1713,7 +1697,13 @@
     "- `public.rdh_precinct_vote_features (feature_id)`\n",
     " - 1-to-many -> `public.data_center_rdh_precinct_vote_matches (feature_id)`\n",
     "\n",
-    "In short: **layers -> features -> matches**, then matches are standardized into **one election-context row per data center**."
+    "- `public.master_data_centers (master_id)`\n",
+    " - 1-to-many -> `public.data_center_rdh_precinct_vote_matches (master_id)`\n",
+    " - 1-to-1 (effective) -> `public.data_center_election_context (master_id)`\n",
+    "\n",
+    "### Rerun Notes\n",
+    "- Safe to rerun as new RDH layers and/or data centers are added.\n",
+    "- Reruns refresh matching outputs and regenerate standardized election context rows."
    ]
   }
  ],
diff --git a/spatial_clustering_master_data_centers.ipynb b/spatial_clustering_master_data_centers.ipynb
index 6efe401..286a7c4 100644
--- a/spatial_clustering_master_data_centers.ipynb
+++ b/spatial_clustering_master_data_centers.ipynb
@@ -1116,6 +1116,27 @@
     "else:\n",
     " print('WRITE_BACK_TO_DB is False; no database table was modified.')"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "32",
+   "metadata": {},
+   "source": [
+    "## Tables Created by This Notebook and Their Relationships\n",
+    "\n",
+    "### Tables Created / Maintained\n",
+    "1. `public.master_data_center_spatial_clusters` (optional write)\n",
+    "- One row per `master_id` with cluster label and clustering metadata.\n",
+    "- Written only when `WRITE_BACK_TO_DB = True`.\n",
+    "\n",
+    "### Key Relationships\n",
+    "- `public.master_data_centers (master_id)`\n",
+    " - 1-to-1 (effective) -> `public.master_data_center_spatial_clusters (master_id)`\n",
+    "\n",
+    "### Rerun Notes\n",
+    "- Default behavior (`WRITE_BACK_TO_DB = False`) performs no table writes.\n",
+    "- With write-back enabled, reruns replace cluster assignments using the current parameters/data."
+   ]
   }
  ],
  "metadata": {
diff --git a/usdm_drought_data_centers.ipynb b/usdm_drought_data_centers.ipynb
index da40f31..5e6ce82 100644
--- a/usdm_drought_data_centers.ipynb
+++ b/usdm_drought_data_centers.ipynb
@@ -677,134 +677,32 @@
    "id": "16",
    "metadata": {},
    "source": [
-    "## Tables Created\n",
+    "## Tables Created by This Notebook and Their Relationships\n",
     "\n",
-    "This notebook builds three tables in the `public` schema, all keyed (directly or transitively) to `master_data_centers.master_id`.\n",
+    "### Tables Created / Maintained\n",
+    "1. `public.usdm_drought_weekly`\n",
+    "- Weekly USDM drought polygons by `week_date` and drought category.\n",
     "\n",
-    "---\n",
+    "2. `public.data_center_usdm_drought_dc_week`\n",
+    "- One row per `(master_id, week_date)` with weekly worst drought category at each data center.\n",
     "\n",
-    "### 1. `public.usdm_drought_weekly`\n",
+    "3. `public.data_center_usdm_drought_exposure`\n",
+    "- One row per `master_id` with summary drought-exposure metrics and streak fields.\n",
     "\n",
-    "Raw weekly USDM drought polygons — one row per `(week_date, dm_category)` (occasionally multiple rows for early-USDM weeks that published per-category fragments). Source of truth for any later spatial query against the drought record.\n",
+    "### Key Relationships\n",
+    "- `public.usdm_drought_weekly (week_date, dm_category, geom)`\n",
+    " - spatial/time source for -> `public.data_center_usdm_drought_dc_week`\n",
     "\n",
-    "| Column | Type | Meaning |\n",
-    "|---|---|---|\n",
-    "| `id` | `bigserial` PK | Surrogate row id |\n",
-    "| `week_date` | `date` | Tuesday-of-publication date parsed from filename (`USDM_YYYYMMDD_M.zip`) |\n",
-    "| `dm_category` | `smallint` | 0=D0 Abnormally Dry, 1=D1 Moderate, 2=D2 Severe, 3=D3 Extreme, 4=D4 Exceptional. **Cumulative** — D4 polygon is inside D3 inside D2… |\n",
-    "| `objectid`, `shape_leng`, `shape_area` | original shapefile attributes |\n",
-    "| `geom` | `geometry(MultiPolygon, 4326)` | Drought-affected area for that category that week |\n",
+    "- `public.master_data_centers (master_id)`\n",
+    " - 1-to-many -> `public.data_center_usdm_drought_dc_week (master_id, week_date)`\n",
+    " - 1-to-1 (effective) -> `public.data_center_usdm_drought_exposure (master_id)`\n",
     "\n",
-    "**Indexes:** GIST on `geom`, btree on `week_date`.\n",
+    "- `public.data_center_usdm_drought_dc_week`\n",
+    " - many-to-1 summary rollup -> `public.data_center_usdm_drought_exposure`\n",
     "\n",
-    "**Size:** ~12,000 polygon rows across 1,356 weeks (Jan 2000 – mid 2025).\n",
-    "\n",
-    "**Example uses:**\n",
-    "```sql\n",
-    "-- Map of D3+ drought in August 2022\n",
-    "SELECT week_date, dm_category, geom\n",
-    "FROM usdm_drought_weekly\n",
-    "WHERE week_date = '2022-08-30' AND dm_category >= 3;\n",
-    "\n",
-    "-- Worst week ever for a specific lat/lon\n",
-    "SELECT week_date, MAX(dm_category) AS worst_dm\n",
-    "FROM usdm_drought_weekly\n",
-    "WHERE ST_Within(ST_SetSRID(ST_MakePoint(-98.5, 29.5), 4326), geom)\n",
-    "GROUP BY week_date ORDER BY worst_dm DESC, week_date LIMIT 10;\n",
-    "```\n",
-    "\n",
-    "---\n",
-    "\n",
-    "### 2. `public.data_center_usdm_drought_dc_week`\n",
-    "\n",
-    "Long-form per-(DC, week) intermediate. One row per data center per USDM week observed; useful for time-series and streak analysis. Computed from `usdm_drought_weekly` via spatial join, then back-filled so every covered DC has a row for every week.\n",
-    "\n",
-    "| Column | Type | Meaning |\n",
-    "|---|---|---|\n",
-    "| `master_id` | `text` PK (composite) | FK → `master_data_centers.master_id` |\n",
-    "| `week_date` | `date` PK (composite) | USDM week |\n",
-    "| `worst_dm` | `smallint` | Max `dm_category` whose polygon contained the DC point that week. **`-1` means observed week but no drought polygon contained the DC** (filter `worst_dm >= 0` for actual drought weeks) |\n",
-    "\n",
-    "**Indexes:** PK on `(master_id, week_date)`, btree on `week_date`, btree on `worst_dm`.\n",
-    "\n",
-    "**Size:** ~2.5 M rows (1,833 DCs × 1,356 weeks, minus DCs not covered by USDM).\n",
-    "\n",
-    "**Example uses:**\n",
-    "```sql\n",
-    "-- Drought timeline for one DC\n",
-    "SELECT week_date, worst_dm\n",
-    "FROM data_center_usdm_drought_dc_week\n",
-    "WHERE master_id = 'curated/1010260676' AND worst_dm >= 0\n",
-    "ORDER BY week_date;\n",
-    "\n",
-    "-- DCs that were in D4 during a specific week\n",
-    "SELECT master_id FROM data_center_usdm_drought_dc_week\n",
-    "WHERE week_date = '2012-07-24' AND worst_dm = 4;\n",
-    "```\n",
-    "\n",
-    "If you only need the per-DC summary, this table can be dropped — it's regenerable from `usdm_drought_weekly` + `master_data_centers`.\n",
-    "\n",
-    "---\n",
-    "\n",
-    "### 3. `public.data_center_usdm_drought_exposure`\n",
-    "\n",
-    "Per-DC drought-exposure summary keyed by `master_id`. The analytical surface — one row per data center with all the headline metrics. Joinable directly to `master_data_centers` and `data_center_historical_climate`.\n",
-    "\n",
-    "| Column | Type | Meaning |\n",
-    "|---|---|---|\n",
-    "| `master_id` | `text` PK | FK → `master_data_centers.master_id` |\n",
-    "| Identity cols | `source`, `name`, `operator`, `city`, `state`, `country`, `longitude`, `latitude`, `geom` — denormalized from master for convenience |\n",
-    "| `usdm_status` | `text` | `'covered'` (USDM zone) or `'no_coverage'` (outside USDM extent) |\n",
-    "| `drought_period_start`, `drought_period_end` | `date` | First / last USDM week observed for this DC |\n",
-    "| `weeks_observed` | `int` | Total weekly observations |\n",
-    "| `weeks_in_d0_or_worse` … `weeks_in_d4` | `int` | Cumulative weekly counts at each severity threshold |\n",
-    "| `pct_weeks_in_d0_or_worse` … `pct_weeks_in_d4` | `double` | Same as ratios over `weeks_observed` |\n",
-    "| `worst_dm_category` | `smallint` | Max DM ever experienced (0–4) |\n",
-    "| `mean_dm_category` | `double` | Average DM across all weeks, treating no-drought (`-1`) as 0 |\n",
-    "| `longest_d0_streak_weeks` | `int` | Longest consecutive run with any drought (D0+) |\n",
-    "| `longest_d2_streak_weeks` | `int` | Longest consecutive run with severe drought (D2+) — **the headline streak metric** |\n",
-    "| `longest_d3_streak_weeks` | `int` | Longest consecutive run with extreme drought (D3+) |\n",
-    "| `fetched_at`, `updated_at` | `timestamptz` | Provenance |\n",
-    "\n",
-    "**Indexes:** GIST on `geom`, btree on `state`, btree on `worst_dm_category`.\n",
-    "\n",
-    "**Size:** 1,833 rows (one per master DC; PR sites flagged `no_coverage` if applicable).\n",
-    "\n",
-    "**Headline metric for site-selection analysis:** `pct_weeks_in_d2_or_worse`. D2 = \"Severe Drought\" is the threshold at which water-use restrictions typically kick in for utilities and municipalities.\n",
-    "\n",
-    "**Example: joined climate + drought view for cooling-water risk analysis**\n",
-    "```sql\n",
-    "SELECT\n",
-    " c.master_id, c.name, c.state,\n",
-    " c.cooling_degree_days_c, -- baseline cooling load\n",
-    " c.mean_wet_bulb_temperature_c, -- evaporative-cooling efficiency\n",
-    " d.pct_weeks_in_d2_or_worse * 100 AS pct_severe_drought,\n",
-    " d.longest_d2_streak_weeks,\n",
-    " d.worst_dm_category\n",
-    "FROM data_center_historical_climate c\n",
-    "JOIN data_center_usdm_drought_exposure d USING (master_id)\n",
-    "WHERE d.usdm_status = 'covered'\n",
-    "ORDER BY (c.cooling_degree_days_c * d.pct_weeks_in_d2_or_worse) DESC\n",
-    "LIMIT 25;\n",
-    "```\n",
-    "\n",
-    "---\n",
-    "\n",
-    "### Relationship diagram\n",
-    "\n",
-    "```\n",
-    "master_data_centers (master_id PK)\n",
-    " │\n",
-    " ├── data_center_historical_climate (master_id PK) ← from open_meteo/Daymet/gridMET notebook\n",
-    " │\n",
-    " └── data_center_usdm_drought_exposure (master_id PK) ← this notebook\n",
-    " │\n",
-    " └── data_center_usdm_drought_dc_week (master_id, week_date)\n",
-    " │\n",
-    " └── usdm_drought_weekly (id PK, week_date, dm_category, geom)\n",
-    "```\n",
-    "\n",
-    "All three USDM tables are regenerable from the zip files in `USDM Shape Files/`. `RELOAD_WEEKLY=True` rebuilds from scratch; `RECOMPUTE_SUMMARY=True` (default) recomputes the dc-week + exposure tables from whatever's in `usdm_drought_weekly`.\n"
+    "### Rerun Notes\n",
+    "- Supports repeat runs when new USDM weeks or new data centers are added.\n",
+    "- Weekly table can be reloaded and the downstream `dc_week` + `exposure` tables can be recomputed from that source."
    ]
   }
  ],