Standardize notebook table-relationship documentation cells

2026-05-22 14:21:51 -07:00
parent c95f22fcdb
commit 03239ad007
9 changed files with 147 additions and 191 deletions

View File

@@ -1486,29 +1486,23 @@
"source": [ "source": [
"## Tables Created by This Notebook and Their Relationships\n", "## Tables Created by This Notebook and Their Relationships\n",
"\n", "\n",
"This notebook creates and/or maintains five PostgreSQL tables in the `public` schema:\n", "### Tables Created / Maintained\n",
"\n",
"1. `public.fcc_bdc_as_of`\n", "1. `public.fcc_bdc_as_of`\n",
"- One row per FCC BDC release date and data type.\n", "- Release/version metadata by `as_of_date`.\n",
"- Primary metadata table used to track versioning (`as_of_date`) for downstream loads.\n",
"\n", "\n",
"2. `public.fcc_bdc_files`\n", "2. `public.fcc_bdc_files`\n",
"- One row per file discovered/downloaded for a release.\n", "- File-level lineage records for each FCC BDC release.\n",
"- Linked to releases via `as_of_date` and used as file-level lineage/provenance.\n",
"\n", "\n",
"3. `public.fcc_bdc_broadband_by_datacenter`\n", "3. `public.fcc_bdc_broadband_by_datacenter`\n",
"- Fact table keyed by `(master_id, as_of_date)` for per-data-center broadband availability metrics.\n", "- Per-data-center broadband fact table keyed by `(master_id, as_of_date)`.\n",
"- Includes scalar broadband fields and summary JSON payloads.\n",
"- `master_id` aligns with `public.master_data_centers.master_id`.\n",
"\n", "\n",
"4. `public.fcc_bdc_broadband_summary`\n", "4. `public.fcc_bdc_broadband_summary`\n",
"- Aggregated summary metrics by release (`as_of_date`) used for QA and reporting.\n", "- Release-level aggregate summary metrics.\n",
"\n", "\n",
"5. `public.fcc_bdc_provider_summary`\n", "5. `public.fcc_bdc_provider_summary`\n",
"- Provider catalog/aggregation table by release (`as_of_date`) with provider class rollups.\n", "- Release-level provider catalog and provider-class summary metrics.\n",
"\n",
"### Relationship Summary\n",
"\n", "\n",
"### Key Relationships\n",
"- `public.fcc_bdc_as_of (as_of_date)`\n", "- `public.fcc_bdc_as_of (as_of_date)`\n",
" - 1-to-many -> `public.fcc_bdc_files (as_of_date)`\n", " - 1-to-many -> `public.fcc_bdc_files (as_of_date)`\n",
" - 1-to-many -> `public.fcc_bdc_broadband_by_datacenter (as_of_date)`\n", " - 1-to-many -> `public.fcc_bdc_broadband_by_datacenter (as_of_date)`\n",
@@ -1518,7 +1512,9 @@
"- `public.master_data_centers (master_id)`\n", "- `public.master_data_centers (master_id)`\n",
" - 1-to-many over time -> `public.fcc_bdc_broadband_by_datacenter (master_id, as_of_date)`\n", " - 1-to-many over time -> `public.fcc_bdc_broadband_by_datacenter (master_id, as_of_date)`\n",
"\n", "\n",
"In short: **release metadata (`as_of` + `files`) supports reproducible loads, while per-DC broadband facts and release-level/provider-level summaries support analysis.**" "### Rerun Notes\n",
"- The notebook is designed for repeat refreshes as new FCC releases arrive.\n",
"- Use `as_of_date` as the version key when comparing snapshots over time."
] ]
} }
], ],
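Reviewer sketch for the `as_of_date` rerun note above: one way to compare two snapshots of `fcc_bdc_broadband_by_datacenter`. This is a minimal illustration, not the notebook's code. In-memory SQLite stands in for PostgreSQL, and the metric column `max_download_mbps`, the `dc-*` ids, and the release dates are hypothetical.

```python
import sqlite3

# SQLite stands in for PostgreSQL; `max_download_mbps` is a hypothetical
# metric column and the dates/ids are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fcc_bdc_broadband_by_datacenter (
    master_id TEXT NOT NULL,
    as_of_date TEXT NOT NULL,
    max_download_mbps REAL,
    PRIMARY KEY (master_id, as_of_date)
);
INSERT INTO fcc_bdc_broadband_by_datacenter VALUES
    ('dc-1', '2024-06-30', 100.0),
    ('dc-1', '2024-12-31', 940.0),
    ('dc-2', '2024-06-30', 300.0),
    ('dc-2', '2024-12-31', 300.0);
""")


def changed_between(conn, old_date, new_date):
    """Return (master_id, old_value, new_value) for sites whose metric changed."""
    sql = """
        SELECT o.master_id, o.max_download_mbps, n.max_download_mbps
        FROM fcc_bdc_broadband_by_datacenter AS o
        JOIN fcc_bdc_broadband_by_datacenter AS n
          ON n.master_id = o.master_id
        WHERE o.as_of_date = ? AND n.as_of_date = ?
          AND o.max_download_mbps IS NOT n.max_download_mbps  -- NULL-safe in SQLite
        ORDER BY o.master_id
    """
    return conn.execute(sql, (old_date, new_date)).fetchall()


changes = changed_between(conn, "2024-06-30", "2024-12-31")
print(changes)
```

In PostgreSQL the NULL-safe comparison would be written `IS DISTINCT FROM` rather than SQLite's `IS NOT`.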

View File

@@ -916,6 +916,28 @@
"print('Top non-metro watersheds (RUCA 4-10):')\n", "print('Top non-metro watersheds (RUCA 4-10):')\n",
"nm_ws.head(15).reset_index(drop=True)\n" "nm_ws.head(15).reset_index(drop=True)\n"
] ]
},
{
"cell_type": "markdown",
"id": "25",
"metadata": {},
"source": [
"## Tables Created by This Notebook and Their Relationships\n",
"\n",
"### Tables Created / Maintained\n",
"1. `public.ruca_codes_2020_tract`\n",
"- Tract-level RUCA lookup loaded from `new/RUCA-codes-2020-tract.csv`.\n",
"- Rebuilt with drop + recreate during load.\n",
"- Primary key: `tract_fips_20`.\n",
"\n",
"### Key Relationships\n",
"- `public.master_data_centers (geoid)`\n",
" - many-to-1 -> `public.ruca_codes_2020_tract (tract_fips_20)`\n",
"\n",
"### Rerun Notes\n",
"- Rerunning refreshes the RUCA lookup table from the latest CSV.\n",
"- Downstream joins in this notebook read from this table but do not create additional persistent analysis tables."
]
} }
], ],
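Reviewer sketch for the many-to-1 relationship documented above: a left join from `master_data_centers.geoid` to `ruca_codes_2020_tract.tract_fips_20`. It assumes `geoid` holds an 11-digit tract GEOID; the column `primary_ruca` and all sample values are hypothetical, and SQLite stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ruca_codes_2020_tract (
    tract_fips_20 TEXT PRIMARY KEY,
    primary_ruca INTEGER            -- hypothetical column name
);
CREATE TABLE master_data_centers (
    master_id TEXT PRIMARY KEY,
    geoid TEXT                      -- assumed to be a tract-level GEOID
);
-- Illustrative values only, not real RUCA assignments.
INSERT INTO ruca_codes_2020_tract VALUES ('48029110100', 1), ('48463950100', 7);
INSERT INTO master_data_centers VALUES
    ('dc-1', '48029110100'),
    ('dc-2', '48029110100'),
    ('dc-3', '48463950100'),
    ('dc-4', '99999999999');
""")

# LEFT JOIN keeps every data center; the lookup's primary key guarantees
# that the join cannot fan out (at most one RUCA row per site).
rows = conn.execute("""
    SELECT m.master_id, r.primary_ruca
    FROM master_data_centers AS m
    LEFT JOIN ruca_codes_2020_tract AS r ON r.tract_fips_20 = m.geoid
    ORDER BY m.master_id
""").fetchall()

unmatched = [master_id for master_id, ruca in rows if ruca is None]
non_metro = [master_id for master_id, ruca in rows if ruca is not None and ruca >= 4]
print(rows, unmatched, non_metro)
```

Listing `unmatched` after each reload is a cheap way to spot sites whose `geoid` is missing or not tract-level.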
"metadata": { "metadata": {

View File

@@ -895,21 +895,18 @@
"source": [ "source": [
"## Tables Created by This Notebook and Their Relationships\n", "## Tables Created by This Notebook and Their Relationships\n",
"\n", "\n",
"This notebook creates and/or maintains one primary PostGIS table:\n", "### Tables Created / Maintained\n",
"\n",
"1. `public.data_center_historical_climate`\n", "1. `public.data_center_historical_climate`\n",
"- One row per data center (`master_id`).\n", "- One row per `master_id` with climate summary fields and geometry.\n",
"- Stores climate summary metrics (temperature, humidity, wet-bulb, precipitation variability, cooling-degree-days, wind fields/status), geometry, and lineage timestamps.\n", "- Populated by incremental upsert so reruns refresh existing sites and add new sites.\n",
"- Upserted incrementally so reruns refresh changed rows without duplicating records.\n",
"\n",
"### Relationship Summary\n",
"\n", "\n",
"### Key Relationships\n",
"- `public.master_data_centers (master_id)`\n", "- `public.master_data_centers (master_id)`\n",
" - 1-to-1 (effective) -> `public.data_center_historical_climate (master_id)`\n", " - 1-to-1 (effective) -> `public.data_center_historical_climate (master_id)`\n",
"\n", "\n",
"`public.data_center_historical_climate.master_id` is a foreign key to `public.master_data_centers.master_id` (with cascade delete), so climate rows track the master data-center record set.\n", "### Rerun Notes\n",
"\n", "- Safe to rerun when the master data-center set changes.\n",
"In short: **`master_data_centers` is the entity table, and `data_center_historical_climate` is its one-row-per-site climate feature extension.**" "- Existing rows are updated in place; no duplicate-per-site history table is created by this notebook."
] ]
} }
], ],
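Reviewer sketch of the "incremental upsert" behavior described above. The notebook's actual statement is not shown in this diff, so this only illustrates the general `INSERT ... ON CONFLICT` pattern, with SQLite (3.24 or newer) standing in for PostgreSQL and a hypothetical `mean_temperature_c` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE data_center_historical_climate (
        master_id TEXT PRIMARY KEY,
        mean_temperature_c REAL     -- hypothetical column name
    )
""")

# One row per master_id: a conflicting insert updates the row in place.
UPSERT = """
    INSERT INTO data_center_historical_climate (master_id, mean_temperature_c)
    VALUES (?, ?)
    ON CONFLICT (master_id) DO UPDATE
    SET mean_temperature_c = excluded.mean_temperature_c
"""

# First run loads two sites; the rerun refreshes dc-1 and adds dc-3.
conn.executemany(UPSERT, [("dc-1", 18.2), ("dc-2", 11.5)])
conn.executemany(UPSERT, [("dc-1", 18.4), ("dc-3", 24.0)])

climate_rows = conn.execute(
    "SELECT master_id, mean_temperature_c "
    "FROM data_center_historical_climate ORDER BY master_id"
).fetchall()
print(climate_rows)
```

The rerun leaves three rows, not four, which is the "no duplicate-per-site history" property the rerun notes describe.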

View File

@@ -1184,23 +1184,35 @@
"id": "22", "id": "22",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Tables Created\n", "## Tables Created by This Notebook and Their Relationships\n",
"\n", "\n",
"This notebook creates four PostGIS tables for NOAA HMS smoke exposure analysis. The tables are designed to separate source observations, raw geometries, long-form data-center exposure, and the final per-site summary.\n", "### Tables Created / Maintained\n",
"1. `public.hms_smoke_days`\n",
"- One row per observed HMS product day (daily denominator table).\n",
"\n", "\n",
"| Table | Grain | Purpose |\n", "2. `public.hms_smoke_daily`\n",
"|---|---|---|\n", "- One row per smoke polygon geometry from HMS source products.\n",
"| `public.hms_smoke_days` | One row per observed HMS product day | Denominator table for daily percentages, including days with zero smoke polygons. Stores `smoke_date`, source metadata, and `feature_count`. |\n",
"| `public.hms_smoke_daily` | One row per HMS smoke polygon | Raw smoke plume geometry table. Stores `smoke_date`, satellite/time fields, normalized `density`, `density_rank`, source metadata, and `geom`. |\n",
"| `public.data_center_hms_smoke_dc_day` | One row per `(master_id, smoke_date)` | Long-form daily exposure table for every data center on every observed HMS day. `max_density_rank = 0` means observed no smoke; `1`, `2`, and `3` represent light/unspecified, medium, and heavy smoke exposure. |\n",
"| `public.data_center_hms_smoke_exposure` | One row per `master_id` | Final per-data-center summary table joinable to `public.master_data_centers`. Includes location fields, observation status, smoke-period dates, exposure-day counts, percentage metrics, worst/mean density, and longest streak metrics. |\n",
"\n", "\n",
"Recommended use:\n", "3. `public.data_center_hms_smoke_dc_day`\n",
"- One row per `(master_id, smoke_date)` with daily smoke exposure classification.\n",
"\n", "\n",
"- Use `public.data_center_hms_smoke_exposure` for most site-level analysis and ranking.\n", "4. `public.data_center_hms_smoke_exposure`\n",
"- Use `public.data_center_hms_smoke_dc_day` for time-series analysis, seasonal summaries, or custom thresholds.\n", "- One row per `master_id` with summary smoke-exposure metrics.\n",
"- Use `public.hms_smoke_daily` when you need the original smoke plume geometries for mapping or spatial QA.\n", "\n",
"- Use `public.hms_smoke_days` whenever calculating percentages so no-smoke observed days remain in the denominator." "### Key Relationships\n",
"- `public.hms_smoke_days (smoke_date)`\n",
" - 1-to-many -> `public.hms_smoke_daily (smoke_date)`\n",
"\n",
"- `public.master_data_centers (master_id)`\n",
" - 1-to-many -> `public.data_center_hms_smoke_dc_day (master_id, smoke_date)`\n",
" - 1-to-1 (effective) -> `public.data_center_hms_smoke_exposure (master_id)`\n",
"\n",
"- `public.data_center_hms_smoke_dc_day`\n",
" - many-to-1 summary rollup -> `public.data_center_hms_smoke_exposure`\n",
"\n",
"### Rerun Notes\n",
"- Designed for repeat refreshes as additional HMS days become available.\n",
"- Summary exposure table is recomputed from daily source/bridge tables so results stay consistent after reloads."
] ]
} }
], ],
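Reviewer sketch of the `dc_day` to `exposure` rollup described above, using `hms_smoke_days` as the denominator so that observed no-smoke days still count. The output names `smoke_days`, `pct_smoke_days`, and `worst_density_rank` are hypothetical, the sample rows are made up, and SQLite stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hms_smoke_days (smoke_date TEXT PRIMARY KEY);
CREATE TABLE data_center_hms_smoke_dc_day (
    master_id TEXT NOT NULL,
    smoke_date TEXT NOT NULL,
    max_density_rank INTEGER NOT NULL,   -- 0 = observed, no smoke
    PRIMARY KEY (master_id, smoke_date)
);
INSERT INTO hms_smoke_days VALUES
    ('2024-07-01'), ('2024-07-02'), ('2024-07-03'), ('2024-07-04');
INSERT INTO data_center_hms_smoke_dc_day VALUES
    ('dc-1', '2024-07-01', 0), ('dc-1', '2024-07-02', 2),
    ('dc-1', '2024-07-03', 3), ('dc-1', '2024-07-04', 0),
    ('dc-2', '2024-07-01', 0), ('dc-2', '2024-07-02', 0),
    ('dc-2', '2024-07-03', 0), ('dc-2', '2024-07-04', 0);
""")

# Many dc_day rows collapse to one summary row per master_id.
summary = conn.execute("""
    SELECT d.master_id,
           SUM(d.max_density_rank > 0) AS smoke_days,
           100.0 * SUM(d.max_density_rank > 0)
               / (SELECT COUNT(*) FROM hms_smoke_days) AS pct_smoke_days,
           MAX(d.max_density_rank) AS worst_density_rank
    FROM data_center_hms_smoke_dc_day AS d
    GROUP BY d.master_id
    ORDER BY d.master_id
""").fetchall()
print(summary)
```

Dividing by the count of `hms_smoke_days` rather than by the number of smoke-polygon days is what keeps the percentage from being inflated when smoke-free days produce no polygons.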

View File

@@ -844,21 +844,18 @@
"source": [ "source": [
"## Tables Created by This Notebook and Their Relationships\n", "## Tables Created by This Notebook and Their Relationships\n",
"\n", "\n",
"This notebook creates and/or maintains one primary PostGIS table:\n", "### Tables Created / Maintained\n",
"\n",
"1. `public.data_center_historical_climate`\n", "1. `public.data_center_historical_climate`\n",
"- One row per data center (`master_id`).\n", "- One row per `master_id` with climate summary fields and geometry.\n",
"- Stores climate summary metrics (temperature, humidity, wet-bulb, precipitation variability, cooling-degree-days, wind fields/status), geometry, and lineage timestamps.\n", "- Populated by incremental upsert so reruns refresh existing sites and add new sites.\n",
"- Upserted incrementally so reruns refresh changed rows without duplicating records.\n",
"\n",
"### Relationship Summary\n",
"\n", "\n",
"### Key Relationships\n",
"- `public.master_data_centers (master_id)`\n", "- `public.master_data_centers (master_id)`\n",
" - 1-to-1 (effective) -> `public.data_center_historical_climate (master_id)`\n", " - 1-to-1 (effective) -> `public.data_center_historical_climate (master_id)`\n",
"\n", "\n",
"`public.data_center_historical_climate.master_id` is a foreign key to `public.master_data_centers.master_id` (with cascade delete), so climate rows track the master data-center record set.\n", "### Rerun Notes\n",
"\n", "- Safe to rerun when the master data-center set changes.\n",
"In short: **`master_data_centers` is the entity table, and `data_center_historical_climate` is its one-row-per-site climate feature extension.**" "- Existing rows are updated in place; no duplicate-per-site history table is created by this notebook."
] ]
} }
], ],

View File

@@ -538,6 +538,29 @@
" for row in cur.fetchall():\n", " for row in cur.fetchall():\n",
" print(f'{row[0]}.{row[1]}')" " print(f'{row[0]}.{row[1]}')"
] ]
},
{
"cell_type": "markdown",
"id": "11",
"metadata": {},
"source": [
"## Tables Created by This Notebook and Their Relationships\n",
"\n",
"### Tables Created / Maintained\n",
"1. `TARGET_TABLE` (configured at runtime)\n",
"- Generic loader output table built from the current dataframe schema.\n",
"- Replaced/appended according to `if_exists` behavior.\n",
"- Optional point geometry can be added in helper cells.\n",
"\n",
"### Key Relationships\n",
"- This notebook is table-agnostic: relationships depend on the selected `TARGET_TABLE` and source columns.\n",
"- When key columns (for example `master_id`, `geoid`, IDs, dates) are present, the loaded table can be joined to domain tables.\n",
"- When geometry is present, the loaded table can participate in spatial joins.\n",
"\n",
"### Rerun Notes\n",
"- Safe to rerun for recurring refreshes of different source files.\n",
"- Always confirm `TARGET_TABLE` and `if_exists` before execution to avoid unintended replacement of existing tables."
]
} }
], ],
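Reviewer sketch for the "confirm `TARGET_TABLE` and `if_exists`" note above: a small preflight check that reports what a load would do before it runs. It assumes `if_exists` follows the `fail` / `replace` / `append` values used by `pandas.DataFrame.to_sql`; the helper name `preflight` is hypothetical and SQLite stands in for PostgreSQL.

```python
import sqlite3

# Assumption: same three values that pandas.DataFrame.to_sql accepts.
VALID_IF_EXISTS = {"fail", "replace", "append"}


def preflight(conn, target_table, if_exists):
    """Return the action a load would take: 'create', 'replace', or 'append'."""
    if if_exists not in VALID_IF_EXISTS:
        raise ValueError(f"unsupported if_exists value: {if_exists!r}")
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (target_table,),
    ).fetchone() is not None
    if not exists:
        return "create"
    if if_exists == "fail":
        raise RuntimeError(f"{target_table} already exists; refusing to load")
    return if_exists


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE existing_table (id INTEGER)")

print(preflight(conn, "new_table", "replace"))       # table absent
print(preflight(conn, "existing_table", "append"))   # table present
```

Against PostgreSQL the existence check would query `information_schema.tables` instead of `sqlite_master`.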
"metadata": { "metadata": {

View File

@@ -1676,36 +1676,20 @@
"source": [ "source": [
"## Tables Created by This Notebook and Their Relationships\n", "## Tables Created by This Notebook and Their Relationships\n",
"\n", "\n",
"This notebook creates and/or maintains the following PostGIS/PostgreSQL tables:\n", "### Tables Created / Maintained\n",
"\n",
"1. `public.rdh_precinct_vote_layers`\n", "1. `public.rdh_precinct_vote_layers`\n",
"- One row per RDH precinct-election layer ingested.\n", "- One row per ingested precinct-election layer.\n",
"- Key columns: `layer_id` (PK), `state_code`, `title`, `format`, file/source metadata, `loaded_at`.\n",
"\n", "\n",
"2. `public.rdh_precinct_vote_features`\n", "2. `public.rdh_precinct_vote_features`\n",
"- One row per precinct polygon feature from a loaded layer.\n", "- One row per precinct geometry feature with source properties JSON.\n",
"- Key columns: `feature_id` (PK), `layer_id` (FK), `state_code`, `source_row`, `properties` (JSONB), `geom` (MultiPolygon).\n",
"- Relationship: many features belong to one layer.\n",
"\n", "\n",
"3. `public.data_center_rdh_precinct_vote_matches`\n", "3. `public.data_center_rdh_precinct_vote_matches`\n",
"- Spatial match table linking data centers to precinct features.\n", "- Bridge table linking data centers to matched precinct features.\n",
"- Key columns: `master_id` (FK), `feature_id` (FK), `layer_id` (FK), `state_code`, `join_method`, `match_distance_m`, `matched_at`.\n",
"- Primary key: (`master_id`, `feature_id`).\n",
"- Relationship: many-to-many bridge between data centers and precinct features (with match metadata).\n",
"\n", "\n",
"4. `public.data_center_election_context`\n", "4. `public.data_center_election_context`\n",
"- Final standardized, one-row-per-data-center election context used by downstream mapping/analysis.\n", "- Standardized, one-row-per-data-center election context for downstream analysis/mapping.\n",
"- Key columns: `master_id` (PK, FK), `name`, `city`, `state`, `rdh_layer_title`,\n",
" `precinct_identifier_name`, `election_year`, `office`, `democratic_votes`, `republican_votes`,\n",
" `total_votes`, `turnout_or_vote_share`, `updated_at`.\n",
"- Relationship: one row per `master_id` in `public.master_data_centers` (left-joined so all master rows can be retained, even if election fields are null).\n",
"\n",
"### Relationship Summary\n",
"\n",
"- `public.master_data_centers (master_id)`\n",
" - 1-to-many -> `public.data_center_rdh_precinct_vote_matches (master_id)`\n",
" - 1-to-1 (effective in this notebook) -> `public.data_center_election_context (master_id)`\n",
"\n", "\n",
"### Key Relationships\n",
"- `public.rdh_precinct_vote_layers (layer_id)`\n", "- `public.rdh_precinct_vote_layers (layer_id)`\n",
" - 1-to-many -> `public.rdh_precinct_vote_features (layer_id)`\n", " - 1-to-many -> `public.rdh_precinct_vote_features (layer_id)`\n",
" - 1-to-many -> `public.data_center_rdh_precinct_vote_matches (layer_id)`\n", " - 1-to-many -> `public.data_center_rdh_precinct_vote_matches (layer_id)`\n",
@@ -1713,7 +1697,13 @@
"- `public.rdh_precinct_vote_features (feature_id)`\n", "- `public.rdh_precinct_vote_features (feature_id)`\n",
" - 1-to-many -> `public.data_center_rdh_precinct_vote_matches (feature_id)`\n", " - 1-to-many -> `public.data_center_rdh_precinct_vote_matches (feature_id)`\n",
"\n", "\n",
"In short: **layers -> features -> matches**, then matches are standardized into **one election-context row per data center**." "- `public.master_data_centers (master_id)`\n",
" - 1-to-many -> `public.data_center_rdh_precinct_vote_matches (master_id)`\n",
" - 1-to-1 (effective) -> `public.data_center_election_context (master_id)`\n",
"\n",
"### Rerun Notes\n",
"- Safe to rerun as new RDH layers and/or data centers are added.\n",
"- Reruns refresh matching outputs and regenerate standardized election context rows."
] ]
} }
], ],

View File

@@ -1116,6 +1116,27 @@
"else:\n", "else:\n",
" print('WRITE_BACK_TO_DB is False; no database table was modified.')" " print('WRITE_BACK_TO_DB is False; no database table was modified.')"
] ]
},
{
"cell_type": "markdown",
"id": "32",
"metadata": {},
"source": [
"## Tables Created by This Notebook and Their Relationships\n",
"\n",
"### Tables Created / Maintained\n",
"1. `public.master_data_center_spatial_clusters` (optional write)\n",
"- One row per `master_id` with cluster label and clustering metadata.\n",
"- Written only when `WRITE_BACK_TO_DB = True`.\n",
"\n",
"### Key Relationships\n",
"- `public.master_data_centers (master_id)`\n",
" - 1-to-1 (effective) -> `public.master_data_center_spatial_clusters (master_id)`\n",
"\n",
"### Rerun Notes\n",
"- Default behavior (`WRITE_BACK_TO_DB = False`) performs no table writes.\n",
"- With write-back enabled, reruns replace cluster assignments using the current parameters/data."
]
} }
], ],
"metadata": { "metadata": {

View File

@@ -677,134 +677,32 @@
"id": "16", "id": "16",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Tables Created\n", "## Tables Created by This Notebook and Their Relationships\n",
"\n", "\n",
"This notebook builds three tables in the `public` schema, all keyed (directly or transitively) to `master_data_centers.master_id`.\n", "### Tables Created / Maintained\n",
"1. `public.usdm_drought_weekly`\n",
"- Weekly USDM drought polygons by `week_date` and drought category.\n",
"\n", "\n",
"---\n", "2. `public.data_center_usdm_drought_dc_week`\n",
"- One row per `(master_id, week_date)` with weekly worst drought category at each data center.\n",
"\n", "\n",
"### 1. `public.usdm_drought_weekly`\n", "3. `public.data_center_usdm_drought_exposure`\n",
"- One row per `master_id` with summary drought-exposure metrics and streak fields.\n",
"\n", "\n",
"Raw weekly USDM drought polygons — one row per `(week_date, dm_category)` (occasionally multiple rows for early-USDM weeks that published per-category fragments). Source of truth for any later spatial query against the drought record.\n", "### Key Relationships\n",
"- `public.usdm_drought_weekly (week_date, dm_category, geom)`\n",
" - spatial/time source for -> `public.data_center_usdm_drought_dc_week`\n",
"\n", "\n",
"| Column | Type | Meaning |\n", "- `public.master_data_centers (master_id)`\n",
"|---|---|---|\n", " - 1-to-many -> `public.data_center_usdm_drought_dc_week (master_id, week_date)`\n",
"| `id` | `bigserial` PK | Surrogate row id |\n", " - 1-to-1 (effective) -> `public.data_center_usdm_drought_exposure (master_id)`\n",
"| `week_date` | `date` | Tuesday-of-publication date parsed from filename (`USDM_YYYYMMDD_M.zip`) |\n",
"| `dm_category` | `smallint` | 0=D0 Abnormally Dry, 1=D1 Moderate, 2=D2 Severe, 3=D3 Extreme, 4=D4 Exceptional. **Cumulative** — D4 polygon is inside D3 inside D2… |\n",
"| `objectid`, `shape_leng`, `shape_area` | | Original shapefile attributes |\n",
"| `geom` | `geometry(MultiPolygon, 4326)` | Drought-affected area for that category that week |\n",
"\n", "\n",
"**Indexes:** GIST on `geom`, btree on `week_date`.\n", "- `public.data_center_usdm_drought_dc_week`\n",
" - many-to-1 summary rollup -> `public.data_center_usdm_drought_exposure`\n",
"\n", "\n",
"**Size:** ~12,000 polygon rows across 1,356 weeks (Jan 2000 to mid-2025).\n",
"\n", "- Supports repeat runs when new USDM weeks or new data centers are added.\n",
"**Example uses:**\n", "- Weekly table can be reloaded and the downstream `dc_week` + `exposure` tables can be recomputed from that source."
"```sql\n",
"-- Map of D3+ drought in August 2022\n",
"SELECT week_date, dm_category, geom\n",
"FROM usdm_drought_weekly\n",
"WHERE week_date = '2022-08-30' AND dm_category >= 3;\n",
"\n",
"-- Worst week ever for a specific lat/lon\n",
"SELECT week_date, MAX(dm_category) AS worst_dm\n",
"FROM usdm_drought_weekly\n",
"WHERE ST_Within(ST_SetSRID(ST_MakePoint(-98.5, 29.5), 4326), geom)\n",
"GROUP BY week_date ORDER BY worst_dm DESC, week_date LIMIT 10;\n",
"```\n",
"\n",
"---\n",
"\n",
"### 2. `public.data_center_usdm_drought_dc_week`\n",
"\n",
"Long-form per-(DC, week) intermediate. One row per data center per USDM week observed; useful for time-series and streak analysis. Computed from `usdm_drought_weekly` via spatial join, then back-filled so every covered DC has a row for every week.\n",
"\n",
"| Column | Type | Meaning |\n",
"|---|---|---|\n",
"| `master_id` | `text` PK (composite) | FK → `master_data_centers.master_id` |\n",
"| `week_date` | `date` PK (composite) | USDM week |\n",
"| `worst_dm` | `smallint` | Max `dm_category` whose polygon contained the DC point that week. **`-1` means observed week but no drought polygon contained the DC** (filter `worst_dm >= 0` for actual drought weeks) |\n",
"\n",
"**Indexes:** PK on `(master_id, week_date)`, btree on `week_date`, btree on `worst_dm`.\n",
"\n",
"**Size:** ~2.5 M rows (1,833 DCs × 1,356 weeks, minus DCs not covered by USDM).\n",
"\n",
"**Example uses:**\n",
"```sql\n",
"-- Drought timeline for one DC\n",
"SELECT week_date, worst_dm\n",
"FROM data_center_usdm_drought_dc_week\n",
"WHERE master_id = 'curated/1010260676' AND worst_dm >= 0\n",
"ORDER BY week_date;\n",
"\n",
"-- DCs that were in D4 during a specific week\n",
"SELECT master_id FROM data_center_usdm_drought_dc_week\n",
"WHERE week_date = '2012-07-24' AND worst_dm = 4;\n",
"```\n",
"\n",
"If you only need the per-DC summary, this table can be dropped — it's regenerable from `usdm_drought_weekly` + `master_data_centers`.\n",
"\n",
"---\n",
"\n",
"### 3. `public.data_center_usdm_drought_exposure`\n",
"\n",
"Per-DC drought-exposure summary keyed by `master_id`. The analytical surface — one row per data center with all the headline metrics. Joinable directly to `master_data_centers` and `data_center_historical_climate`.\n",
"\n",
"| Column | Type | Meaning |\n",
"|---|---|---|\n",
"| `master_id` | `text` PK | FK → `master_data_centers.master_id` |\n",
"| `source`, `name`, `operator`, `city`, `state`, `country`, `longitude`, `latitude`, `geom` | | Identity columns, denormalized from master for convenience |\n",
"| `usdm_status` | `text` | `'covered'` (USDM zone) or `'no_coverage'` (outside USDM extent) |\n",
"| `drought_period_start`, `drought_period_end` | `date` | First / last USDM week observed for this DC |\n",
"| `weeks_observed` | `int` | Total weekly observations |\n",
"| `weeks_in_d0_or_worse` … `weeks_in_d4` | `int` | Cumulative weekly counts at each severity threshold |\n",
"| `pct_weeks_in_d0_or_worse` … `pct_weeks_in_d4` | `double` | Same as ratios over `weeks_observed` |\n",
"| `worst_dm_category` | `smallint` | Max DM ever experienced (0-4) |\n",
"| `mean_dm_category` | `double` | Average DM across all weeks, treating no-drought (`-1`) as 0 |\n",
"| `longest_d0_streak_weeks` | `int` | Longest consecutive run with any drought (D0+) |\n",
"| `longest_d2_streak_weeks` | `int` | Longest consecutive run with severe drought (D2+) — **the headline streak metric** |\n",
"| `longest_d3_streak_weeks` | `int` | Longest consecutive run with extreme drought (D3+) |\n",
"| `fetched_at`, `updated_at` | `timestamptz` | Provenance |\n",
"\n",
"**Indexes:** GIST on `geom`, btree on `state`, btree on `worst_dm_category`.\n",
"\n",
"**Size:** 1,833 rows (one per master DC; PR sites flagged `no_coverage` if applicable).\n",
"\n",
"**Headline metric for site-selection analysis:** `pct_weeks_in_d2_or_worse`. D2 = \"Severe Drought\" is the threshold at which water-use restrictions typically kick in for utilities and municipalities.\n",
"\n",
"**Example: joined climate + drought view for cooling-water risk analysis**\n",
"```sql\n",
"SELECT\n",
" c.master_id, c.name, c.state,\n",
" c.cooling_degree_days_c, -- baseline cooling load\n",
" c.mean_wet_bulb_temperature_c, -- evaporative-cooling efficiency\n",
" d.pct_weeks_in_d2_or_worse * 100 AS pct_severe_drought,\n",
" d.longest_d2_streak_weeks,\n",
" d.worst_dm_category\n",
"FROM data_center_historical_climate c\n",
"JOIN data_center_usdm_drought_exposure d USING (master_id)\n",
"WHERE d.usdm_status = 'covered'\n",
"ORDER BY (c.cooling_degree_days_c * d.pct_weeks_in_d2_or_worse) DESC\n",
"LIMIT 25;\n",
"```\n",
"\n",
"---\n",
"\n",
"### Relationship diagram\n",
"\n",
"```\n",
"master_data_centers (master_id PK)\n",
" │\n",
" ├── data_center_historical_climate (master_id PK) ← from open_meteo/Daymet/gridMET notebook\n",
" │\n",
" └── data_center_usdm_drought_exposure (master_id PK) ← this notebook\n",
" │\n",
" └── data_center_usdm_drought_dc_week (master_id, week_date)\n",
" │\n",
" └── usdm_drought_weekly (id PK, week_date, dm_category, geom)\n",
"```\n",
"\n",
"All three USDM tables are regenerable from the zip files in `USDM Shape Files/`. `RELOAD_WEEKLY=True` rebuilds from scratch; `RECOMPUTE_SUMMARY=True` (default) recomputes the dc-week + exposure tables from whatever's in `usdm_drought_weekly`.\n"
] ]
} }
], ],
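Reviewer sketch of how the streak fields in `data_center_usdm_drought_exposure` could be derived from one data center's `worst_dm` series in `dc_week`. This is an illustration, not the notebook's implementation; it assumes the series is ordered by `week_date` with no missing weeks and that `-1` marks an observed week with no drought. The helper name `longest_streak` is hypothetical.

```python
from itertools import groupby


def longest_streak(worst_dm_by_week, threshold):
    """Longest run of consecutive weeks with worst_dm >= threshold.

    Expects one value per week, ordered by week_date, with no gaps.
    """
    runs = (
        sum(1 for _ in group)
        for hit, group in groupby(worst_dm_by_week, key=lambda v: v >= threshold)
        if hit
    )
    return max(runs, default=0)


# Made-up weekly series for one data center (-1 = no drought that week).
weeks = [-1, 0, 1, 2, 2, 3, 1, -1, 2, 2]

d0_streak = longest_streak(weeks, 0)  # any drought (D0+)
d2_streak = longest_streak(weeks, 2)  # severe or worse (D2+)
d3_streak = longest_streak(weeks, 3)  # extreme or worse (D3+)
print(d0_streak, d2_streak, d3_streak)
```

Because the function only needs the ordered `worst_dm` values, the summary table can be recomputed from `dc_week` alone after a reload.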