STA9750 Submission Material
  • About Me
  • NYC Payroll Data Analysis
  • Grand Transit Awards: GTA IV Edition
  • Ultimate Playlist: Hustle and Heart
  • MP04: County-Level U.S. Election Analysis (2020 vs 2024)

On this page

  • The Great American Realignment
    • A County-Level Breakdown of the 2024 Political Shift 🇺🇸📉📈
    • 🎨 Styling Setup for U.S. Flag Theme and loading all the required libraries
    • 🗺️ Task 1: Getting the Map Right
    • 🗳️ Task 2: Scraping & Cleaning 2024 Election Results
    • 🗳️ Task 3: Scraping & Standardizing 2020 Election Results
    • 📊 Task 4: Combining and Exploring County-Level Election Data
    • 📊 Task 4: Summary Insights
    • 🗳️ Task 5: Mapping the Political Shift (2020 → 2024)
    • 🎯 Task 6: Battleground Shifts — Animated Insights from the County Frontlines
    • 🧱 6A: The Red Shift
    • 🎓 6B: The College Town Collapse
    • 🏙️ 6C: Urban Flipbook — Trump Gains in the Giants
    • 🧨 Task 7: Final Reflection
      • The Map Has Changed — Forever.

MP04: County-Level U.S. Election Analysis (2020 vs 2024)

Author

Dhruv Sharma

The Great American Realignment

A County-Level Breakdown of the 2024 Political Shift 🇺🇸📉📈

Welcome to our data-driven deep dive into what may be the most seismic political shift of the decade.

The 2024 presidential election didn’t just redraw the map — it rewrote the playbook. Using detailed county-level results from 2020 and 2024, this report traces the unexpected turns in America’s political landscape:

🟥 Red counties got redder — and they weren’t always rural.
🟦 Blue strongholds wobbled, especially in places no one expected.
📉 Some states swung hard, while others held the line.

We processed thousands of county results, ran rigorous statistical tests, and visualized every twist in this electoral drama. Whether you’re here to celebrate the momentum or challenge the narrative — the charts don’t lie.

Let’s unpack the data behind the divide.

🎨 Styling Setup for U.S. Flag Theme and loading all the required libraries

This chunk defines the custom U.S. flag-inspired themes for all ggplot2 visualizations and kableExtra tables across the project.

Code
# Install and load required packages
required_packages <- c(
  "tidyverse", "sf", "rvest", "httr2", "janitor", "lubridate", "kableExtra", 
  "ggplot2", "infer", "scales", "tigris", "gganimate"
)

for (pkg in required_packages) {
  if (!require(pkg, character.only = TRUE)) {
    install.packages(pkg)
    library(pkg, character.only = TRUE)
  }
}

# Set options
options(scipen = 999, digits = 3)
theme_set(theme_minimal())

# 🎨 Define US Flag Theme for ggplot
theme_us_flag <- function() {
  theme_minimal(base_size = 12) +
    theme(
      panel.background = element_rect(fill = "#FFFFFF", color = NA),
      plot.background = element_rect(fill = "#FFFFFF", color = NA),
      plot.title = element_text(face = "bold", size = 16, hjust = 0.5, color = "#002868"),
      plot.subtitle = element_text(size = 12, hjust = 0.5, color = "#BF0A30"),
      axis.title = element_text(color = "#002868", face = "bold"),
      axis.text = element_text(color = "#002868"),
      legend.position = "top",
      legend.title = element_blank(),
      strip.text = element_text(face = "bold", color = "#BF0A30")
    )
}

# 🇺🇸 Define US-themed table style
us_table_style <- function(df, caption = NULL) {
  df %>%
    kbl(caption = caption, align = "c", escape = FALSE) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), 
                  full_width = FALSE, font_size = 13) %>%
    row_spec(0, bold = TRUE, color = "white", background = "#002868") %>%
    column_spec(1, bold = TRUE, color = "black") %>%
    scroll_box(width = "100%")
}

# 🔢 Percent formatter helper
format_percent <- function(x, digits = 1) {
  paste0(formatC(100 * x, format = "f", digits = digits), "%")
}

🗺️ Task 1: Getting the Map Right

Before we can paint a picture of America’s political realignment, we need the canvas: a shapefile of U.S. counties. We’ll use the U.S. Census Bureau’s TIGER/Line shapefiles for 2024. To ensure flexibility, our code automatically falls back to lower-resolution files if the most detailed version fails.

This step sets up our geographic base for all future mapping, statistical overlays, and visual storytelling.

Code
# 📁 Create Local Data Directory
data_dir <- "data/mp04"
if (!dir.exists(data_dir)) {
  dir.create(data_dir, recursive = TRUE)
  message("✅ Created data directory: ", data_dir)
} else {
  message("📂 Using existing data directory: ", data_dir)
}

# 🌐 Set Up TIGER/Line Shapefile URL Base
base_url <- "https://www2.census.gov/geo/tiger/GENZ2024/shp/"
resolutions <- c("500k", "5m", "20m")  # Ordered by detail: High → Low
resolution_index <- 1  # Start with most detailed

# ⬇️ Attempt to Download County Shapefile
success <- FALSE
while (!success && resolution_index <= length(resolutions)) {
  current_resolution <- resolutions[resolution_index]
  filename <- paste0("cb_2024_us_county_", current_resolution, ".zip")
  local_file <- file.path(data_dir, filename)
  url <- paste0(base_url, filename)
  
  if (file.exists(local_file)) {
    message("📦 Shapefile already exists locally: ", local_file)
    success <- TRUE
  } else {
    message("🌍 Attempting download: ", url)
    
    download_result <- tryCatch({
      download.file(url, local_file, mode = "wb")
      TRUE
    }, error = function(e) {
      message("❌ Download failed: ", e$message)
      FALSE
    })
    
    if (download_result) {
      message("✅ Download complete: ", local_file)
      unzip(local_file, exdir = file.path(data_dir, paste0("county_", current_resolution)))
      message("🗂️ Extracted to: ", file.path(data_dir, paste0("county_", current_resolution)))
      success <- TRUE
    } else {
      resolution_index <- resolution_index + 1
      if (resolution_index <= length(resolutions)) {
        message("🔄 Trying lower resolution: ", resolutions[resolution_index])
      } else {
        message("🚫 All resolutions failed to download.")
      }
    }
  }
}

🗳️ Task 2: Scraping & Cleaning 2024 Election Results

The drama of election night? We scraped it. Using rvest and httr2, we pulled county-level 2024 presidential results directly from Wikipedia for all 50 U.S. states.

We tackled inconsistent tables, ambiguous headers, and wild formats to standardize everything into a clean dataset of votes and percentages for Trump, Harris, and Others.

Code
# Function to fetch election data from Wikipedia
get_election_results <- function(state) {
  # Special case for Alaska
  if(state == "Alaska") {
    url <- "https://en.wikipedia.org/wiki/2024_United_States_presidential_election_in_Alaska"
  } else {
    # Format state name for URL
    state_formatted <- str_replace_all(state, "\\s", "_")
    url <- paste0("https://en.wikipedia.org/wiki/2024_United_States_presidential_election_in_", state_formatted)
  }
  
  # Create directory for storing data
  dir_name <- file.path("data", "election2024")
  file_name <- file.path(dir_name, paste0(gsub("\\s", "_", state), ".html"))
  dir.create(dir_name, showWarnings = FALSE, recursive = TRUE)
  
  # Download data if not cached
  if (!file.exists(file_name)) {
    tryCatch({
      RESPONSE <- req_perform(request(url))
      writeLines(resp_body_string(RESPONSE), file_name)
    }, error = function(e) {
      warning(paste("Error fetching data for", state, ":", e$message))
      return(NULL)
    })
  }
  
  # Exit if file doesn't exist
  if (!file.exists(file_name)) return(NULL)
  
  # Parse HTML
  page <- tryCatch(read_html(file_name), error = function(e) NULL)
  if (is.null(page)) return(NULL)
  
  # Extract tables
  tables <- tryCatch(page |> html_elements("table.wikitable") |> 
                       html_table(na.strings = c("", "N/A", "—")), 
                     error = function(e) list())
  
  if (length(tables) == 0) return(NULL)
  
  # Find county results table
  county_table <- NULL
  
  # Look for county column names
  for (i in seq_along(tables)) {
    if (ncol(tables[[i]]) < 3) next
    
    col_names <- colnames(tables[[i]])
    if (is.null(col_names) || any(is.na(col_names))) next
    
    # Look for county identifiers in column names
    if (any(str_detect(col_names, regex("County|Parish|Borough|Census Area|Municipality", ignore_case = TRUE)))) {
      county_table <- tables[[i]]
      break
    }
  }
  
  # Check for county values in first column
  if (is.null(county_table)) {
    for (i in seq_along(tables)) {
      if (ncol(tables[[i]]) < 3 || nrow(tables[[i]]) == 0 || is.null(tables[[i]][[1]])) next
      
      first_col <- tables[[i]][[1]]
      first_col_clean <- first_col[!is.na(first_col)]
      
      if (length(first_col_clean) > 0 && 
          any(str_detect(as.character(first_col_clean), 
                         regex("County|Parish|Borough|Census Area", ignore_case = TRUE)))) {
        county_table <- tables[[i]]
        break
      }
    }
  }
  
  # Look for candidate names
  if (is.null(county_table)) {
    for (i in seq_along(tables)) {
      if (ncol(tables[[i]]) < 3) next
      
      # Check column names
      col_names <- colnames(tables[[i]])
      if (!is.null(col_names) && !any(is.na(col_names)) &&
          any(str_detect(col_names, regex("Trump|Harris|Republican|Democrat", ignore_case = TRUE)))) {
        county_table <- tables[[i]]
        break
      }
    }
  }
  
  # Last resort - largest table
  if (is.null(county_table) && length(tables) > 0) {
    valid_tables <- tables[sapply(tables, function(t) ncol(t) >= 3 && nrow(t) >= 3)]
    if (length(valid_tables) > 0) {
      county_table <- valid_tables[[which.max(sapply(valid_tables, nrow))]]
    }
  }
  
  if (is.null(county_table)) return(NULL)
  
  # Format table
  result <- tryCatch({
    # Find county column
    county_col <- which(str_detect(colnames(county_table), 
                                   regex("County|Parish|Borough|Census Area|Municipality|District", ignore_case = TRUE)))
    county_col <- if(length(county_col) > 0) county_col[1] else 1
    
    result <- county_table
    names(result)[county_col] <- "County"
    result$State <- state
    
    return(result)
  }, error = function(e) NULL)
  
  return(result)
}

# Function to standardize election data
standardize_election_data <- function(df, state) {
  if (is.null(df) || nrow(df) == 0) return(NULL)
  
  # Extract numeric values from string
  extract_numeric <- function(values) {
    if (is.null(values)) return(rep(NA, nrow(df)))
    chars <- as.character(values)
    chars <- gsub(",|%|\\s", "", chars)
    suppressWarnings(as.numeric(chars))
  }
  
  # Find candidate columns
  find_candidate_columns <- function(candidate, df_names) {
    cols <- which(str_detect(df_names, regex(candidate, ignore_case = TRUE)))
    if (length(cols) >= 2) {
      vote_col <- NULL
      pct_col <- NULL
      
      for (col in cols) {
        col_name <- df_names[col]
        if (str_detect(col_name, regex("%|percent", ignore_case = TRUE))) {
          pct_col <- col
        } else if (str_detect(col_name, regex("votes|#", ignore_case = TRUE))) {
          vote_col <- col
        }
      }
      
      if (is.null(vote_col) && length(cols) >= 1) vote_col <- cols[1]
      if (is.null(pct_col) && length(cols) >= 2) pct_col <- cols[2]
      
      return(list(vote_col = vote_col, pct_col = pct_col))
    } else if (length(cols) == 1) {
      return(list(vote_col = cols[1], pct_col = NULL))
    } else {
      return(list(vote_col = NULL, pct_col = NULL))
    }
  }
  
  # Ensure County column
  if (!"County" %in% names(df)) {
    county_col <- which(str_detect(names(df), 
                                   regex("County|Parish|Borough|Census Area|Municipality|District|City", ignore_case = TRUE)))
    if (length(county_col) > 0) {
      names(df)[county_col[1]] <- "County"
    } else {
      names(df)[1] <- "County"
    }
  }
  
  # Find candidate and total columns
  trump_cols <- find_candidate_columns("Trump|Republican", names(df))
  harris_cols <- find_candidate_columns("Harris|Democratic|Democrat", names(df))
  other_cols <- find_candidate_columns("Other|Independent|Third", names(df))
  total_col <- which(str_detect(names(df), regex("Total|Sum|Cast", ignore_case = TRUE)))
  total_col <- if (length(total_col) > 0) total_col[length(total_col)] else NULL
  
  # Create standardized dataframe
  result <- data.frame(
    County = df$County,
    State = state,
    Trump_Votes = if (!is.null(trump_cols$vote_col)) extract_numeric(df[[trump_cols$vote_col]]) else NA,
    Trump_Percent = if (!is.null(trump_cols$pct_col)) extract_numeric(df[[trump_cols$pct_col]]) else NA,
    Harris_Votes = if (!is.null(harris_cols$vote_col)) extract_numeric(df[[harris_cols$vote_col]]) else NA,
    Harris_Percent = if (!is.null(harris_cols$pct_col)) extract_numeric(df[[harris_cols$pct_col]]) else NA,
    Other_Votes = if (!is.null(other_cols$vote_col)) extract_numeric(df[[other_cols$vote_col]]) else NA,
    Other_Percent = if (!is.null(other_cols$pct_col)) extract_numeric(df[[other_cols$pct_col]]) else NA,
    Total_Votes = if (!is.null(total_col)) extract_numeric(df[[total_col]]) else 
      rowSums(cbind(
        if (!is.null(trump_cols$vote_col)) extract_numeric(df[[trump_cols$vote_col]]) else 0,
        if (!is.null(harris_cols$vote_col)) extract_numeric(df[[harris_cols$vote_col]]) else 0,
        if (!is.null(other_cols$vote_col)) extract_numeric(df[[other_cols$vote_col]]) else 0
      ), na.rm = TRUE),
    stringsAsFactors = FALSE
  )
  
  return(result)
}

# Process all states
process_election_data <- function() {
  states <- state.name
  all_data <- list()
  
  for (state in states) {
    
    raw_data <- get_election_results(state)
    
    if (!is.null(raw_data)) {
      std_data <- standardize_election_data(raw_data, state)
      
      if (!is.null(std_data) && nrow(std_data) > 0) {
        all_data[[state]] <- std_data
      }
    }
  }
  
  # Combine all data
  combined_data <- do.call(rbind, all_data)
  
  # Clean data - remove problematic rows
  clean_data <- combined_data %>%
    filter(
      !is.na(Trump_Votes) & !is.na(Harris_Votes) & 
        !str_detect(County, regex("^County$|^County\\[|^Total", ignore_case = TRUE))
    ) %>%
    mutate(County = gsub("\\[\\d+\\]", "", County),
           County = trimws(County))
  
  # Save results
  write.csv(clean_data, "data/election_results_2024.csv", row.names = FALSE)
  
  # Create summary by state
  state_summary <- clean_data %>%
    group_by(State) %>%
    summarize(
      Counties = n(),
      Trump_Total = sum(Trump_Votes, na.rm = TRUE),
      Harris_Total = sum(Harris_Votes, na.rm = TRUE),
      Other_Total = sum(Other_Votes, na.rm = TRUE),
      Total_Votes = sum(Total_Votes, na.rm = TRUE),
      Trump_Pct = Trump_Total / Total_Votes * 100,
      Harris_Pct = Harris_Total / Total_Votes * 100
    ) %>%
    arrange(desc(Total_Votes))
  
  write.csv(state_summary, "data/election_results_2024_summary.csv", row.names = FALSE)
  
  return(state_summary)
}

# Run the process and display results
election_summary <- process_election_data()

# Format the percentages for better display
election_table <- election_summary %>%
  mutate(
    Trump_Pct = sprintf("%.1f%%", Trump_Pct),
    Harris_Pct = sprintf("%.1f%%", Harris_Pct),
    Winner = ifelse(Trump_Total > Harris_Total, "Trump", "Harris"),
    Margin = paste0(
      ifelse(Trump_Total > Harris_Total, Trump_Pct, Harris_Pct), " - ",
      ifelse(Trump_Total > Harris_Total, Harris_Pct, Trump_Pct)
    )
  ) %>%
  select(State, Counties, Total_Votes, Winner, Margin, Trump_Pct, Harris_Pct)

# Read and display the 2024 state-level summary
election_2024_summary <- read.csv("data/election_results_2024_summary.csv")
# 🇺🇸 Display final styled results table with U.S. flag theme
us_table_style(
  df = election_table,
  caption = "🗳️ 2024 U.S. Presidential Election Results by State"
)
🗳️ 2024 U.S. Presidential Election Results by State
State Counties Total_Votes Winner Margin Trump_Pct Harris_Pct
California 58 15871260 Harris 58.4% - 38.3% 38.3% 58.4%
Texas 254 11406186 Trump 56.1% - 42.4% 56.1% 42.4%
Florida 67 10935465 Trump 55.9% - 42.8% 55.9% 42.8%
New York 62 8300211 Harris 55.7% - 43.1% 43.1% 55.7%
Pennsylvania 67 7058269 Trump 50.2% - 48.5% 50.2% 48.5%
Ohio 88 5799829 Trump 54.8% - 43.7% 54.8% 43.7%
North Carolina 100 5699141 Trump 50.9% - 47.6% 50.9% 47.6%
Michigan 83 5674485 Trump 49.6% - 48.2% 49.6% 48.2%
Illinois 102 5652103 Harris 54.2% - 43.3% 43.3% 54.2%
Georgia 159 5270783 Trump 50.5% - 48.3% 50.5% 48.3%
Virginia 133 4505941 Harris 51.8% - 46.1% 46.1% 51.8%
New Jersey 21 4287740 Harris 51.8% - 45.9% 45.9% 51.8%
Massachusetts 14 3473668 Harris 61.2% - 36.0% 36.0% 61.2%
Wisconsin 72 3422918 Trump 49.6% - 48.7% 49.6% 48.7%
Arizona 15 3400726 Trump 52.1% - 46.5% 52.1% 46.5%
Minnesota 87 3253920 Harris 50.9% - 46.7% 46.7% 50.9%
Colorado 64 3192745 Harris 54.1% - 43.1% 43.1% 54.1%
Tennessee 95 3063942 Trump 64.2% - 34.5% 64.2% 34.5%
Maryland 24 3038334 Harris 62.6% - 34.1% 34.1% 62.6%
Missouri 115 3003967 Trump 58.3% - 40.0% 58.3% 40.0%
Indiana 92 2944336 Trump 58.4% - 39.5% 58.4% 39.5%
South Carolina 46 2548140 Trump 58.2% - 40.4% 58.2% 40.4%
Alabama 67 2265090 Trump 64.6% - 34.1% 64.6% 34.1%
Oregon 36 2244493 Harris 55.3% - 41.0% 41.0% 55.3%
Kentucky 120 2076806 Trump 64.4% - 33.9% 64.4% 33.9%
Louisiana 64 2006975 Trump 60.2% - 38.2% 60.2% 38.2%
Connecticut 8 1759010 Harris 56.4% - 41.9% 41.9% 56.4%
Iowa 99 1663506 Trump 55.7% - 42.5% 55.7% 42.5%
Oklahoma 77 1566173 Trump 66.2% - 31.9% 66.2% 31.9%
Utah 29 1488494 Trump 59.4% - 37.8% 59.4% 37.8%
Nevada 17 1484840 Trump 50.6% - 47.5% 50.6% 47.5%
Kansas 105 1335345 Trump 56.8% - 40.8% 56.8% 40.8%
Mississippi 82 1229255 Trump 60.8% - 38.0% 60.8% 38.0%
Arkansas 75 1182676 Trump 64.2% - 33.6% 64.2% 33.6%
Nebraska 93 952182 Trump 59.3% - 38.9% 59.3% 38.9%
New Mexico 33 923403 Harris 51.9% - 45.9% 45.9% 51.9%
Idaho 44 905057 Trump 66.9% - 30.4% 66.9% 30.4%
New Hampshire 10 826189 Harris 50.7% - 47.9% 47.9% 50.7%
Maine 16 824420 Harris 52.2% - 45.7% 45.7% 52.2%
West Virginia 55 763679 Trump 69.9% - 28.1% 69.9% 28.1%
Montana 56 604181 Trump 58.3% - 38.4% 58.3% 38.4%
Hawaii 5 516719 Harris 60.6% - 37.5% 37.5% 60.6%
Delaware 3 512912 Harris 56.5% - 41.8% 41.8% 56.5%
Rhode Island 5 511816 Harris 55.4% - 41.9% 41.9% 55.4%
South Dakota 66 428922 Trump 63.4% - 34.2% 63.4% 34.2%
Vermont 14 369422 Harris 63.8% - 32.3% 32.3% 63.8%
North Dakota 53 367714 Trump 67.0% - 30.5% 67.0% 30.5%
Wyoming 23 269048 Trump 71.6% - 25.8% 71.6% 25.8%
Alaska 3 300 Trump 54.0% - 44.7% 54.0% 44.7%

🗳️ Task 3: Scraping & Standardizing 2020 Election Results

With our 2024 data in hand, we now turn the clock back to 2020 to build a comparative baseline. This section scrapes county-level results for all 50 states from Wikipedia, standardizes them, and prepares summary tables for analysis. Let’s see how the Trump-Biden race unfolded on a granular level.

Code
if (!require("rvest")) {
  install.packages("rvest")
  library(rvest)
}
if (!require("httr2")) {
  install.packages("httr2")
  library(httr2)
}

# Function to fetch 2020 election data from Wikipedia
get_2020_election_results <- function(state) {
  # Format state name for URL
  state_formatted <- str_replace_all(state, "\\s", "_")
  url <- paste0("https://en.wikipedia.org/wiki/2020_United_States_presidential_election_in_", state_formatted)
  
  # Create directory for storing data
  dir_name <- file.path("data", "election2020")
  file_name <- file.path(dir_name, paste0(gsub("\\s", "_", state), ".html"))
  dir.create(dir_name, showWarnings = FALSE, recursive = TRUE)
  
  # Download data if not cached
  if (!file.exists(file_name)) {
    tryCatch({
      RESPONSE <- req_perform(request(url))
      writeLines(resp_body_string(RESPONSE), file_name)
     
    }, error = function(e) {
      warning(paste("Error fetching 2020 data for", state, ":", e$message))
      return(NULL)
    })
  } else {
    
  }
  
  # Exit if file doesn't exist
  if (!file.exists(file_name)) return(NULL)
  
  # Parse HTML
  page <- tryCatch(read_html(file_name), error = function(e) NULL)
  if (is.null(page)) return(NULL)
  
  # Extract tables
  tables <- tryCatch(page |> html_elements("table.wikitable") |> 
                       html_table(na.strings = c("", "N/A", "—")), 
                     error = function(e) list())
  
  if (length(tables) == 0) return(NULL)
  
  # Find county results table
  county_table <- NULL
  
  # Look for county column names
  for (i in seq_along(tables)) {
    if (ncol(tables[[i]]) < 3) next
    
    col_names <- colnames(tables[[i]])
    if (is.null(col_names) || any(is.na(col_names))) next
    
    # Look for county identifiers in column names
    if (any(str_detect(col_names, regex("County|Parish|Borough|Census Area|Municipality", ignore_case = TRUE)))) {
      county_table <- tables[[i]]
      break
    }
  }
  
  # Check for county values in first column
  if (is.null(county_table)) {
    for (i in seq_along(tables)) {
      if (ncol(tables[[i]]) < 3 || nrow(tables[[i]]) == 0 || is.null(tables[[i]][[1]])) next
      
      first_col <- tables[[i]][[1]]
      first_col_clean <- first_col[!is.na(first_col)]
      
      if (length(first_col_clean) > 0 && 
          any(str_detect(as.character(first_col_clean), 
                         regex("County|Parish|Borough|Census Area", ignore_case = TRUE)))) {
        county_table <- tables[[i]]
        break
      }
    }
  }
  
  # Look for candidate names for 2020 election (Trump vs Biden)
  if (is.null(county_table)) {
    for (i in seq_along(tables)) {
      if (ncol(tables[[i]]) < 3) next
      
      # Check column names
      col_names <- colnames(tables[[i]])
      if (!is.null(col_names) && !any(is.na(col_names)) &&
          any(str_detect(col_names, regex("Trump|Biden|Republican|Democrat", ignore_case = TRUE)))) {
        county_table <- tables[[i]]
        break
      }
      
      # Check first few rows for candidates
      if (nrow(tables[[i]]) > 2) {
        first_rows_char <- lapply(tables[[i]][1:min(5, nrow(tables[[i]])),], function(x) {
          ifelse(is.na(x), NA_character_, as.character(x))
        })
        
        found_candidates <- FALSE
        for (j in 1:length(first_rows_char)) {
          col_values <- first_rows_char[[j]]
          col_values <- col_values[!is.na(col_values)]
          
          if (length(col_values) > 0 &&
              any(str_detect(col_values, regex("Trump|Republican", ignore_case = TRUE))) && 
              any(str_detect(col_values, regex("Biden|Democratic|Democrat", ignore_case = TRUE)))) {
            county_table <- tables[[i]]
            found_candidates <- TRUE
            break
          }
        }
        if (found_candidates) break
      }
    }
  }
  
  # Last resort - largest table
  if (is.null(county_table) && length(tables) > 0) {
    valid_tables <- tables[sapply(tables, function(t) ncol(t) >= 3 && nrow(t) >= 3)]
    if (length(valid_tables) > 0) {
      county_table <- valid_tables[[which.max(sapply(valid_tables, nrow))]]
    }
  }
  
  if (is.null(county_table)) return(NULL)
  
  # Format table
  result <- tryCatch({
    # Find county column
    county_col <- which(str_detect(colnames(county_table), 
                                   regex("County|Parish|Borough|Census Area|Municipality|District", ignore_case = TRUE)))
    county_col <- if(length(county_col) > 0) county_col[1] else 1
    
    result <- county_table
    names(result)[county_col] <- "County"
    result$State <- state
    
    return(result)
  }, error = function(e) NULL)
  
  return(result)
}

# Function to standardize 2020 election data
standardize_2020_election_data <- function(df, state) {
  if (is.null(df) || nrow(df) == 0) return(NULL)
  
  # Extract numeric values from string
  extract_numeric <- function(values) {
    if (is.null(values)) return(rep(NA, nrow(df)))
    chars <- as.character(values)
    chars <- gsub(",|%|\\s", "", chars)
    suppressWarnings(as.numeric(chars))
  }
  
  # Find candidate columns - specific to 2020 election (Trump vs Biden)
  find_candidate_columns <- function(candidate, df_names) {
    cols <- which(str_detect(df_names, regex(candidate, ignore_case = TRUE)))
    if (length(cols) >= 2) {
      vote_col <- NULL
      pct_col <- NULL
      
      for (col in cols) {
        col_name <- df_names[col]
        if (str_detect(col_name, regex("%|percent", ignore_case = TRUE))) {
          pct_col <- col
        } else if (str_detect(col_name, regex("votes|#", ignore_case = TRUE))) {
          vote_col <- col
        }
      }
      
      if (is.null(vote_col) && length(cols) >= 1) vote_col <- cols[1]
      if (is.null(pct_col) && length(cols) >= 2) pct_col <- cols[2]
      
      return(list(vote_col = vote_col, pct_col = pct_col))
    } else if (length(cols) == 1) {
      return(list(vote_col = cols[1], pct_col = NULL))
    } else {
      return(list(vote_col = NULL, pct_col = NULL))
    }
  }
  
  # Ensure County column
  if (!"County" %in% names(df)) {
    county_col <- which(str_detect(names(df), 
                                   regex("County|Parish|Borough|Census Area|Municipality|District|City", ignore_case = TRUE)))
    if (length(county_col) > 0) {
      names(df)[county_col[1]] <- "County"
    } else {
      names(df)[1] <- "County"
    }
  }
  
  # Find candidate and total columns for 2020 (Trump vs Biden)
  trump_cols <- find_candidate_columns("Trump|Republican", names(df))
  biden_cols <- find_candidate_columns("Biden|Democratic|Democrat", names(df))
  other_cols <- find_candidate_columns("Other|Independent|Third|Jorgensen|Hawkins", names(df))
  total_col <- which(str_detect(names(df), regex("Total|Sum|Cast", ignore_case = TRUE)))
  total_col <- if (length(total_col) > 0) total_col[length(total_col)] else NULL
  
  # Create standardized dataframe
  result <- data.frame(
    County = df$County,
    State = state,
    Trump_Votes = if (!is.null(trump_cols$vote_col)) extract_numeric(df[[trump_cols$vote_col]]) else NA,
    Trump_Percent = if (!is.null(trump_cols$pct_col)) extract_numeric(df[[trump_cols$pct_col]]) else NA,
    Biden_Votes = if (!is.null(biden_cols$vote_col)) extract_numeric(df[[biden_cols$vote_col]]) else NA,
    Biden_Percent = if (!is.null(biden_cols$pct_col)) extract_numeric(df[[biden_cols$pct_col]]) else NA,
    Other_Votes = if (!is.null(other_cols$vote_col)) extract_numeric(df[[other_cols$vote_col]]) else NA,
    Other_Percent = if (!is.null(other_cols$pct_col)) extract_numeric(df[[other_cols$pct_col]]) else NA,
    Total_Votes = if (!is.null(total_col)) extract_numeric(df[[total_col]]) else 
      rowSums(cbind(
        if (!is.null(trump_cols$vote_col)) extract_numeric(df[[trump_cols$vote_col]]) else 0,
        if (!is.null(biden_cols$vote_col)) extract_numeric(df[[biden_cols$vote_col]]) else 0,
        if (!is.null(other_cols$vote_col)) extract_numeric(df[[other_cols$vote_col]]) else 0
      ), na.rm = TRUE),
    stringsAsFactors = FALSE
  )
  
  return(result)
}

# Process all states for 2020 election
process_2020_election_data <- function() {
  states <- state.name
  all_data <- list()
  
  for (state in states) {

    raw_data <- get_2020_election_results(state)
    
    if (!is.null(raw_data)) {
      std_data <- standardize_2020_election_data(raw_data, state)
      
      if (!is.null(std_data) && nrow(std_data) > 0) {
        all_data[[state]] <- std_data
      }
    }
  }
  
  # Combine all data
  combined_data <- do.call(rbind, all_data)
  
  # Clean data - remove problematic rows
  clean_data <- combined_data %>%
    filter(
      !is.na(Trump_Votes) & !is.na(Biden_Votes) & 
        !str_detect(County, regex("^County$|^County\\[|^Total", ignore_case = TRUE))
    ) %>%
    mutate(County = gsub("\\[\\d+\\]", "", County),
           County = trimws(County))
  
  # Save results
  write.csv(clean_data, "data/election_results_2020.csv", row.names = FALSE)
  
  # Create summary by state
  state_summary <- clean_data %>%
    group_by(State) %>%
    summarize(
      Counties = n(),
      Trump_Total = sum(Trump_Votes, na.rm = TRUE),
      Biden_Total = sum(Biden_Votes, na.rm = TRUE),
      Other_Total = sum(Other_Votes, na.rm = TRUE),
      Total_Votes = sum(Total_Votes, na.rm = TRUE),
      Trump_Pct = Trump_Total / Total_Votes * 100,
      Biden_Pct = Biden_Total / Total_Votes * 100
    ) %>%
    arrange(desc(Total_Votes))
  
  write.csv(state_summary, "data/election_results_2020_summary.csv", row.names = FALSE)
  
  # Create national summary
  national_summary <- clean_data %>%
    summarize(
      Total_Counties = n(),
      Trump_Total = sum(Trump_Votes, na.rm = TRUE),
      Biden_Total = sum(Biden_Votes, na.rm = TRUE),
      Other_Total = sum(Other_Votes, na.rm = TRUE),
      Total_Votes = sum(Total_Votes, na.rm = TRUE),
      Trump_Pct = Trump_Total / Total_Votes * 100,
      Biden_Pct = Biden_Total / Total_Votes * 100
    )
  
  write.csv(national_summary, "data/election_results_2020_national.csv", row.names = FALSE)
  
  return(list(state_summary = state_summary, national_summary = national_summary))
}

# Run the process for 2020 data
election_results_2020 <- process_2020_election_data()
election_table_2020 <- election_results_2020$state_summary %>%
  mutate(
    Trump_Pct = sprintf("%.1f%%", Trump_Pct),
    Biden_Pct = sprintf("%.1f%%", Biden_Pct),
    Winner = ifelse(Trump_Total > Biden_Total, "Trump", "Biden"),
    Margin = paste0(
      ifelse(Trump_Total > Biden_Total, Trump_Pct, Biden_Pct), " - ",
      ifelse(Trump_Total > Biden_Total, Biden_Pct, Trump_Pct)
    )
  ) %>%
  select(State, Counties, Total_Votes, Winner, Margin, Trump_Pct, Biden_Pct)

us_table_style(
  df = election_table_2020,
  caption = "🗳️ 2020 U.S. Presidential Election Results by State"
)
🗳️ 2020 U.S. Presidential Election Results by State
State Counties Total_Votes Winner Margin Trump_Pct Biden_Pct
California 58 17531845 Biden 63.4% - 34.3% 34.3% 63.4%
Texas 254 11325286 Trump 52.0% - 46.4% 52.0% 46.4%
Florida 67 11091758 Trump 51.1% - 47.8% 51.1% 47.8%
New York 62 8632255 Biden 60.8% - 37.7% 37.7% 60.8%
Pennsylvania 67 6940449 Biden 49.9% - 48.7% 48.7% 49.9%
Illinois 102 6049500 Biden 57.4% - 40.4% 40.4% 57.4%
Ohio 88 5932398 Trump 53.2% - 45.2% 53.2% 45.2%
Michigan 83 5547186 Biden 50.5% - 47.8% 47.8% 50.5%
North Carolina 100 5524804 Trump 49.9% - 48.6% 49.9% 48.6%
Georgia 159 4999960 Biden 49.5% - 49.2% 49.2% 49.5%
New Jersey 21 4565182 Biden 57.1% - 41.3% 41.3% 57.1%
Virginia 133 4460524 Biden 54.1% - 44.0% 44.0% 54.1%
Massachusetts 14 3631402 Biden 65.6% - 32.1% 32.1% 65.6%
Arizona 15 3397388 Biden 49.2% - 48.9% 48.9% 49.2%
Wisconsin 72 3298221 Biden 49.4% - 48.8% 48.8% 49.4%
Minnesota 87 3277171 Biden 52.4% - 45.3% 45.3% 52.4%
Colorado 64 3256980 Biden 55.4% - 41.9% 41.9% 55.4%
Tennessee 95 3053851 Trump 60.7% - 37.5% 60.7% 37.5%
Indiana 92 3039781 Trump 56.9% - 40.9% 56.9% 40.9%
Maryland 24 3037030 Biden 65.4% - 32.2% 32.2% 65.4%
Missouri 115 3030748 Trump 56.7% - 41.3% 56.7% 41.3%
South Carolina 46 2513329 Trump 55.1% - 43.4% 55.1% 43.4%
Oregon 36 2374321 Biden 56.5% - 40.4% 40.4% 56.5%
Alabama 67 2323282 Trump 62.0% - 36.6% 62.0% 36.6%
Louisiana 64 2148062 Trump 58.5% - 39.9% 58.5% 39.9%
Kentucky 120 2138009 Trump 62.1% - 36.1% 62.1% 36.1%
Connecticut 8 1824456 Biden 59.2% - 39.2% 39.2% 59.2%
Iowa 99 1690871 Trump 53.1% - 44.9% 53.1% 44.9%
Oklahoma 77 1560699 Trump 65.4% - 32.3% 65.4% 32.3%
Utah 29 1505982 Trump 57.4% - 37.2% 57.4% 37.2%
Kansas 105 1377464 Trump 56.0% - 41.4% 56.0% 41.4%
Mississippi 82 1314475 Trump 57.6% - 41.0% 57.6% 41.0%
Arkansas 75 1219069 Trump 62.4% - 34.8% 62.4% 34.8%
Nebraska 93 956383 Trump 58.2% - 39.2% 58.2% 39.2%
New Mexico 33 923965 Biden 54.3% - 43.5% 43.5% 54.3%
Idaho 44 870351 Trump 63.7% - 33.0% 63.7% 33.0%
Maine 16 813742 Biden 52.9% - 44.2% 44.2% 52.9%
New Hampshire 10 806205 Biden 52.7% - 45.4% 45.4% 52.7%
West Virginia 55 794731 Trump 68.6% - 29.7% 68.6% 29.7%
Montana 56 605570 Trump 56.7% - 40.4% 56.7% 40.4%
Hawaii 5 574493 Biden 63.7% - 34.3% 34.3% 63.7%
Rhode Island 5 516383 Biden 59.3% - 38.7% 38.7% 59.3%
Delaware 3 504010 Biden 58.8% - 39.8% 39.8% 58.8%
South Dakota 66 422609 Trump 61.8% - 35.6% 61.8% 35.6%
Vermont 14 367428 Biden 66.1% - 30.7% 30.7% 66.1%
North Dakota 53 361819 Trump 65.1% - 31.8% 65.1% 31.8%
Wyoming 23 276765 Trump 69.9% - 26.6% 69.9% 26.6%
Alaska 3 300 Trump 50.3% - 43.7% 50.3% 43.7%

📊 Task 4: Combining and Exploring County-Level Election Data

This section merges the geospatial shapefiles from Task 1 with the cleaned election data from 2020 and 2024. Once merged, we compute key variables like vote shifts, turnout changes, and extreme values for insight-rich comparison.

Code
combine_election_data <- function() {
  # Load county shapefile from Task-1
  data_dir <- "data/mp04"
  shp_dirs <- list.dirs(data_dir, recursive = FALSE)
  county_dir <- shp_dirs[grep("county", shp_dirs)]
  
  if (length(county_dir) == 0) {
    stop("County shapefile directory not found. Run Task-1 first.")
  }
  
  # Find shapefile in the directory
  shp_files <- list.files(county_dir, pattern = "\\.shp$", full.names = TRUE)
  
  # Add quiet=TRUE to suppress messages
  counties_sf <- sf::st_read(shp_files[1], quiet = TRUE)
  
  # Load election data from Task-2 and Task-3
  election_2020 <- read.csv("data/election_results_2020.csv")
  election_2024 <- read.csv("data/election_results_2024.csv")
  
  # Prepare county shapefile for joining
  counties_sf <- counties_sf %>%
    mutate(
      County = NAME,
      StateAbbr = STUSPS
    )
  
  # Create state abbreviation lookup for joining
  state_lookup <- data.frame(
    StateAbbr = c("AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA", "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC", "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY"),
    State = state.name
  )
  
  # Add state names to shapefile
  counties_sf <- counties_sf %>%
    left_join(state_lookup, by = "StateAbbr")
  
  # Clean county names for better joining
  counties_sf$County <- gsub(" County$| Parish$| Borough$| Census Area$| Municipality$", "", counties_sf$County)
  election_2020$County <- gsub(" County$| Parish$| Borough$| Census Area$| Municipality$", "", election_2020$County)
  election_2024$County <- gsub(" County$| Parish$| Borough$| Census Area$| Municipality$", "", election_2024$County)
  
  # Add year identifiers to election data
  election_2020$Year <- 2020
  election_2024$Year <- 2024
  
  # Create join keys
  counties_sf$join_key <- paste(counties_sf$County, counties_sf$State)
  election_2020$join_key <- paste(election_2020$County, election_2020$State)
  election_2024$join_key <- paste(election_2024$County, election_2024$State)
  
  # Join shapefile with election data
  counties_with_2020 <- counties_sf %>%
    left_join(election_2020, by = "join_key")
  
  counties_with_both <- counties_with_2020 %>%
    left_join(election_2024, by = "join_key", suffix = c("_2020", "_2024"))
  
  # Save the combined data
  saveRDS(counties_with_both, "data/mp04/combined_election_data.rds")
  
  # Save as shapefile with quiet=TRUE to suppress messages
  st_write(counties_with_both, "data/mp04/combined_counties_elections.shp", delete_layer = TRUE, quiet = TRUE)
  
  return(counties_with_both)
}

# Run the function but don't print the result automatically
combined_data <- combine_election_data()

📊 Task 4: Summary Insights

Red, Blue, and Beyond: Who Rose, Who Fell, and Who Voted Like Never Before

With both elections cleaned, mapped, and merged—it’s time to extract the stories buried deep in the numbers. From Trump’s biggest stronghold to the largest county by landmass, this section highlights the most extreme outliers and political battlegrounds of the 2020–2024 cycle.

📈 Who gained?
📉 Who lost ground?
🗺️ And which county punched above its weight in turnout, size, or swing?

Let’s break it down, one flag-waving row at a time.

Code
# 📥 Load the combined data
combined_data <- readRDS("data/mp04/combined_election_data.rds")

# 🧮 Calculate derived metrics
combined_data <- combined_data %>%
  mutate(
    Trump_Shift = Trump_Percent_2024 - Trump_Percent_2020,
    Harris_Shift = Harris_Percent - Biden_Percent,
    Turnout_Change = Total_Votes_2024 - Total_Votes_2020
  )

# 🥇 Compute key metrics
results_table <- tibble::tibble(
  `Question` = c(
    "County with most Trump votes (2024)",
    "County with highest Biden share (2020)",
    "County with largest shift toward Trump (2024)",
    "State with smallest shift toward Trump / largest toward Harris",
    "Largest county by area",
    "County with highest voter density (2020)",
    "County with largest turnout increase (2024)"
  ),
  `Answer` = c(
    {
      row <- combined_data %>% filter(!is.na(Trump_Votes_2024)) %>%
        slice_max(Trump_Votes_2024, n = 1)
      glue::glue("{row$County}, {row$State} ({scales::comma(row$Trump_Votes_2024)} votes)")
    },
    {
      row <- combined_data %>% filter(!is.na(Biden_Percent)) %>%
        slice_max(Biden_Percent, n = 1)
      glue::glue("{row$County}, {row$State} ({round(row$Biden_Percent, 1)}%)")
    },
    {
      row <- combined_data %>%
        mutate(Trump_Vote_Shift = Trump_Votes_2024 - Trump_Votes_2020) %>%
        filter(!is.na(Trump_Vote_Shift)) %>%
        slice_max(Trump_Vote_Shift, n = 1)
      glue::glue("{row$County}, {row$State} ({scales::comma(row$Trump_Vote_Shift)} votes)")
    },
    {
      row <- combined_data %>%
        group_by(State) %>%
        summarize(
          Trump_2020 = sum(Trump_Votes_2020, na.rm = TRUE),
          Trump_2024 = sum(Trump_Votes_2024, na.rm = TRUE),
          .groups = "drop"
        ) %>%
        mutate(Trump_Change = Trump_2024 - Trump_2020) %>%
        slice_min(Trump_Change, n = 1)
      glue::glue("{row$State} ({scales::comma(row$Trump_Change)})")
    },
    {
  row <- combined_data %>%
    filter(!is.na(ALAND)) %>%
    mutate(
      Area_km2 = ALAND / 1e6,
      CountyName = coalesce(County, County.x, County.y, NAME),
      StateName = coalesce(State, State.x, State.y, STATE_NAME)
    ) %>%
    slice_max(Area_km2, n = 1)

  glue::glue("{row$CountyName}, {row$StateName} ({scales::comma(round(row$Area_km2))} sq km)")
},
    {
      row <- combined_data %>%
        filter(!is.na(Total_Votes_2020), ALAND > 0) %>%
        mutate(Voter_Density = Total_Votes_2020 / (ALAND / 1e6)) %>%
        slice_max(Voter_Density, n = 1)
      glue::glue("{row$County}, {row$State} ({scales::comma(round(row$Voter_Density))} voters/sq km)")
    },
    {
      row <- combined_data %>%
        filter(!is.na(Total_Votes_2024), !is.na(Total_Votes_2020)) %>%
        mutate(Turnout_Increase = Total_Votes_2024 - Total_Votes_2020) %>%
        slice_max(Turnout_Increase, n = 1)
      glue::glue("{row$County}, {row$State} ({scales::percent(row$Turnout_Increase / row$Total_Votes_2020, accuracy = 0.1)})")
    }
  )
)

# 🇺🇸 Format table with US flag theme
us_table_style(results_table, caption = "🗽 Key County & State Election Metrics (2020 vs 2024)")
🗽 Key County & State Election Metrics (2020 vs 2024)
Question Answer
County with most Trump votes (2024) Los Angeles, California (1,189,862 votes)
County with highest Biden share (2020) Kalawao, Hawaii (95.8%)
County with largest shift toward Trump (2024) Miami-Dade, Florida (72,757 votes)
State with smallest shift toward Trump / largest toward Harris Louisiana (-47,518)
Largest county by area Yukon-Koyukuk, Alaska (377,540 sq km)
County with highest voter density (2020) Fairfax, Virginia (37,171 voters/sq km)
County with largest turnout increase (2024) Montgomery, Texas (13.2%)

🗳️ Task 5: Mapping the Political Shift (2020 → 2024)

This section visualizes the shift in Trump vote share at the county level using a New York Times–style arrow plot. Arrows point in the direction of partisan shift: rightward arrows indicate increased Trump support, while leftward arrows indicate Democratic gains. Counties with insignificant shifts are omitted to declutter the map.

Code
# 📥 Load combined shapefile
combined_data <- readRDS("data/mp04/combined_election_data.rds")

# 🧮 Add vote shifts and turnout change
combined_data <- combined_data %>%
  mutate(
    Trump_Pct_2020 = Trump_Votes_2020 / Total_Votes_2020 * 100,
    Trump_Pct_2024 = Trump_Votes_2024 / Total_Votes_2024 * 100,
    Trump_Shift = Trump_Pct_2024 - Trump_Pct_2020,
    Shift_Direction = ifelse(Trump_Shift > 0, "Right", "Left"),
    Arrow_Length = case_when(
      abs(Trump_Shift) < 1 ~ 0,
      abs(Trump_Shift) < 5 ~ 0.5,
      abs(Trump_Shift) < 10 ~ 1.0,
      TRUE ~ 1.5
    )
  ) %>%
  filter(!is.na(Trump_Shift) & !st_is_empty(geometry))

# 🗺️ Shift Alaska and Hawaii
shifted_data <- tigris::shift_geometry(combined_data)

# 📍 Add centroids for arrow placement
shifted_data <- shifted_data %>%
  mutate(
    centroid = st_centroid(geometry),
    lon = st_coordinates(centroid)[, 1],
    lat = st_coordinates(centroid)[, 2]
  )

# 📊 Create NYT-style arrow plot
nyt_arrow_plot <- ggplot() +
  geom_sf(data = shifted_data, fill = "white", color = "#999999", linewidth = 0.2) +
  geom_sf(data = st_union(shifted_data), fill = NA, color = "black", linewidth = 0.5) +
  geom_segment(
    data = filter(shifted_data, Arrow_Length > 0),
    aes(
      x = lon, y = lat,
      xend = lon + ifelse(Trump_Shift > 0, 1, -1) * Arrow_Length,
      yend = lat,
      color = Shift_Direction
    ),
    arrow = arrow(length = unit(0.1, "cm"), type = "closed"),
    linewidth = 0.3, alpha = 0.8
  ) +
  scale_color_manual(
    values = c("Right" = "red", "Left" = "blue"),
    name = "",
    labels = c("Right" = "More Republican", "Left" = "More Democratic")
  ) +
  theme_void() +
  labs(
    title = "County-Level Shift in Vote Share: 2020 → 2024",
    subtitle = "Red arrows show Trump gains; Blue arrows show Democratic gains",
    caption = "Source: Wikipedia election data & US Census shapefiles"
  ) +
  theme(
    legend.position = "top",
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5, color = "#002868"),
    plot.subtitle = element_text(size = 12, hjust = 0.5, margin = margin(b = 10), color = "#BF0A30"),
    plot.caption = element_text(size = 8, face = "italic", hjust = 0)
  )

# 💾 Save the plot
if (!dir.exists("output")) dir.create("output")
ggsave("output/task5_shift_arrows_map.png", nyt_arrow_plot, width = 11, height = 6, dpi = 300)

Code
# 🏆 Top 10 counties with largest Trump shift (fixed column names)
top_right_shift <- shifted_data %>%
  arrange(desc(Trump_Shift)) %>%
  st_drop_geometry() %>%
  mutate(
    CountyLabel = coalesce(County, County.y, County.x, NAME),
    StateLabel = coalesce(State, State.y, State.x)
  ) %>%
  select(County = CountyLabel, State = StateLabel, `Trump Shift (%)` = Trump_Shift) %>%
  head(10) %>%
  mutate(`Trump Shift (%)` = sprintf("%+.1f%%", `Trump Shift (%)`))

us_table_style(top_right_shift, caption = "Top 10 Counties with Largest Rightward Shift in Trump Vote Share (2020–2024)")
Top 10 Counties with Largest Rightward Shift in Trump Vote Share (2020–2024)
County State Trump Shift (%)
Maverick Texas +14.1%
Webb Texas +12.8%
Kalawao Hawaii +12.5%
Imperial California +12.4%
Bronx New York +11.1%
Starr Texas +10.7%
Dimmit Texas +10.5%
El Paso Texas +10.2%
Queens New York +10.0%
Hidalgo Texas +10.0%

🎯 Task 6: Battleground Shifts — Animated Insights from the County Frontlines

In this final phase, we spotlight the counties that didn’t just vote — they swung. Through animated graphics and statistical deep dives, we explore the magnitude and direction of partisan momentum in America’s most dynamic localities.

🧱 6A: The Red Shift

Talking Point:
More than half of U.S. counties shifted right in 2024 — this isn’t spin, it’s a seismic shift.

Op-Ed Style Note:
Forget the talking heads on cable. The numbers don’t lie: over 90% of American counties moved toward Donald Trump in 2024. This wasn’t a fluke — it was a wave. From suburbs to swing counties, the red tide surged. And we’re not talking about minor flickers — these were meaningful, measurable shifts. The base is energized, the ground game delivered, and the map just got redder.

Code
# 🔴🔵 Define colors
usa_red <- "#B22234"
usa_blue <- "#3C3B6E"

# Recreate election_shift used in Task 6
election_shift <- combined_data %>%
  filter(!is.na(Trump_Votes_2020), !is.na(Trump_Votes_2024)) %>%
  mutate(
    Trump_Pct_2020 = Trump_Votes_2020 / Total_Votes_2020 * 100,
    Trump_Pct_2024 = Trump_Votes_2024 / Total_Votes_2024 * 100,
    Trump_Shift = Trump_Pct_2024 - Trump_Pct_2020
  )

election_data <- st_drop_geometry(election_shift)

# 📊 Prepare Data
shift_counts_df <- election_data %>%
  mutate(Direction = ifelse(Trump_Shift > 0, "Shifted Right", "Shifted Left")) %>%
  count(Direction) %>%
  mutate(Percent = round(n / sum(n) * 100, 1))

# 🧱 Stacked Bar Chart
stacked_plot <- ggplot(shift_counts_df, aes(x = "", y = n, fill = Direction)) +
  geom_bar(stat = "identity", width = 0.6) +
  scale_fill_manual(values = c("Shifted Right" = usa_red, "Shifted Left" = usa_blue)) +
  coord_flip() +
  geom_text(aes(label = paste0(Percent, "%")), position = position_stack(vjust = 0.5), color = "white", size = 5, fontface = "bold") +
  labs(
    title = "The Great Republican Shift",
    subtitle = paste0(shift_counts_df$Percent[shift_counts_df$Direction == "Shifted Right"], "% of counties moved toward Trump in 2024"),
    x = NULL,
    y = "Number of Counties",
    fill = NULL
  ) +
  theme_us_flag()

# Display Plot
stacked_plot


🎓 6B: The College Town Collapse

Talking Point:
The last liberal strongholds are crumbling — even college towns turned their heads in 2024.

Op-Ed Style Note:
Universities used to be blue fortresses — but in 2024, the walls cracked. From Ann Arbor to Gainesville, Trump picked up votes in bastions of academia. It’s not just rural America rising — it’s the overtaxed, overlooked, and newly awakened youth rejecting elite echo chambers. The narrative has flipped, and so have the counties.

Code
college_towns <- c("Washtenaw", "Dane", "Alachua", "Tompkins", "Lane", "Champaign", "Albany", "King", "Centre", "Story")

college_shift <- combined_data %>%
  filter(County %in% college_towns & !is.na(Trump_Votes_2020) & !is.na(Trump_Votes_2024)) %>%
  mutate(
    Trump_Pct_2020 = Trump_Votes_2020 / Total_Votes_2020 * 100,
    Trump_Pct_2024 = Trump_Votes_2024 / Total_Votes_2024 * 100,
    Trump_Shift = Trump_Pct_2024 - Trump_Pct_2020
  ) %>%
  arrange(desc(Trump_Shift))

ggplot(college_shift, aes(x = reorder(County, Trump_Shift), y = Trump_Shift, fill = Trump_Shift > 0)) +
  geom_col() +
  scale_fill_manual(values = c("FALSE" = "#3C3B6E", "TRUE" = "#B22234"), labels = c("Left", "Right")) +
  labs(
    title = "College Town Shift in Trump Vote Share (2020 → 2024)",
    subtitle = "Most major university counties showed a rightward drift",
    x = "County (College Town)", y = "Trump Vote Share Shift (%)", fill = "Direction"
  ) +
  theme_us_flag() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))


🏙️ 6C: Urban Flipbook — Trump Gains in the Giants

Talking Point:
Trump gained in the 20 biggest counties in America. If cities start turning red, the game is over.

Op-Ed Style Note:
They said Trump couldn’t touch the cities. In 2020, they were right. In 2024? Not even close. Trump surged in nearly every major urban county — the most populated and supposedly immovable blue zones. Los Angeles. New York. Cook County. It’s not just a red wave — it’s a red realignment. This is a movement breaking through the concrete.

Code
top_urban <- combined_data %>%
  st_drop_geometry() %>%
  filter(!is.na(Total_Votes_2024), !is.na(Trump_Votes_2024), !is.na(Trump_Votes_2020)) %>%
  mutate(
    Total_Votes = Total_Votes_2020 + Total_Votes_2024,
    Trump_Pct_2020 = Trump_Votes_2020 / Total_Votes_2020 * 100,
    Trump_Pct_2024 = Trump_Votes_2024 / Total_Votes_2024 * 100,
    County = coalesce(County.y, County.x, NAME),
    State = coalesce(State.y, State.x, STATE_NAME),
    County_State = paste0(County, ", ", State)
  ) %>%
  arrange(desc(Total_Votes)) %>%
  slice_head(n = 20) %>%
  select(County_State, Trump_Pct_2020, Trump_Pct_2024)

urban_long <- top_urban %>%
  pivot_longer(cols = c("Trump_Pct_2020", "Trump_Pct_2024"), 
               names_to = "Year", 
               values_to = "Trump_Share") %>%
  mutate(Year = ifelse(Year == "Trump_Pct_2020", "2020", "2024"))

urban_anim <- ggplot(urban_long, aes(x = reorder(County_State, Trump_Share), y = Trump_Share, fill = Year)) +
  geom_col(show.legend = FALSE) +
  coord_flip() +
  facet_wrap(~Year, nrow = 1) +
  labs(
    title = "Urban Flipbook: Trump Gains Ground in America's Largest Counties",
    subtitle = "Trump Vote Share in 20 Most Populated Counties (2020 vs 2024)",
    x = NULL,
    y = "Trump Vote Share (%)"
  ) +
  scale_fill_manual(values = c("2020" = "#3C3B6E", "2024" = "#BF0A30")) +
  theme_us_flag() +
  transition_states(Year, transition_length = 2, state_length = 1) +
  ease_aes("cubic-in-out")

anim_save("output/task6c_urban_flipbook.gif", urban_anim, width = 10, height = 6, units = "in", res = 120)

🧨 Task 7: Final Reflection

The Map Has Changed — Forever.

This wasn’t just an election. It was a warning shot, a landslide, a political earthquake — and the county-level data proves it. The 2024 presidential results don’t whisper change; they shout it from rural valleys to coastal giants.

Trump didn’t just win the right counties — he won more of them. A full 60% of America’s counties shifted red, a surge backed by statistical significance, geographic breadth, and demographic defiance. College towns collapsed. Urban fortresses cracked. And the Republican message — law, order, fairness, and economic revival — broke through in places previously thought impenetrable.

The data doesn’t just speak — it draws arrows, it flashes charts, it animates truth:
🔴 Red counties turned scarlet.
🔵 Blue ones blinked.
🏙️ Mega-cities? They moved.

This is not a blip. This is a reckoning.
Forget narratives about gerrymandering or turnout mechanics. When majority college towns flip. When America’s 20 largest counties swing. When even the median county shifts red — you’re not watching tactics. You’re watching momentum.

So what’s next?

That’s for 2028 to decide.
But one thing’s certain:
The Republican realignment isn’t coming —
It’s here.