American Politics in Perspective

The Limitations of Static Constitutions and Static Content

Author: Nathan Morse

Affiliation: Pennsylvania State University

Last modified: May 25, 2024

Summary

Abstract

Why has the United States become one of the most polarized and unequal countries in the democratic world? I argue that the American constitutional model is no longer compatible with American society. Large, diverse republics generally need more flexible institutions that are geared toward consensus building rather than majority rule to remain free and stable. Over the last half century, the American electorate has become more diverse than its institutions can handle. Americans are now gasping for a multiparty system and other updates that are not viable under the current framework. The equal representation of states in the Senate, majoritarian elections, and outdated amendment process have enabled economic inequality and polarization to rise by blocking routine maintenance to the nation’s democratic institutions and economic strategies. American democracy is now struggling not in spite of the Constitution, but because of it. A constitutional convention may be necessary to address these challenges in the long run.

Adding to the difficulty of constitutional reform, political misinformation has been getting more sophisticated—misleading charts often go viral, reaching millions of people—and paywalled PDFs are no match for modern media. Embracing dynamic data visualizations, videos, and interactive articles would help researchers advocate for policies that could strengthen American democracy. To illustrate this point, each chapter of this dissertation features an interactive data app showing how political institutions affect a variety of outcomes. I also shared animated charts adapted from these figures on social media and reflected on the experience. The dissertation as a whole is designed to inform the public and contribute to the academic literature at the same time.

About

The Pennsylvania State University, The Graduate School

American Politics in Perspective: The Limitations of Static Constitutions and Static Content

A Dissertation in Political Science by Nathan Morse

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

May 2024

Committee

The dissertation of Nathan Morse is being reviewed and approved by the following:

Christopher Zorn, Liberal Arts Professor of Political Science and Sociology (Dissertation Advisor, Chair of Committee)

Ray Block Jr., Associate Professor of Political Science and African American Studies

Suzanna Linn, Liberal Arts Professor of Political Science

Christopher Witko, Professor of Public Policy and Political Science

Michael Nelson, Director of Graduate Studies, Department of Political Science

Data preparation code

Packages and functions

#----- Packages -----#
library(tidyverse)
library(ForeCA)
library(knitr)
library(highcharter)
library(cdlTools)
library(haven)
library(rio)
library(countrycode)
library(fastDummies)
library(mice)
library(htmltools)
library(scales)
library(readr)
library(boot)

# Determine if HTML
html = knitr::is_html_output()

# Pooled standard deviation
sp = function(s1, s2) sqrt((s1^2 + s2^2)/2)

# Weighted standard deviation
weighted.sd = function(x, w) {
  x0 = weighted.mean(x, w, na.rm=TRUE)
  m = length(which(w>0))
  m0 = (m-1)/m
  sqrt(sum(w*((x-x0)^2), na.rm=TRUE)/(m0*sum(w, na.rm=TRUE)))
}

# Mode of data
getmode = function(x) names(table(x))[table(x)==max(table(x))][1]

# Asterisks based on p-values or rejection regions
psig = function(p) case_when(p < 0.01 ~ "**", p < 0.05 ~ "*", TRUE ~ "")
rsig = function(stat, cv) ifelse(abs(stat)>abs(cv), "*", "")

# Wrap text
wrapper = function(x, ...) gsub("\\n", "<br>", str_wrap(x, ...))


#----- Function to format regression table -----#
var_sum = function(mod, name) {
  
  # Variables
  variables = unlist(strsplit(name, " > "))
  pred = variables[1]
  resp = variables[2]
  
  # Format regression table
  summary(mod)$coef |> 
    as.data.frame() |>
    rownames_to_column("xvar") |>  # get x variable
    separate(xvar, into=c("xvar", "t"), "\\.") |>  # split columns
    mutate(t = str_sub(t, -1) |> as.numeric(),  # get lag time
           sig = (`Pr(>|t|)`<.05),  # determine if significant
           response = resp) |>
    filter(!is.na(t)) |>  # remove constant
    select(predictor=xvar, response, t, coef=Estimate, sig)
  
}


#----- Recode measure name -----#
key = read.csv("data/key.csv")
code_metric = function(x, type="metric", rev=FALSE, by="name") {
  key = key[key[[by]] %in% x,]
  lev = key[[by]]
  lab = key[[type]]
  if (rev) { lev = rev(lev); lab = rev(lab) }
  factor(x, lev, lab)
}
code_metric2 = function(type, x, rev=FALSE) code_metric(x, type, rev)


#----- Recode party -----#
south = c("GA", "NC", "SC", "VA", "KY", "LA", "MS", "AL", "AR", "FL", "TX", "OK")
code_party = function(x, state=NULL) {
  
  # Code standard two-party affiliations
  x = case_when(
    x=="dem" ~ "Democrats", 
    x=="rep" ~ "Republicans",
    TRUE ~ NA_character_
  )
  
  # Code South separately if desired
  if(!is.null(state)) {x = ifelse(state %in% south, "South", x)}
  
  # Return party affiliations
  factor(x)
}


#----- Generate congressional term label -----#
code_term = function(year=NULL, term=NULL) {
  
  # Calculate whichever value wasn't supplied
  if (is.null(term)) term = (year-1787)/2
  if (is.null(year)) year = 1787+(term*2)

  # Label in form of "nth Congress (YYYY-YYYY)"
  paste0(scales::label_ordinal()(term), # writes 1st, 2nd, 3rd, 4th, etc.
         " Congress (", year, "-", year+2, ")") # writes years
  
}


#----- Prepare data for plotting -----#
reshaper = function(data, ..., cols=2:ncol(data)) {
  
  # Types of variables to get information on
  info = c(...)
  
  # Reshape data
  data = data |>
    pivot_longer(all_of(cols), values_to="orig", values_drop_na=TRUE) |>
    group_by(name) |>
    mutate(norm = as.numeric(scale(orig)),
           term = code_term(year)) |>
    ungroup()
  
  # Recode metrics and arrange dataframe
  data = cbind(data, data.frame(sapply(info, code_metric2, data$name))) |>
    select(year, term, all_of(info), orig, norm)
  
}


#----- Theme for highcharts -----#
hc_morse = function(hc, wide=TRUE, colors="metric", scatter=FALSE) {
  hc |> hc_add_theme(hc_theme(
  
    # Colors and fonts
    chart = list(style = list(fontFamily = "Source Sans Pro"),
                 spacing = c(10,0,15,0)),
    title = list(align = "left", margin=36, style = list(
      color = "#1b5283", fontWeight="bold", fontSize="19px", useHTML=TRUE)),
    subtitle = list(align = "left", style = list(
      color = "#444444", fontSize="16px", useHTML=TRUE)),

    # Hover options
    tooltip = list(headerFormat = "<b>{point.key}</b><br>",
                   shared=TRUE, shadow=FALSE, borderRadius=4),
    xAxis = list(crosshair=TRUE),
    plotOptions = list(series = list(marker = list(
      enabled=scatter, symbol="circle", radius=3))),
    
    # Legend options
    legend = list(align="center", layout="horizontal")
  
  ))
}

# Dot for highchart tooltips
bullet = "<span style='color:{point.color}'>\u25CF</span>"

# Font link for svg images
googlefont = "https://fonts.googleapis.com/css2?family=Source+Sans+Pro:wght@400;700"

# Theme for ggplot
theme_morse = function() { theme_minimal() +
    theme(text = element_text(family="Source Sans Pro"),
          plot.title = element_text(hjust=0, color="#1b5283", 
                                    face="bold", size=13),
          strip.text = element_text(face="bold", size=10))
}
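
As a quick sanity check on the `weighted.sd` helper above, the sketch below confirms that with equal weights it reduces to R's ordinary sample standard deviation. The helper is repeated so the block runs on its own; the vector `x` and the weights are made up for illustration.

```r
# Weighted standard deviation (copied from the helper above)
weighted.sd = function(x, w) {
  x0 = weighted.mean(x, w, na.rm=TRUE)
  m = length(which(w>0))
  m0 = (m-1)/m
  sqrt(sum(w*((x-x0)^2), na.rm=TRUE)/(m0*sum(w, na.rm=TRUE)))
}

# Hypothetical data, for illustration only
x = c(2, 4, 4, 4, 5, 5, 7, 9)

# With equal weights, the result should match sd(x)
equal_w = weighted.sd(x, rep(1, length(x)))
all.equal(equal_w, sd(x))

# Giving the outlying value (9) double weight increases the estimate
heavy_w = weighted.sd(x, c(1, 1, 1, 1, 1, 1, 1, 2))
heavy_w > equal_w
```

The `(m-1)/m` term is what makes the equal-weight case agree with the usual n-1 denominator.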

Ideology and polarization in Congress (member, party, and chamber-level, 1917-2021)

#----- Ideal points of members of Congress (Voteview) -----#
id_members = read.csv("data/voteview/HSall_members.csv") |> 
  
  # Clean data
  filter(congress>64 & congress<118,  # filter to 1917-2021 (65th-117th Congresses)
         party_code %in% c(100, 200),  # remove third parties
         chamber != "President") |>   # remove presidents
  mutate(year = as.integer(1787+(congress*2)),  # recode year
         party = ifelse(party_code==100, "dem", "rep")) |>  # recode party
  
  # Recode and rearrange variables
  mutate(faction = code_party(party, state_abbrev),
         party = code_party(party)) |>
  select(year, congress, chamber, icpsr, state=state_abbrev, 
         party, faction, dim1=nominate_dim1, dim2=nominate_dim2) |>
  na.omit()


#----- Chamber-year summaries -----#
id_chambers = id_members |>
  group_by(year, chamber, party) |>  # collapse to caucus-year
  
  # Party means and SDs
  summarise(d1_x = mean(dim1),  # party mean on dimension 1
            d1_s = sd(dim1),  # party std dev on dimension 1
            d2_x = mean(dim2),  # party mean on dimension 2
            d2_s = sd(dim2),  # party std dev on dimension 2
            n = n()) |>  # party size
  
  # Reshape data
  mutate(party = ifelse(party=="Democrats", "dem", "rep")) |>
  pivot_wider(id_cols=year:chamber, names_from=party, values_from=d1_x:n) |>
  
  # Party distances and pooled SDs
  mutate(d1_d = d1_x_rep - d1_x_dem,  # difference between parties on dim 1
         d1_sp = sp(d1_s_dem, d1_s_rep),  # pooled std dev on dim 1
         d2_d = d2_x_rep - d2_x_dem,  # difference between parties on dim 2
         d2_sp = sp(d2_s_dem, d2_s_rep)) |>  # pooled std dev on dim 2
  select(year:chamber, d1_d:d2_sp)


#----- Party-chamber-year summaries -----#
id_parties = id_members |>
  group_by(year, chamber, party) |> # collapse to party-year
  summarise(d1_x = mean(dim1),
            d1_s = sd(dim1),
            d2_x = mean(dim2),
            d2_s = sd(dim2)) |>
  group_by(year, chamber) # collapse to chamber-year


#----- Faction-chamber-year summaries -----#
id_factions = id_members |>
  group_by(year, chamber, faction) |> # collapse to faction-year
  summarise(d1_x = mean(dim1),
            d1_s = sd(dim1),
            d2_x = mean(dim2),
            d2_s = sd(dim2))


#----- Overall party distance and homogeneity measures -----#
id_cong = id_chambers |> select(year:d1_sp) |>
  
  # Reshape data
  mutate(chamber = ifelse(chamber=="House", "hs", "sen")) |>
  pivot_wider(id_cols=year, names_from=chamber, values_from=d1_d:d1_sp) |>
  
  # Invert pooled SDs
  mutate(across(contains("sp"), ~-.)) |>
  select(year, hs_dist=d1_d_hs, hs_hom=d1_sp_hs,
         sen_dist=d1_d_sen, sen_hom=d1_sp_sen) |>
  ungroup()


#----- Generate polarization index -----#
ts_sen = ts(select(id_cong, sen_dist, sen_hom))
ts_hs = ts(select(id_cong, hs_dist, hs_hom))

# Run forecastable component analysis
fc_sen = foreca(ts_sen, n.comp=1)
fc_hs = foreca(ts_hs, n.comp=1)

# Add data to dataset
polar = id_cong |>
  mutate(sen_pol = as.numeric(fc_sen$scores[,1]),
         hs_pol = as.numeric(fc_hs$scores[,1]))
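
The conversion between Congress numbers and calendar years used above (`1787 + 2*congress`) can be checked against a few known cases. This is a minimal sketch in base R; the function names `congress_to_year` and `year_to_congress` are hypothetical and not part of the original code.

```r
# First year of a given Congress (the 1st Congress convened in 1789)
congress_to_year = function(congress) 1787 + 2*congress

# Inverse: the Congress that begins in a given odd-numbered year
year_to_congress = function(year) (year - 1787)/2

congress_to_year(c(1, 65, 117))  # 1789 1917 2021
year_to_congress(1917)           # 65
```

With the filter `congress>64 & congress<118`, the retained Congresses therefore begin in the years 1917 through 2021.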

Income inequality (national and state-level, 1917-2018)

#----- Income inequality in the US -----#
inc = read.csv("data/frank/Frank_Gini_2018.csv") |>
  merge(read.csv("data/frank/Frank_WID_2018.csv")) |>
  filter(State=="United States") |>
  select(year=Year, gini=Gini, top1=Top1_adj)

# State-level income inequality
st_ineq = read.csv("data/frank/Frank_WID_2018.csv") |>
  mutate(state = fips(gsub(" ", "", State), to="Abbreviation")) |>
  filter(!is.na(state)) |>
  select(year=Year, state, inc_ineq=Top1_adj)


#----- Income inequality abroad -----#
wid_key = read.csv("data/wid/WID_Metadata_27072021-014959.csv", sep=";", skip=1) |>
  select(code=Country.Code) |>  # select cols
  distinct() |>  # remove duplicates
  filter(!grepl("-", code)) |>  # remove subnational units
  mutate(id = countrycode(code, "iso2c", "vdem"))  # convert country code

# Load income inequality data
wid = read.csv("data/wid/WID_Data_27072021-014959.csv", sep=";", skip=1) |>
  
  # Transform data from wide to long
  rename_with(~gsub(".*_", "", gsub(".Pre.tax.*", "", .x))) |>  # rename cols
  pivot_longer(!Percentile:Year, values_drop_na=TRUE) |>  # reshape data
  filter(!grepl("\\.", name)) |>  # remove subnational units
  
  # Clean up dataset
  select(year=Year, code=name, top1=value) |>  # select cols
  mutate(top1 = top1*100) |>  # turn decimals into percent
  
  # Bring in country ids for merging with dem data
  merge(wid_key) |>  # bring in country codes
  select(id, year, top1) # remove country abbreviation letters

Population and apportionment (state-level, 1910-2020)

#----- Load House apportionment data (Census) -----#
st_reps = read.csv("data/census/app.csv", na.strings="") |>
  
  # Filter to official states
  filter(Geography.Type=="State", !is.na(Number.of.Representatives)) |>
  
  # Select variables and recode state variable
  select(state=Name, year=Year, reps=Number.of.Representatives) |>
  mutate(state = fips(gsub(" ", "", state), to="Abbreviation"))


#----- Load annual population data (Census) -----#
st_pop = read.csv("data/census/pop.csv") |>
  
  # Reshape data to long format and recode year
  pivot_longer(AKPOP:WYPOP, names_to="state", values_to="pop",
               names_pattern="(.*)POP") |>
  mutate(year = as.numeric(format(as.Date(DATE), "%Y"))) |>
  
  # Bring in apportionment data and fill in missing values
  merge(st_reps, by=c("year", "state"), all.x=TRUE) |> 
  arrange(state, year) |>
  group_by(state) |>
  fill(reps) |>
  filter(reps>0) |>
  
  # State summary stats
  group_by(year) |> # collapse by year
  mutate(
    
    # House apportionment
    House_ps = reps/sum(reps), # prop of house seats from state
    House_pd = (pop/reps)/sum(pop), # avg district size, as prop of us pop
    House_px = pop/sum(pop), # prop of us pop in state
    House_w = House_ps/(reps*House_px), # ratio of bias for state in house
    
    # Senate apportionment
    Senate_ps = 2/sum(2*n()), # prop of senators from state
    Senate_pd = (pop/2)/sum(pop), # avg constituency size (not meaningful)
    Senate_px = pop/sum(pop), # prop of us pop in state
    Senate_w = Senate_ps/(2*Senate_px) # ratio of bias for state
    
  ) |>
  
  # Reshape data
  pivot_longer(House_ps:Senate_w, names_to=c("chamber", ".value"),
               names_sep="_") |>
  select(year, state, chamber, pop, ps, pd, px, w)


#----- State population data, 2020 -----#
st_pop2020 = read.csv("data/census/pop2020.csv") |>
  mutate(pop = as.numeric(gsub(",", "", pop)))


#----- State partisan composition data -----#
st_leg = read.csv("data/ncsl/Legis_Control_2020.csv") |>
  mutate(TotalHouse = as.numeric(TotalHouse),
         dems = 100*HouseDem/TotalHouse,
         reps = 100*HouseRep/TotalHouse) |>
  rowwise() |>
  mutate(share = max(dems, reps)) |>
  select(state=STATE, party=LegisControl, share) |>
  merge(st_pop2020)

Historical election returns (local-level, 1920-1990)

#----- Load key with state codes and prepare to read data -----#
icpsr = read.csv("data/icpsr/icpsrcnt.csv") |>
  select(ICPSR_State_Code=STATEICP, state=State) |>
  distinct()

# Define the widths and names of each field based on the codebook
widths = fwf_widths(
  c(2, 1, 3, 2, 1, 3, 1, 6, 8, 2, 1, 4, 46), 
  c("Candidate_Number", "Source", "Year_of_Election", 
                       "ICPSR_State_Code", "Office_Code", "Congressional_District_Number", 
                       "Asterisk", "Blank_Field", "Total_Votes", "Month_of_Election", 
                       "Type_of_Election", "ICPSR_Party_Code", "Candidate_Name")
)


#----- Load and fix up data -----#
returns = read_fwf("data/icpsr/DS0001/00002-0001-Data.txt", col_positions=widths) |>
  
  # Replace missing data codes with NA
  mutate(Year_of_Election = ifelse(Year_of_Election==0, NA, Year_of_Election),
         ICPSR_State_Code = ifelse(ICPSR_State_Code==0, NA, ICPSR_State_Code),
         Total_Votes = ifelse(Total_Votes == -9, NA, Total_Votes),
         Month_of_Election = ifelse(Month_of_Election == 99, NA, Month_of_Election),
         ICPSR_Party_Code = ifelse(ICPSR_Party_Code == 9999, NA, ICPSR_Party_Code)) |>
  
  # Fix variables
  mutate(year = Year_of_Election + 1000,
         office = case_when(Office_Code==1 ~ "President",
                            Office_Code==3 ~ "House",
                            Office_Code > 3 & Office_Code < 7 ~ "Senate",
                            TRUE ~ NA_character_),
         party = case_when(ICPSR_Party_Code=="0100" ~ "Democrat",
                           ICPSR_Party_Code=="0200" ~ "Republican",
                           TRUE ~ NA_character_),
         votes = as.numeric(Total_Votes)) |>
  
  # Filter out unneeded data
  filter(Type_of_Election != "S", year >= 1920) |>
  
  # Get state names
  mutate(ICPSR_State_Code = as.numeric(ICPSR_State_Code)) |>
  merge(icpsr, all.x=TRUE) |>
  
  # Clean up dataset
  select(year, state, office, party, votes) |>
  filter(!is.na(state), !is.na(office)) |>
  
  # Party vote totals
  group_by(year, office, party) |>
  summarise(votes = sum(votes, na.rm=TRUE)) |>
  mutate(p_votes = round(votes*100/sum(votes), 2)) |>
  na.omit()


filter(returns, office=="President") |>
  ggplot(aes(x=year, y=p_votes, color=party)) +
  geom_line() + 
  theme_morse()

Historical election returns (national and state-level, 1976-2020)

#----- House popular votes -----#
house = read.csv("data/mit/1976-2020-house.csv") |>
  
  # Recode parties
  mutate(party = case_when(
    party=="DEMOCRAT" | party=="DEMOCRATIC-FARMER-LABOR" ~ "Democrats",
    party=="REPUBLICAN" ~ "Republicans",
    is.na(party) ~ "Other",
    TRUE ~ "Other"
  )) |>
  
  # Organize data
  filter(!special) |>
  select(year, office, state=state_po, party, votes=candidatevotes)


#----- Senate popular votes -----#
sen = read.csv("data/mit/1976-2020-senate.csv") |>

  # Recode parties
  mutate(party = case_when(
    party_simplified=="DEMOCRAT" ~ "Democrats",
    party_simplified=="REPUBLICAN" ~ "Republicans",
    is.na(party_simplified) ~ "Other",
    TRUE ~ "Other"
  )) |>
  
  # Organize data
  filter(!special) |>
  select(year, office, state=state_po, party, votes=candidatevotes)

# Cumulative popular votes over three consecutive election cycles
# (a full Senate turnover), computed within each party
sen2 = sen |>
  group_by(party, year) |>
  summarise(votes = sum(votes, na.rm=TRUE)) |>  # result stays grouped by party
  mutate(cum_votes = lag(votes, 2) + lag(votes) + votes)


#----- Presidential popular vote -----#
pres = read.csv("data/mit/1976-2020-president.csv") |>
  
  # Recode party
  mutate(party = case_when(
    party_simplified=="DEMOCRAT" ~ "Democrats",
    party_simplified=="REPUBLICAN" ~ "Republicans",
    TRUE ~ "Other"
  )) |>
  
  # Organize data
  select(year, office, party, votes=candidatevotes)

  
#----- Popular vote totals -----#
popvotes = bind_rows(house, pres) |>
  
  # Aggregate party vote totals
  group_by(year, office, party) |>
  summarise(votes = sum(votes, na.rm=TRUE)) |>
  
  # Determine winner
  group_by(year, office) |>
  mutate(total = sum(votes),
         share = votes/total*100,
         won = (share==max(share))) |>
  
  # Organize data
  filter(party!="Other", year>1990, won) |>
  mutate(office = ifelse(office=="US HOUSE", "House", "President"),
         lab = paste0(toupper(str_sub(party, 1, 3)), "\n",
                      round(share), "%")) |>
  select(year, office, party, share, lab)


#----- Popular vote totals (House, Senate, and president) -----#
popvotes2 = bind_rows(house, sen, pres) |>
  
  # Aggregate party vote totals
  group_by(year, office, party) |>
  summarise(votes = sum(votes, na.rm=TRUE)) |>
  
  # Determine winner
  group_by(year, office) |>
  mutate(total = sum(votes),
         share = votes/total*100,
         won = (share==max(share))) |>
  
  # Organize data
  filter(party!="Other", year>1990, won) |>
  mutate(office = case_when(office=="US HOUSE" ~ "House",
                            office=="US SENATE" ~ "Senate",
                            TRUE ~ "President"),
         lab = paste0(toupper(str_sub(party, 1, 3)), "\n",
                      round(share), "%")) |>
  select(year, office, party, share, lab)

RCV elections (local district-level, 2012-2023)

# Burlington, VT
bvt = read.csv("data/fairvote/rcv2.csv") |>
  filter(Jurisdiction=="Burlington", Year>2012) |>
  mutate(Election.Date = lubridate::mdy(Election.Date),
         Winner = str_to_title(Winner),
         Party = ifelse(Party=="Democratic", "Democrat", Party)) |>
  rename(Date=Election.Date)

Parties in state legislatures (chamber-level, 2011-2023)

#----- Load datasets -----#
ncsl = list()
for (i in 2011:2023) {
  filename = paste0("data/ncsl/Legis_Control_", i, ".xlsx")
  ncsl[[length(ncsl)+1]] = import(filename, skip=1)
}
names(ncsl) = paste0("x", 2011:2023)


#----- Put data together -----#
ncsl_full = ncsl |>
  
  # Remove non-states and put states into single dataset
  map(filter, row_number()<51) |>
  bind_rows(.id="year") |>
  
  # Clean up dataset
  mutate(year = as.numeric(gsub("x", "", year))) |>
  select(year, state=STATE, 
         senate_dem="Senate\r\nDem.", senate_rep="Senate\r\nRep.",
         senate_other="Senate\r\nother", senate_total="Total\r\nSenate",
         house_dem="House\r\nDem.", house_rep="House\r\nRep.",
         house_other="House\r\nother", house_total="Total\r\nHouse") |>
  filter(state != "Nebraska") |>
  
  # Clean data
  mutate(across(!year:state, ~gsub("[0-9]v|V|u", "", .x)),
         across(!year:state, ~gsub(",| ", "", .x)),
         across(!year:state, ~ifelse(is.na(.x) | .x=="" | .x==" ", 0, .x)),
         across(!year:state, as.numeric)) |>
  
  # Calculate seat shares
  mutate(senate_pct = 100*senate_other/senate_total,
         house_pct = 100*house_other/house_total,
         senate_dem_seats = 100*senate_dem/senate_total,
         senate_rep_seats = 100*senate_rep/senate_total,
         house_dem_seats = 100*house_dem/house_total,
         house_rep_seats = 100*house_rep/house_total)


#----- State partisan composition -----#
ncsl_parties = ncsl_full |>
  select(year, state, contains("seats")) |>
  mutate(state = fips(gsub(" |\\*", "", state), to="Abbreviation")) |>
  pivot_longer(!year:state, names_pattern="(.*)_(.*)_seats",
               names_to=c("chamber", "party"), values_to="seats2")
  

#----- Calculate national averages by chamber -----#
ncsl_nat = ncsl_full |>
  group_by(year) |>
  summarise(senate_pct = mean(senate_pct),
            house_pct = mean(house_pct)) |>
  mutate(state = "National average", .after=year)


#----- Calculate national averages of all legislatures -----#
ncsl_nat2 = ncsl_full |>
  select(year, senate_pct, house_pct) |>
  pivot_longer(senate_pct:house_pct, names_to="chamber", values_to="pct") |>
  group_by(year) |>
  summarise(pct = mean(pct)) |>
  mutate(leg = "National average", .after=year)


#----- Final dataset -----#
ncsl_leg = ncsl_full |>
  select(year, state, senate_pct, house_pct) |>
  filter(state %in% c("Alaska", "Mississippi", "Oregon", "Vermont")) |>
  pivot_longer(senate_pct:house_pct, names_to="chamber", values_to="pct") |>
  mutate(chamber = ifelse(chamber=="senate_pct", "Senate", "House"),
         leg = paste(state, chamber)) |>
  filter(leg %in% c("Alaska House", "Mississippi House", "Oregon Senate",
                    "Vermont House", "Vermont Senate")) |>
  select(-chamber, -state)

Parties in state legislatures (chamber-level, 1934-2011)

#------ Legislative partisan composition -----#
st_parties = read.csv("data/klarner/Partisan_Balance.csv") |>
  
  # Clean data
  select(year, state, 
         senate_dem_seats=sen_dem_prop_all, 
         senate_rep_seats=sen_rep_prop_all,
         house_dem_seats=hs_dem_prop_all, 
         house_rep_seats=hs_rep_prop_all) |>
  mutate(state = fips(gsub(" ", "", state), "Abbreviation"),
         across(!year:state, ~100*.x)) |>
  
  # Reshape data
  pivot_longer(!year:state, names_pattern="(.*)_(.*)_seats",
               names_to=c("chamber", "party"), values_to="seats") |>
  
  # Bring in data from NCSL for 2011-present
  merge(ncsl_parties, all.x=TRUE) |>
  mutate(seats = ifelse(is.na(seats) & !is.na(seats2), seats2, seats)) |>
  select(!seats2)

Disproportionality in state legislatures (chamber-level, 1968-2016)

#----- Legislative election returns -----#
load("data/klarner/196slers.Rdata")
st_elects = table |>
  
  # Remove unnecessary data
  filter(year != 0, etype=="g") |>
  select(year, state=sab, chamber=sen, district=dno, party=partyz, vote) |>
  
  # Party totals
  group_by(year, state, chamber, district, party) |>
  summarise(votes = sum(vote, na.rm=TRUE)) |>
  
  # Chamber totals
  group_by(year, state, chamber, party) |>
  summarise(votes = sum(votes)) |>
  group_by(year, state, chamber) |>
  mutate(total = sum(votes),
         vote_share = round(100*votes/total,2),
         state = toupper(state),
         chamber = ifelse(chamber==1, "senate", "house"),
         party = case_when(party=="d" ~ "dem",
                           party=="r" ~ "rep",
                           TRUE ~ NA_character_)) |>
  
  # Clean dataset
  select(-votes, -total) |>
  na.omit()


#----- Disproportionality -----#
st_chambers = st_parties |>
  merge(st_elects, all.x=TRUE) |>
  group_by(year, state, chamber) |>
  summarise(disp = sqrt(0.5 * sum((vote_share - seats)^2))) |>
  na.omit()
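
The `disp` measure above has the form of the Gallagher (least squares) disproportionality index, sqrt(0.5 * sum((V_i - S_i)^2)), where V_i and S_i are each party's vote and seat percentages. A small worked example gives a sense of its scale. The numbers are made up, and the helper name `gallagher` is hypothetical rather than part of the original code.

```r
# Least squares disproportionality for one chamber.
# Inputs are percentages (0-100), matching vote_share and seats above.
gallagher = function(vote_share, seat_share) {
  sqrt(0.5 * sum((vote_share - seat_share)^2))
}

# Hypothetical two-party chamber: 55% of votes yields 65% of seats
votes = c(dem = 55, rep = 45)
seats = c(dem = 65, rep = 35)
gallagher(votes, seats)  # 10

# A perfectly proportional chamber scores zero
gallagher(votes, votes)  # 0
```

In a two-party chamber the index equals the winning party's seat bonus in percentage points, which makes values easy to interpret.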

State politics (state-level, 1916-2020)

# Legislative polarization
st_pol = read.delim("data/sm/shor-mccarty.tab") |>
  select(year, state=st, senate_dem=sen_dem, senate_rep=sen_rep, 
         house_dem=hou_dem, house_rep=hou_rep,
         senate_diffs=s_diffs, house_diffs=h_diffs)

# Voter turnout
st_turnout = read.csv("data/mcdonald/turnout.csv") |>
  select(year=Year, state=State, turnout=VEP.Total.Ballots.Counted,
         vep=Voting.Eligible.Population..VEP.) |>
  mutate(state = fips(gsub(" ", "", state), to="Abbreviation"),
         turnout = as.numeric(gsub("%", "", turnout)),
         vep = as.numeric(gsub(",", "", vep))) |>
  filter(!is.na(state))

# Congressional party delegation
fed_party = id_members |>
  filter(chamber=="House") |>
  group_by(year, state, party) |>
  summarise(del_center = mean(dim1, na.rm=TRUE)) |>
  mutate(year = year-1,
         party = ifelse(party=="Democrats", "dem", "rep"))

# Congressional state delegation
fed_state = id_members |>
  filter(chamber=="House") |>
  group_by(year, state) |>
  summarise(del_center = mean(dim1, na.rm=TRUE)) |>
  mutate(year = year-1)

Parties in national legislatures (country-level, present)

# Setup
library(rvest)

# Scrape the main page for country links
main_url = "https://data.ipu.org/elections?region=&structure=any__lower_chamber&form_build_id=form-LM5XNmMSpTc7Fo099DiBo6Tpjxakn163-9q8jmdpASY&form_id=ipu__utils_filter_form&op=Show+items"
country_links = read_html(main_url) |>
  html_nodes("table") |>  # Select all tables
  html_nodes("a") |>
  html_attr("href") |>
  unique()

# Prefix with domain
country_links = paste0("https://data.ipu.org", country_links)[-c(1:6)]

# Initialize dataframe
parl_parties_full = tibble(`Political group`=NA, Total=NA, Country=NA)

# Scrape each country's data
for (link in country_links) {
  
  # Read webpage
  page = read_html(link)
  
  # Get party composition table
  parties = page |>
    html_nodes(".panel-pane:contains('Parties or coalitions') table") |>
    html_table() |> first()
  
  # Determine if successful
  grabbed = !is.null(parties)
  
  if (grabbed) {
    
    # Remove extra variables
    if (ncol(parties) > 2) {
      parties = parties[,1:2]
    }
    
    # Add country name
    parties$Country = page |> html_nodes("h2") |> html_text() |> first()
    
    # Add to full data
    parl_parties_full = rbind(parl_parties_full, parties)
    
  }
  
}

# Effective number of parties formula
enp = function(p) 1/sum(p^2)

# Finalize data
parl_parties = parl_parties_full |>
  select(country=Country, party=`Political group`, seats=Total) |>
  group_by(country) |>
  mutate(share = seats*100/sum(seats)) |>
  na.omit()

# Country-level summaries
party_comp = parl_parties |>
  mutate(abb = countrycode(country, "country.name", "iso3c")) |>
  group_by(abb) |>
  summarise(nparties = enp(share/100),
            nparties5 = length(which(share >= 5)),
            nparties10 = length(which(share >= 10)),
            avg_share = mean(share, na.rm=TRUE))

# Save data
save(parl_parties, party_comp, file="data/parline/parline.Rdata")
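
The `enp` function above is the Laakso-Taagepera effective number of parties, which takes seat shares as proportions summing to one (hence the `share/100` in the call above). The sketch below repeats the one-line definition and applies it to made-up seat distributions.

```r
# Effective number of parties (same definition as above)
enp = function(p) 1/sum(p^2)

# Hypothetical seat distributions, as proportions
even_two  = enp(c(0.5, 0.5))         # two equal parties: 2
even_four = enp(rep(0.25, 4))        # four equal parties: 4
uneven    = enp(c(0.5, 0.3, 0.2))    # 1/0.38, about 2.63

round(c(even_two, even_four, uneven), 2)
```

Equal-sized parties return the raw party count, while unequal shares pull the value below it, which is why the measure is preferred to a simple count of parties holding seats.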

Comparative institutions (country-level, 2020)

# CPDS data
cpds = read_dta("data/cpds/cpds.dta") |>
  mutate(parties = round(effpar_leg),
         pop = pop*1000) |>
  filter(year==2020) |>
  select(abb=iso, country, parties, pop)

# CPDS time series data
cpds2 = read_dta("data/cpds/cpds.dta") |>
  mutate(single = case_when(gov_type %in% c(1, 4) ~ "Single-party executive",
                            gov_type %in% c(2, 3, 5) ~ "Multiparty executive",
                            TRUE ~ NA_character_)) |>
  filter(!is.na(single)) |>
  group_by(country) |>
  mutate(change = 1,
         change = ifelse(single!=lag(single), lag(change)+1, lag(change))) |>
  select(abb=iso, country, year, single, change)

# DPI data
dpi = read_dta("data/dpi/DPI2020.dta") |>
  filter(gov1me!="NA", system!=-999) |>
  mutate(system = as.character(as_factor(system)),
         gov1me = ifelse(gov1me=="NA" | gov1me=="-999", 0, 1),
         gov2me = ifelse(gov2me=="NA" | gov2me=="-999", 0, 1),
         gov3me = ifelse(gov3me=="NA" | gov3me=="-999", 0, 1),
         govoth = ifelse(govoth<1, 0, 1),
         opp1me = ifelse(opp1me=="NA" | opp1me=="-999", 0, 1),
         opp2me = ifelse(opp2me=="NA" | opp2me=="-999", 0, 1),
         opp3me = ifelse(opp3me=="NA" | opp3me=="-999", 0, 1),
         oppoth = ifelse(oppoth<1, 0, 1),
         nparties = gov1me + gov2me + gov3me + govoth,
         opp_parties = opp1me + opp2me + opp3me + oppoth,
         all_parties = nparties + opp_parties,
         checks = as.numeric(checks)) |>
  select(abb=ifs, year, system, numvote, nparties, opp_parties,
         all_parties, checks, party_share=gov1vote)

# Semi-presidential data
# (note: this reassigns `sp`, replacing the pooled-SD helper defined earlier)
sp = read.csv("data/sp/SP dataset v2.0.csv") |>
  mutate(semi = case_when(pp1==1 ~ "Premier-presidential",
                          pp1==2 ~ "President-parliamentary",
                          TRUE ~ NA_character_)) |>
  select(abb=country_text_id, year, semi)

# Semi-presidential data in most recent year
sp16 = sp |>
  filter(year==max(year)) |>
  select(-year)

# Democracy data
vdem = read.csv("data/vdem/vdem-v13.csv") |>
  select(abb=country_text_id, country_name, year, pop=e_pop, 
         libdem=v2x_libdem, electdem=v2x_polyarchy, polity=e_p_polity,
         stability=e_wbgi_pve, corrupt=v2x_corr, conflict=e_miinterc,
         gdppc=e_gdppc, lifeexp=e_pelifeex)

# Civil liberties
fh = read.csv("data/fh/fh.csv") |>
  mutate(abb = countryname(Country, "iso3c")) |>
  select(abb, civil=Civil.Liberties)

# State fragility
fsi = read.csv("data/fsi/fsi.csv") |>
  mutate(abb = countryname(Country, "iso3c")) |>
  select(abb, fragility=Total)

# Economic freedom
hf = read.csv("data/hf/hf.csv") |>
  filter(X2022.Score != "N/A") |>
  mutate(abb = countryname(Country.Name, "iso3c"),
         econ = as.numeric(X2022.Score)) |>
  select(abb, econ)

# Income inequality data
wid2 = wid |>
  mutate(abb = countrycode(id, "vdem", "iso3c"))

# Ethnic fractionalization data
hief = read.csv("data/hief/Historical_Index_of_Ethnic_Fractionalisation_Dataset.csv") |>
  mutate(Country = ifelse(Country=="Democratic Republic of Vietnam", 
                          "Vietnam", Country), # rewrite name to prevent error
         abb = countrycode(Country, "country.name", "iso3c", nomatch=340)) |>
  select(abb, year=Year, ethnic=EFindex) |>
  arrange(abb, year)

# Electoral systems
elect = read.csv("data/idea/idea.csv") |>
  select(abb=ISO.Code.3166, 
         year=Year,
         elections=Electoral.system.family) |>
  mutate(elections = case_when(
    elections=="Plurality/Majority" ~ "Majoritarian elections", 
    elections=="Mixed" ~ "Mixed methods",
    elections=="PR" ~ "Proportional representation")) |>
  arrange(abb, year) |>
  group_by(abb) |>
  filter(year == max(year)) |>
  select(!year)

# Largest party seat shares
vparties = read.csv("data/vdem/parties/V-Dem-CPD-Party-V2.csv") |>
  group_by(country_name) |>
  filter(year==max(year)) |>
  filter(v2paseatshare==max(v2paseatshare)) |>
  ungroup() |>
  select(abb=country_text_id, seats=v2paseatshare, votes=v2pavote)

# All parties' seat shares
vparties2 = read.csv("data/vdem/parties/V-Dem-CPD-Party-V2.csv") |>
  group_by(country_name) |>
  filter(year==max(year)) |>
  ungroup() |>
  filter(!is.na(v2paseatshare), v2paseatshare>=5) |>
  select(abb=country_text_id, party=v2paenname, all_seats=v2paseatshare, 
         all_votes=v2pavote) |>
  merge(elect) |>
  pivot_longer(all_seats:all_votes, names_to="variable", values_drop_na=TRUE) |>
  mutate(measure = as.character(
           factor(variable, levels=c("all_seats", "all_votes"), 
                  labels=c("Seat shares of all parties",
                           "Vote shares of all parties"))),
         u = as.character(factor(variable,levels=c("all_seats", "all_votes"), 
                                 labels=c("%", "%")))) |>
  group_by(elections, variable, measure, u) |>
  summarise(n = length(unique(abb)),
            np = n(),
            avg = mean(value),
            median = median(value),
            se = sd(value)/sqrt(np)) |>
  ungroup() |>
  na.omit()

# Political parties data
manifesto = read.csv("data/manifesto/MPDS2023a.csv") |>
  mutate(abb = countrycode(countryname, "country.name", "iso3c"),
         year = format(lubridate::dmy(edate), "%Y"),
         share = 100*absseat/totseats) |>
  group_by(abb) |>
  filter(year==max(year)) |>
  filter(share==max(share)) |>
  select(abb, share)

# Merge political data together
inst = dpi |>
  filter(year==2020) |>
  merge(cpds, all.y=TRUE) |>
  merge(manifesto, all.x=TRUE) |>
  merge(sp16, all.x=TRUE) |>
  mutate(system = ifelse(!is.na(semi), semi, system)) |>
  arrange(pop)

# Merge veto player data
veto = vdem |>
  merge(dpi, all.x=TRUE) |>
  merge(sp, all.x=TRUE) |>
  merge(fh, all.x=TRUE) |>
  merge(fsi, all.x=TRUE) |>
  merge(hf, all.x=TRUE) |>
  merge(wid2, all.x=TRUE) |>
  merge(hief, all.x=TRUE) |>
  group_by(country_name) |>
  fill(semi, system, pop, top1, ethnic, .direction="down") |>
  mutate(system = ifelse(!is.na(semi), semi, system)) |>
  filter(!is.na(country_name), !is.na(libdem), 
         system!="Assembly-Elected President")

# Yearly summaries
veto2 = veto |>
  group_by(year, system) |>
  summarise(y = mean(libdem),
            n = n(),
            se = sd(libdem)/sqrt(n),
            y0 = y - se,
            y1 = y + se) |>
  ungroup()

# Party composition of parliaments
load("data/parline/parline.Rdata")

# Variable labels
measures = c(nparties="Effective number of parties",
             nparties5="Number of parties with at least 5% of seats",
             nparties10="Number of parties with at least 10% of seats",
             avg_share="Seat shares of all parties",
             seats="Seat shares of the largest party",
             votes="Vote shares of the largest party")
u = c("", "", "", "%", "%", "%")

# Election system labels
el_abb = c("Plurality", "Mixed", "PR")
el_systems = c(
  paste("<b>Plurality voting</b><br>",
        "each district has one seat, which goes to<br>",
        "the candidate with the most votes"),
  paste("<b>Mixed electoral methods</b><br>",
        "plurality voting for some seats and<br>",
        "proportional representation for others"),
  paste("<b>Proportional representation</b><br>",
        "districts have multiple seats, which are<br>",
        "split among the parties based on how many<br>",
        "votes they received")
)

# Parties and party sizes
all_parties = party_comp |>
  merge(vparties, all=TRUE) |>
  
  # Reshape data
  pivot_longer(!abb, names_to="variable", values_drop_na=TRUE) |>
  
  # Bring in electoral system classifications
  merge(elect) |>
  
  # Collapse by electoral system
  group_by(elections, variable) |>
  summarise(n = length(unique(abb)),
            np = n(),
            avg = mean(value),
            median = median(value),
            se = sd(value)/sqrt(np)) |>
  ungroup() |>
  
  # Add descriptions
  mutate(measure = as.character(factor(variable, levels=names(measures), labels=measures)),
         u = as.character(factor(variable, levels=names(measures), labels=u)),
         elections2 = factor(elections, labels=el_systems),
         elections = factor(elections, labels=el_abb)) |>
  na.omit()

# Merge electoral system and parties data
wta = dpi |>
  filter(year==2020) |>
  select(abb, system, all_parties) |>
  merge(vparties, all.x=TRUE) |>
  merge(party_comp, all.x=TRUE) |>
  merge(elect) |>
  merge(sp16, all.x=TRUE) |>
  fill(system) |>
  mutate(system = ifelse(!is.na(semi), "Semi-presidential", system),
         country = countrycode(abb, "iso3c", "country.name"), .after=abb,
         elections = as.factor(elections)) |>
  filter(!is.na(elections))

Political, economic, and social indicators in the OECD (country-level, 2020)

#----- OECD COUNTRIES -----#
oecd = read.csv("../apps/oecd/data/oecd.csv") |>
  mutate(abb = countryname(country, "iso3c"),
         pop = pop2022*1000,
         year = 2022) |>
  select(abb, country, year, pop)


#----- DEMOCRACY DATA -----#
vdem = read.csv("../apps/oecd/data/political/vdem.csv") |>
  select(id=country_id, abb=country_text_id, country_name, year, libdem=v2x_libdem) |>
  filter(abb %in% oecd$abb) |>
  merge(wid, by=c("id", "year"), all.x=TRUE) |>
  select(abb, year, libdem, top1)


#----- INDICATOR DATA -----#

# Rule of law
wjp = read.csv("../apps/oecd/data/justice/wjp.csv") |>
  mutate(year=2022) |> 
  select(abb=Country.Code, year, law=WJP.Rule.of.Law.Index..Overall.Score)

# Civil liberties
fh = read.csv("../apps/oecd/data/justice/fhts.csv") |>
  mutate(abb = countryname(Country, "iso3c"),
         across(!abb, ~ifelse(.x=="-", NA, .x))) |>
  select(abb, contains("CL"))
names(fh) = c("abb", paste0("x", 1973:2021))
fh = fh |>
  pivot_longer(!abb, names_prefix="x", names_to="year", values_to="civil")
fh2 = read.csv("../apps/oecd/data/justice/fh.csv") |>
  mutate(abb = countryname(Country, "iso3c")) |>
  select(abb, civil2=Civil.Liberties)

# Peace
gpi = read.csv("../apps/oecd/data/peace/gpi.csv") |>
  mutate(year=2022) |>
  select(abb=iso3c, year, peace, safety)

# State fragility
fsi = read.csv("../apps/oecd/data/defense/fsi.csv") |>
  mutate(abb = countryname(Country, "iso3c"),
         year = 2022) |>
  select(abb, year, fragility=Total, foreign=X1..External.Intervention)

# Human development
hdi = read.csv("../apps/oecd/data/welfare/undpts.csv") |>
  select(abb=iso3, hdi_1990:le_2021) |>
  pivot_longer(!abb, names_sep="_", names_to=c("name", "year")) |>
  pivot_wider(id_cols=c(abb, year), names_from=name, values_from=value) |>
  rename(life=le)

# Health spending
health = read.csv("../apps/oecd/data/welfare/health.csv") |>
  filter(SUBJECT=="TOT", MEASURE=="USD_CAP") |>
  select(abb=LOCATION, year=TIME, health=Value)

# Poverty rates
poverty = read.csv("../apps/oecd/data/welfare/poverty.csv") |>
  filter(SUBJECT=="TOT") |>
  select(abb=LOCATION, year=TIME, poverty=Value)

# Economic freedom
hf = read.csv("../apps/oecd/data/liberty/hf.csv") |>
  filter(X2022.Score != "N/A") |>
  mutate(abb = countryname(Country.Name, "iso3c"),
         econ = as.numeric(X2022.Score),
         year = 2022) |>
  select(abb, year, econ)


#----- INSTITUTIONS DATA -----#
elect = read.csv("../apps/oecd/data/political/idea.csv") |>
  select(abb=ISO.Code.3166, 
         year=Year,
         elections=Electoral.system.family,
         leg=Electoral.system.for.national.legislature,
         pres=Electoral.system.for.the.president,
         reps=Legislative.size..voting.members.) |>
  
  # Recode variables
  mutate(elections = case_when(elections=="Plurality/Majority" ~ "Majoritarian elections", 
                               elections=="Mixed" ~ "Mixed electoral methods",
                               elections=="PR" ~ "Proportional representation"),
         leg = case_when(leg=="AV" ~ "Ranked-choice voting",
                         leg=="List PR" ~ "Proportional representation",
                         leg=="FPTP" ~ "Winner-take-all elections",
                         leg=="TRS" ~ "Two-round system",
                         leg=="MMP" ~ "Mix of systems",
                         leg=="STV" ~ "Single transferable vote",
                         leg=="Parallel" ~ "Mix of systems"),
         pres = case_when(pres=="TRS" ~ "Two-round system",
                          pres=="FPTP" ~ "Winner-take-all elections",
                          pres=="STV" ~ "Single transferable vote",
                          TRUE ~ NA_character_)) |>
  
  # Calculate population per representative
  merge(select(oecd, abb, year, pop), by=c("abb", "year"), all.x=TRUE) |>
  filter(abb!="") |>
  mutate(reps_pop = round(pop/reps)) |>
  select(!pop & !reps)

# DPI data
dpi = read_dta("../apps/oecd/data/political/DPI2020.dta") |>
  filter(ifs %in% oecd$abb) |>
  mutate(exec = as.character(as_factor(system)),
         exec = ifelse(exec=="Assembly-Elected President",
                       "Presidential", exec)) |>
  select(abb=ifs, year, exec, mdmh)

# CPDS data
cpds = read_dta("../apps/oecd/data/political/cpds.dta") |>
  mutate(parties = round(effpar_leg),
         federal = ifelse(as.character(fed)=="0", "Unitary", "Federal")) |>
  select(abb=iso, year, parties, disp=dis_gall, federal)


#----- COMBINE ALL DATA -----#
all.ts = vdem |>
  merge(select(oecd, abb, year, pop), by=c("abb", "year"), all=TRUE) |>
  merge(wjp, by=c("abb", "year"), all.x=TRUE) |>
  merge(fh, by=c("abb", "year"), all.x=TRUE) |>
  merge(gpi, by=c("abb", "year"), all.x=TRUE) |>
  merge(fsi, by=c("abb", "year"), all.x=TRUE) |>
  merge(hdi, by=c("abb", "year"), all.x=TRUE) |>
  merge(health, by=c("abb", "year"), all.x=TRUE) |>
  merge(poverty, by=c("abb", "year"), all.x=TRUE) |>
  merge(hf, by=c("abb", "year"), all.x=TRUE) |>
  merge(dpi, by=c("abb", "year"), all.x=TRUE) |>
  merge(elect, by=c("abb", "year"), all.x=TRUE) |>
  merge(cpds, by=c("abb", "year"), all.x=TRUE) |>
  merge(select(oecd, abb, country), all=TRUE) |>
  select(abb, country, year, everything())

# Cross-sectional subset
oecd22 = all.ts |>
  filter(year==2022) |>
  mutate(continent = countrycode(abb, "iso3c", "continent"))

#save(oecd22, file="data/oecd22.Rdata")

Political parties around the world (party-level, 2020)

#----- Party position scores -----#
elff = read.csv("data/manifesto/partypos-summaries.csv") |>
  mutate(edate = lubridate::ymd(edate)) |>
  select(country, edate, party, econlr, authlib)


#----- Party manifesto data -----#
mp.ts = read.csv("data/manifesto/MPDS2023a.csv") |>
  mutate(edate = lubridate::dmy(edate)) |>
  
  # Merge party position scores
  merge(elff) |>
  
  # Update variables
  mutate(year = as.numeric(format(edate, "%Y")),
         seats = absseat/totseats) |>
  
  # Filter to parties from 1975-present
  filter(year>=1975, seats>.05) |>
  select(year, party, partyname, country=countryname, seats,
         econlr, authlib)


#----- Cross-sectional subset -----#
mp = mp.ts |>
  
  # Filter to the most recent data for each party since 2000
  group_by(party, country) |>
  filter(year>=2000, year==max(year), seats>=.05) |>
  ungroup() |>
  select(year, party, partyname, country, seats,
         econlr, authlib)


#----- Weighting means with population data -----#
load("data/vdem/pops.Rdata")
mp0 = mp |>
  
  # Get country codes
  mutate(country = ifelse(country=="Northern Ireland", "United Kingdom", country),
         code = countrycode(country, "country.name", "cown"),
         code = ifelse(country=="Serbia", 345, code)) |>
  
  # Bring in population data and weight each party
  merge(pops, all.x=TRUE, all.y=FALSE) |>
  mutate(w = seats*log(pop)) |>
  
  # Weighted means and SDs
  summarise(x_bar = weighted.mean(econlr, w),
            s_x = weighted.sd(econlr, w),
            y_bar = weighted.mean(authlib, w),
            s_y = weighted.sd(authlib, w))


# Sample
m = length(unique(mp.ts$country))
n = length(unique(mp.ts$party))
years = min(mp.ts$year):max(mp.ts$year)
df_blank = merge(tibble(year=years), tibble(party=unique(mp.ts$party)))


#----- Function for ranking party ideologies -----#
ranker = function(x, econ=TRUE) {
  
  # Percentile
  r = rank(x, ties.method="first")
  n = length(x)
  p = r/n
  
  # Rank
  r2 = ifelse(p<.5, round(p*100), round(100-(p*100)))
  
  # Left or right label
  side = ifelse(p<.5, "leftist", "conservative")
  if (!econ) {side = ifelse(p<.5, "libertarian", "authoritarian")}
  
  # More specific label
  adj = ifelse(econ, "Economically ", "Socially ")
  id7 = case_when(p <= .1 ~ "far-left",
                  p <= .35 ~ "left-wing",
                  p <= .45 ~ "center-left",
                  p <= .55 ~ "centrist",
                  p <= .65 ~ "center-right",
                  p <= .9 ~ "right-wing",
                  p <= 1 ~ "far-right" )
  
  # Output
  paste0("<b>", adj, id7, "</b><br>top ", r2, "% most ", side, " parties")
  
}


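#----- Function to hold large seat shares for the next 20 years -----#
hold_seats = function(x) {
  last_max = 1        # index of the most recent increase
  until = length(x)   # last index at which values are still held
  for (i in seq_along(x)[-1]) { # skips the loop when x has one element
    if ((x[i] > x[i-1]) && (i <= until)) {
      # New increase: remember it and hold it for the next 20 rows
      last_max = i
      until = i + 20
    } else if (i <= until) {
      # Still inside the holding window: carry the held value forward
      x[i] = x[last_max]
    }
  }
  return(x)
}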


#----- Prepare data -----#
mp2 = mp.ts |>
  filter(year>=1975) |>
  
  # Store each party's data until the next election
  mutate(election=TRUE) |>
  merge(df_blank, all=TRUE) |>
  arrange(party, year) |>
  group_by(party) |>
  mutate(seats = ifelse(is.na(seats) & row_number()==1, 0, seats),
         t0 = ifelse(is.na(election), FALSE, TRUE),
         seats = ifelse(is.na(seats) & !lag(t0,1) & !lag(t0,2) & !lag(t0,3) &
                       !lag(t0,4), 0, seats),
         election = ifelse(election, year, NA)) |>
  fill(party:election, .direction="downup") |>
  select(-t0) |>
  
  # Keep only the last row when a party has two elections in one year
  group_by(year, party) |>
  filter(row_number()==max(row_number())) |>
  
  # Get country codes
  mutate(code = countrycode(country, "country.name", "cown"),
         code = ifelse(country=="Serbia", 345, code),
         country = ifelse(country=="German Democratic Republic",
                          "Germany (West)", country)) |>
  
  # Population data
  merge(pops, all=TRUE) |>
  arrange(country, year) |>
  group_by(country) |>
  fill(pop, .direction="downup") |>
  filter(!is.na(partyname)) |>
  select(-code) |>
  
  # Radius sizes
  mutate(pop = ifelse(is.na(pop), mean(pop, na.rm=TRUE), pop),
         w = seats*sqrt(pop),
         seats = seats*100) |>
  ungroup() |>
  
  # Hold large radius sizes for the next 20 years
  group_by(party) |>
  mutate(w = hold_seats(w)) |>
  ungroup() |>
  
  # Rescale
  mutate(econlr = (econlr - mp0$x_bar)/mp0$s_x,
         authlib = (authlib - mp0$y_bar)/mp0$s_y) |>
  
  # Summary
  group_by(year) |>
  mutate(x_sum = ranker(econlr),
         y_sum = ranker(authlib, econ=FALSE),
         partyname = wrapper(partyname, 50),
         usa = case_when(
           country=="United States" & partyname=="Democratic Party" ~ "Democrats",
           country=="United States" & partyname=="Republican Party" ~ "Republicans",
           TRUE ~ ""
         )) |>
  ungroup() |>
  
  # Reorder data
  arrange(country, party, year)


#----- Time variant data -----#
mp3a = mp2 |>
  group_by(party) |>
  do(sequence = list_parse(
    select(., x=econlr, y=authlib, z=w, x_sum, y_sum, seats, election)))

# Merge with time variant data
mp3 = mp2  |>
  group_by(party) |>
  summarise(country = last(country),
            partyname = last(partyname),
            usa = last(usa)) |>
  merge(mp3a) |>
  ungroup()

Political, economic, and social indicators around the world (country-level, 1946-2022)

#----- Quality of Government data -----#
qog = read.csv("../../book/data/qog/qog_std_ts_jan23.csv") %>%
  
  # Select variables
  select(code=ccodecow, year, distmag=gol_adm,
         parl=gtm_parl, monarchy=br_mon, pop=wdi_pop,
         reps=ideaesd_lsvm, nparties=cpds_enps,
         incineq=top_top1_income_share, hdi=undp_hdi,
         particip=van_part) %>%
  
  # Fix variables
  mutate(reps = (reps/pop)*1000000,
         parl = (parl==2),
         monarchy = (monarchy==1)) %>%
  select(-pop)


#----- Political outcomes data -----#
vdem = read.csv("../../book/data/vdem/vdem-v13.csv") %>%
  select(code=COWcode, year, libdem=v2x_libdem, civil=v2x_civlib, 
         stability=e_wbgi_pve, gdp=e_gdppc, lifeexp=e_pelifeex,
         edu=e_peaveduc, land=e_area, pop=e_pop, oil=e_total_oil_income_pc,
         urban=e_miurbani, dom_conflict=e_miinterc)

# Population data
pops = vdem %>% select(code, year, pop)
#save(pops, file="../../book/data/vdem/pops.Rdata")


#----- Ethnic fractionalization data -----#
hief = read.csv("../../book/data/hief/Historical_Index_of_Ethnic_Fractionalisation_Dataset.csv") %>%
  mutate(Country = ifelse(Country=="Democratic Republic of Vietnam", 
                          "Vietnam", Country), # rewrite name to prevent error
         code = countrycode(Country, "country.name", "cown", nomatch=340)) %>%
  select(code, year=Year, ethnic=EFindex) %>%
  arrange(code, year)

Constitutional design (country-level, 1789-2021)

#----- Constitutions data -----#
ccp = read.csv("../../book/data/ccp/ccpcnc_v4_small.csv") %>%
  
  # Select variables
  select(code=cowcode, year, const=systid, systyear,
         length, hoselsys, lhterm, supterm, 
         amndappr_8, hosterml, housenum, lhselect_3, uhselect_3,
         judind, fedunit, part, referen, oversght, compvote, 
         camppubf, jury, equal, freerel, socecon, assem) %>%
  
  # Remove constitutions without sufficient data
  filter(!is.na(length)) %>%
  
  # Recode data
  mutate(age = 2023 - systyear, .before=length,
         hoselsys = case_when(hoselsys<4 ~ 1,
                              hoselsys<6 ~ 2,
                              hoselsys<8 ~ 3,
                              hoselsys>89 ~ 4,
                              TRUE ~ NA_real_),
         lhterm = case_when(lhterm<5 ~ 1,
                            lhterm<8 ~ 2,
                            lhterm>7 ~ 3,
                            TRUE ~ NA_real_),
         supterm = case_when(supterm < 7 ~ 1,
                             supterm < 16 ~ 2,
                             supterm < 90 ~ 3,
                             supterm >= 99 ~ 4,
                             TRUE ~ NA_real_),
         amndappr_8 = (amndappr_8==1),
         hosterml = (hosterml<5),
         housenum = (housenum==2),
         lhselect_3 = (lhselect_3==1),
         uhselect_3 = (uhselect_3==1),
         judind = (judind==1),
         fedunit = (fedunit<3),
         part = (part==1),
         referen = (referen==1),
         oversght = (oversght<4),
         compvote = (compvote==1),
         camppubf = (camppubf==1),
         jury = (jury==1),
         equal = (equal==1),
         freerel = (freerel==1),
         socecon = (socecon==1),
         assem = (assem==1)) %>%
  
  # Rename variables
  rename(amdnappr=amndappr_8, bicameral=housenum, lhelect=lhselect_3,
         uhelect=uhselect_3, federal=fedunit, parties=part, referendum=referen,
         oversight=oversght)


#----- Full dataset -----#
const = ccp %>%
  
  # Merge QoG data
  merge(qog, by=c("code", "year"), all=TRUE) %>%
  
  # Merge V-Dem data
  merge(vdem, by=c("code", "year"), all.x=TRUE) %>%
  
  # Merge ethnic fractionalization data
  merge(hief, by=c("code", "year"), all.x=TRUE) %>%
  
  # Fill in missing values
  group_by(code) %>%
  fill(distmag:ethnic, .direction="downup") %>%
  
  # Grab initial variables
  mutate(stability_t0 = first(stability),
         gdp_t0 = first(gdp),
         incineq_t0 = first(incineq),
         hdi_t0 = first(hdi),
         edu_t0 = first(edu),
         land_t0 = first(land),
         pop_t0 = first(pop),
         oil_t0 = first(oil),
         urban_t0 = first(urban),
         ethnic_t0 = first(ethnic)) %>%
  
  # Keep only the most recent year of each constitution
  group_by(const) %>%
  filter(year==max(year)) %>% 
  
  # Rearrange variables
  select(code, const, year, libdem:lifeexp, nparties:particip,
         age, length, distmag, reps, hoselsys:assem,
         parl, monarchy, contains("t0")) %>%
  
  # Turn variables into dummy variables
  mutate(across(hoselsys:supterm, as.character)) %>%
  dummy_cols(remove_selected_columns=TRUE) %>%
  mutate(across(everything(), as.numeric)) %>%

  # Impute missing data
  filter(!is.na(age)) %>%
  mice(method="cart") %>% complete()

Version history

Current: Version 1.12 (May 25, 2024)

Made minor improvements to Section 2 of the Visual Summary
Made minor improvements to the “Parties of the world’s democracies” visualization

Version 1.11 (March 15, 2024)

Made minor improvements to Chapter 5
Fixed typos throughout
Uploaded the PDF version

Version 1.10 (March 11, 2024)

Added “Advancements made by this work” section to Chapter 1
Rewrote most of Chapter 6
Restructured Chapter 5
Revised the abstract
Made minor improvements throughout

Version 1.00 (February 16, 2024)

Final draft for defense

Visual summary

A Bird’s-Eye View of American Politics

Institutions, ideology, and inequality in the United States and abroad

Below is a summary of this dissertation built around data visualizations rather than text. The six visualizations featured here (two from my master’s thesis and one from each empirical chapter of this dissertation) together tell a story about how the United States became one of the most polarized and unequal societies in the democratic world. Details about each figure can be found in the captions, footnotes, and chapters of this dissertation.

1. Party sorting in the twentieth century

Over the course of the twentieth century, the South went from being the most economically progressive faction in Congress to being the most economically conservative. By 1900, Southern states had pioneered early anti-trust regulations that protected farmers from Northern corporations.¹ The region then supported most of the New Deal, which brought much-needed economic relief to whites while carefully excluding people of color.² As explicit racial discrimination was slowly struck down through laws and court decisions, Southerners lost interest in these programs and became drawn to the conservatism on the other side of the aisle.³

Figure 1.1: This figure, taken from Figure 2 of my master’s thesis (Morse 2021), shows the ideological distributions of members of the House and the Senate based on roll-call voting. Each dot represents a member of Congress. The \(x\)-axis is dimension 1 of Poole and Rosenthal’s (2017) DW-Nominate ideal points, which generally covers issues of socioeconomic welfare; the \(y\)-axis is dimension 2, which generally covers civil rights issues. Members representing states in the South are shown in pink regardless of party affiliation, whereas the dividing lines are based only on party affiliation. The slopes of the dividing lines were computed using linear discriminant analysis, which finds the boundary line that maximizes the separation between the two parties. In most browsers, you can right-click on the graphic and click “Show controls” to enable the pause button and progress bar. Data source: Voteview (Lewis et al. 2021). Produced in R with ggplot2 and gganimate.

Code

#----- Setup -----#
library(tidyverse)
library(ggplot2)
library(gganimate)

# Function to calculate slope of separation line
lda_line = function(dim1, dim2, group) {
  ld = MASS::lda(group ~ dim1 + dim2) # linear discriminant analysis
  b1 = ld$scaling[1] # extract coefficients for dim1
  b2 = ld$scaling[2] # extract coefficients for dim2
  slope = -b1/b2 # calculate slope
  angle1 = atan(slope) * 180/pi # atan() returns radians; convert to degrees
  angle2 = ifelse(angle1<0, 90+angle1, -90+angle1) # rescale for continuity
  list(`b1`=b1, `b2`=b2, `slope`=slope, `angle`=angle2)
}

# Slopes for separation lines
lines = id_members %>%
  group_by(year, chamber) %>%
  summarise(
    
    # Determine majority party
    majority = getmode(party),
    
    # Slope of separation line between parties
    slope = lda_line(dim1, dim2, party)$slope,
    angle = lda_line(dim1, dim2, party)$angle) %>%
  
  arrange(chamber, year) %>%
  ungroup()


#----- Animated plot -----#
anim_cong = ggplot(id_members, aes(x=dim1, y=dim2)) +
  
  # Dots and contours
  geom_point(aes(group=icpsr, color=faction), 
             alpha=0.6, show.legend=FALSE) +
  stat_density_2d(aes(fill=faction), geom="polygon", size=0, alpha=.4) +
  
  # Separation line
  geom_abline(data=lines, aes(intercept=0, slope=angle), 
              size=0.5, alpha=0.5) +
  
  # Year labels
  geom_text(data=lines, 
            aes(label=as.character(year), color=majority, group=chamber),
            x=.9, y=.9, hjust=1, show.legend = FALSE, check_overlap = TRUE, 
            family="Source Sans Pro", size=5) +
  
  # Party name labels
  geom_text(data=id_factions, aes(x=d1_x, y=d2_x, group=faction, label=faction), 
            fontface="bold", color="black", family="Source Sans Pro",
            check_overlap=TRUE) +
  
  # Graph labels
  labs(title="Parties in Congress",
       subtitle=paste("Ideological scores of members of Congress,",
                      "with Southern members shaded separately"), 
       fill="", color="",
       x="Economic Welfare \n\u2190 Liberal – Conservative \u2192", 
       y="Civil Rights \n\u2190 Liberal – Conserv \u2192") +
  
  # Formatting
  scale_color_manual(values=c("steelblue", "tomato3", "orchid")) +
  scale_fill_manual(values=c("steelblue", "tomato3", "orchid")) +
  scale_x_continuous(limits=c(-1,1)) +
  scale_y_continuous(limits=c(-1,1)) +
  facet_wrap(~chamber) +
  theme_morse() + 
  theme(legend.position="none",
        panel.border = element_rect(fill=NA, color="gray25"),
        panel.spacing = unit(10, "pt"),
        axis.title.y = element_text(size=10),
        axis.line.x = element_line(size=.4, color="gray"),
        axis.ticks.x = element_line(size=.4, color="gray"),
        panel.grid.major.x = element_line(size=.3),
        panel.grid.minor.x = element_line(size=.3),
        panel.grid.major.y = element_line(size=.3),
        panel.grid.minor.y = element_line(size=.3),
        plot.background = element_rect(fill="white", color=NA)) +

  # Animation
  transition_time(year) +
  ease_aes('linear') + enter_grow() + exit_shrink()


#----- Display animation -----#
animate(anim_cong, nframes=150, fps=10, end_pause=5, 
        width=888, height=444, units="px", res=100)
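
The separation-line calculation described in the Figure 1.1 caption can be checked on synthetic data. The sketch below is only an illustration: the data frame `toy` and its values are made up (they are not Voteview data), and it calls `MASS::lda` directly rather than the `lda_line` helper above. Because the two simulated parties differ mainly on dimension 1, the fitted boundary should come out close to vertical.

```r
# Hypothetical ideal points for two parties (synthetic, not Voteview data)
set.seed(1)
n = 200
toy = data.frame(
  party = rep(c("D", "R"), each=n),
  dim1 = c(rnorm(n, -0.4, 0.15), rnorm(n, 0.4, 0.15)), # parties differ here
  dim2 = c(rnorm(n, 0, 0.3), rnorm(n, 0, 0.3))         # but not here
)

# Fit the discriminant and extract the coefficient on each dimension
fit = MASS::lda(party ~ dim1 + dim2, data=toy)
b1 = unname(fit$scaling["dim1", 1])
b2 = unname(fit$scaling["dim2", 1])

# The boundary b1*dim1 + b2*dim2 = constant has slope -b1/b2
slope = -b1/b2
angle = atan(slope) * 180/pi # degrees from the horizontal axis

print(c(slope=slope, angle=angle))
```

A steep slope (an angle near 90 or -90 degrees) means the parties are separated almost entirely by dimension 1; a flatter line would indicate that dimension 2 carries more of the division.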

2. Asymmetric polarization

As Southerners switched sides, the Republican Party grew large enough to push some of the most free-market economic agendas on record, according to data extracted from party platforms around the world.⁴ By the 2000s, the party had moderated its economic stances but started moving to the right on social issues.⁵ This created a situation known as asymmetric polarization, where one side has moved farther toward the extremes than the other.⁶ The Democratic Party could easily have become the more extreme side if it had the Republican Party’s electoral advantages. Since smaller states tend to favor Republicans, the party can control the Senate and win the Electoral College while catering to a narrower base.⁷

Figure 1.2: Ideological positions of each political party that held at least 5% of seats in a national legislature at some point between 1990 and 2010. Bubble sizes roughly represent the number of voters supporting each party. They are based on the party’s seat share in the legislature times the square root of the country’s population. The classifications shown when hovering over a bubble indicate whether the party is among the 10% of parties farthest to the left (far-left), the next 10% to 35% (left-wing), the 35% to 45% (center-left), the middle 10% (centrist), the 35% to 45% most conservative (center-right), the 10% to 35% most conservative (right-wing), or the 10% most conservative (far-right). Party position scores were calculated by Elff (2013, 2020) using data from the Manifesto Project (2023). The data were rescaled so that the axes represent the average score of each dimension weighted by bubble size; larger parties and larger countries hold more weight. Produced in R with highcharter.
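
The bubble sizes, rescaling, and classification bands described in the caption can be sketched on a toy example. Everything in the block is hypothetical: the ten parties, their seat shares, populations, and scores are invented, and the weighted standard deviation uses a simple population-style formula that is assumed here rather than taken from the figure’s code.

```r
# Hypothetical parties: seat shares (0-1), country populations, economic scores
parties = data.frame(
  party = paste0("P", 1:10),
  seats = c(.08, .30, .12, .45, .06, .25, .40, .10, .35, .07),
  pop = c(5e6, 8e7, 1e7, 3e8, 5e6, 6e7, 1e8, 1e7, 3e8, 2e7),
  econlr = c(-2.1, -1.4, -0.9, -0.3, -0.1, 0.2, 0.6, 1.0, 1.5, 2.3)
)

# Bubble size: seat share times the square root of the population
parties$w = parties$seats * sqrt(parties$pop)

# Rescale so that 0 is the weighted mean and 1 is one weighted SD
# (assumed formula: weighted average of squared deviations)
x_bar = weighted.mean(parties$econlr, parties$w)
s_x = sqrt(sum(parties$w * (parties$econlr - x_bar)^2) / sum(parties$w))
parties$z = (parties$econlr - x_bar) / s_x

# Percentile rank of each party, then the seven bands from the caption
p = rank(parties$z, ties.method="first") / nrow(parties)
parties$band = as.character(cut(
  p, breaks=c(0, .1, .35, .45, .55, .65, .9, 1),
  labels=c("far-left", "left-wing", "center-left", "centrist",
           "center-right", "right-wing", "far-right")))

print(parties[, c("party", "w", "z", "band")])
```

Weighting by seat share and population pulls the origin toward large parties in large countries, so a small party can sit far from zero without shifting the axes much.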

Code

#----- Setup -----#
library(highcharter)

# Colors
set.seed(6000)
palette = colorize(1:m, colors=c("#662549", "#f15c80", "#f7a35c", "#e4d354",
                                 "#90ed7d", "#2b908f", "#8085e9"))[sample(1:m, m)]


#----- Interactive figure -----#
fig3 = highchart() %>%
  
  # Add data
  hc_add_series(data=mp3, hcaes(group=country),
                type="bubble", minSize=0, maxSize="15%", stickyTracking=FALSE,
                dataLabels=list(enabled=TRUE, format="{point.usa}", allowOverlap=TRUE)) %>%
  
  # Chart options
  hc_chart(marker=list(fillOpacity=.65), spacing=c(0,0,0,0),
           panKey="shift", panning=list(enabled=TRUE, type="xy"),
           zooming=list(type="x", mouseWheel=list(enabled=TRUE),
                        pinchType="xy", key="alt")) %>%
  
  # Motion
  hc_motion(enabled=TRUE, labels=years, series=0:(n-1), 
            magnet=list(step=.05), startIndex=24,
            playIcon="bi bi-play-fill", pauseIcon="bi bi-pause") %>%
  
  # Text
  hc_title(text="Parties of the world's democracies", floating=TRUE, x=25, y=40) %>%
  hc_subtitle(text="Ideology based on policy goals from platforms and manifestos",
              floating=TRUE, x=25, y=60) %>%
  
  # Tooltips
  hc_tooltip(
    headerFormat=NULL, useHTML=TRUE, padding=12,
    pointFormat=paste(
      "<p class='country-year'>", bullet, "{point.country} ({point.election})</p>",
      "<h5 style='margin: 5px 0; text-align: center'><b>{point.partyname}</b></h5>", 
      "<p class='sample'>held <b>{point.seats:.1f}%</b> of the legislature</p>",
      "<div class='d-flex gap-2 flex-fill mt-3'>",
      "<p class='rankings flex-fill' style='text-align: right !important;'>{point.x_sum}</p>",
      "<p class='rankings flex-fill'>{point.y_sum}</p></div>",
      "<p class='sample'>out of", n, "parties in", m, "countries</p>"
    )
  ) %>%
  
  # X axis
  hc_xAxis(
    title=list(enabled=FALSE), crosshair=FALSE, min=-2.25, max=2.25,
    tickLength=0, gridLineWidth=1, gridLineColor="#efefef", 
    labels=list(enabled=FALSE, zIndex=0), 
    plotLines=list(
      list(color="transparent", width=2, value=0, zIndex=2, 
          label=list(text="&#8249; Authoritarian")),
      list(color="#aaa", width=2, value=0, zIndex=2, 
          label=list(text="Libertarian &#8250;", verticalAlign="bottom", y=-10,
                     textAlign="right"))
    )
  ) %>%
  
  # Y axis
  hc_yAxis(
    title=list(enabled=FALSE), crosshair=FALSE, min=-2.5, max=2.5,
    tickLength=0, gridLineWidth=1, gridLineColor="#efefef",
    labels=list(enabled=FALSE, zIndex=0),
    plotLines=list(
      list(color="#aaa", width=2, value=0, zIndex=2, 
           label=list(text="&#8249; Socialist", align="left", x=10)),
      list(color="transparent", width=2, value=0, zIndex=2, 
           label=list(text="Capitalist &#8250;", align="right", x=-175))
    )
  ) %>%
  
  # Legend
  hc_legend(layout="vertical", align="right", verticalAlign="top",
            maxHeight=500, floating=TRUE, title=list(text="Country"),
            backgroundColor="rgba(255,255,255,.75)", borderColor="#dddddd",
            borderRadius=4, borderWidth=1, padding=10, x=-25, y=50) %>%
  
  # Formatting
  hc_morse(scatter=TRUE) %>%
  hc_colors(palette) %>%
  hc_size(height=600) %>%
  hc_exporting(enabled=TRUE, filename="pop_parties",
               buttons=list(contextButton=list(x=-25, y=20))) %>%
  hc_plotOptions(series = list(states = list(inactive = list(opacity=.075)))) %>%
  
  # Responsiveness
  hc_responsive(rules=list(
    list(
      condition = list(maxWidth=992),
      chartOptions = list(
        legend=list(enabled=FALSE),
        yAxis=list(min=-3, max=3, plotLines=list(
          list(color="#aaaaaa", width=2, value=0, zIndex=2, 
               label=list(text="&#8249; Socialist", align="left", x=10)),
          list(color="#aaaaaa", width=2, value=0, zIndex=2, 
               label=list(text="Capitalist &#8250;", align="right", x=-10))
        )),
        xAxis=list(min=-2.5, max=2.5, plotLines=list(
          list(color="#aaaaaa", width=2, value=0, zIndex=2, 
              label=list(text="Libertarian &#8250;", verticalAlign="bottom", y=-10,
                         textAlign="right"))
        )),
        exporting=list(buttons=list(contextButton=list(x=0)))
      )
    )
  ))


#----- Save HTML for external use -----#
#htmltools::save_html(fig3, "figures/world-parties2.html")

3. Ideology and inequality

Soon after the parties began drifting apart on economic issues, the divide between the rich and poor started growing.⁸ The United States is now one of the most polarized and unequal democracies in the world.⁹ Economic inequality naturally deepens over time in capitalist economies without routine maintenance to tax codes, labor regulations, and social programs, which becomes especially difficult as parties polarize.¹⁰ Such updates have tended to fail in the Senate more often than in the House because of the Senate’s filibuster and inequitable representation.¹¹ Partisanship in the Senate is one of the strongest predictors of income inequality; the two series follow very similar paths, with inequality trailing about a decade behind.¹²

Figure 1.3: Partisanship in the Senate alongside the income share of the top 1%. House polarization and the minimum wage gap are hidden but can be enabled by clicking on their labels in the legend. Polarization is measured with an index I developed in my master’s thesis (Morse 2021) which combines the distance between the parties (the difference between the median DW-Nominate ideal points of the two parties) and the homogeneity within each party (the pooled standard deviation of the two parties) using forecastable component analysis (Goerg 2013). Data source: Voteview (Lewis et al. 2021). The minimum wage gap is the difference between the federal minimum wage and the level it would have been at had it kept up with productivity (see Schmitt 2012). Data source: Baker (2020). Income inequality is measured with the income share of the top 1%, meaning the percent of all Americans’ incomes earned by the top 1% of households. Data source: Frank (2014). All series were rescaled to be centered around 0 with a standard deviation of 1. Produced in R with highcharter.

Code

#----- Setup -----#
library(tidyverse)
library(highcharter)

# Ideology and inequality data
polar_long = polar %>%
  merge(inc) %>%
  select(year, sen_pol, hs_pol, top1) %>%
  reshaper("chamber", "variable", "metric", "desc", "hc_color")  # function defined in code on the homepage

# Minimum wage data
gap = read.csv("data/wages.csv") %>%
  mutate(diff = hypothetical - real) %>% # gap between productivity-indexed and real min wage
  select(year, diff) %>% # select variables
  na.omit() %>% # remove blank years
  
  # Prepare data for plotting
  mutate(term=NA, chamber="National", variable="Minimum wage gap",
         desc=paste("Projected min. wage (if it kept up with productivity)",
                    "<br>minus real min. wage"),
         norm = round(scale(diff),3)) %>%
  select(year, term:desc, orig=diff, norm) %>% 
  filter(year<2016, year %in% polar_long$year)


#----- Interactive plot -----#
fig2 = highchart() %>%
  
  # Senate polarization
  hc_add_series(
    filter(polar_long, chamber=="Senate"), 'line', 
    hcaes(x=year, y=norm, group=variable),
    tooltip = list(
      pointFormat = paste(bullet, "{point.desc}: <b>{point.orig:.2f}</b><br>")
    )
  ) %>% 
  
  # Income Inequality
  hc_add_series(
    filter(polar_long, chamber=="National"), 'line', 
    hcaes(x=year, y=norm, group=variable),
    tooltip = list(
      pointFormat = paste(bullet, "{point.desc}: <b>{point.orig:.1f}%</b><br>")
    )
  ) %>% 
  
  # Minimum wage gap
  hc_add_series(
    gap, 'line', dashStyle="ShortDot", visible=FALSE,
    hcaes(x=year, y=norm, group=variable),
    tooltip = list(
      pointFormat = paste(bullet, "{point.desc}: <b>${point.orig:.1f}</b><br>")
    )
  ) %>%
  
  # House polarization
  hc_add_series(
    filter(polar_long, chamber=="House"), 'line', visible=FALSE,
    hcaes(x=year, y=norm, group=variable),
    tooltip = list(
      pointFormat = paste(bullet, "{point.desc}: <b>{point.orig:.2f}</b><br>")
    )
  ) %>% 
  
  # Labels and axes
  hc_title(text = "The Link Between Polarization and Inequality") %>% 
  hc_subtitle(text = paste("Congressional polarization", 
                           "and the income share of the top 1%")) %>% 
  hc_xAxis(title = list(enabled=FALSE)) %>%
  hc_yAxis(title = list(text="Standard deviations from mean"), tickInterval=.5) %>%
  
  # Formatting
  hc_morse() %>%
  hc_size(height=500) %>%
  hc_exporting(enabled=TRUE, filename="polinc", sourceWidth=750) %>%
  hc_colors(c("steelblue", "darkseagreen", "seagreen", "orchid")) %>%
  hc_responsive(rules=list(
    list(condition = list(minWidth=700),
         chartOptions = list(legend=list(align="right", layout="vertical")))
  ))
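
The two inputs to the polarization index described in the caption of Figure 1.3 (the distance between the party medians and the pooled standard deviation within the parties), along with the rescaling applied to every series, can be sketched in base R. This is only an illustration under stated assumptions: the ideal points are made-up values rather than Voteview data, the function names `party_distance`, `pooled_sd`, and `standardize` are hypothetical, and the step that combines the two components with forecastable component analysis is not reproduced here.

```r
# Hypothetical DW-Nominate-style ideal points for one chamber in one Congress.
# These values are invented for illustration; they are not Voteview data.
members <- data.frame(
  party = c("D", "D", "D", "D", "R", "R", "R", "R"),
  ideal = c(-0.45, -0.35, -0.30, -0.20, 0.25, 0.40, 0.50, 0.55)
)

# Distance between the parties: absolute difference between the party medians
party_distance <- function(df) {
  med <- tapply(df$ideal, df$party, median)
  unname(abs(med["R"] - med["D"]))
}

# Homogeneity within the parties: pooled standard deviation of the two parties
pooled_sd <- function(df) {
  n <- tapply(df$ideal, df$party, length)
  v <- tapply(df$ideal, df$party, var)
  sqrt(sum((n - 1) * v) / (sum(n) - length(n)))
}

# Rescale a series to mean 0 and standard deviation 1
standardize <- function(x) as.numeric(scale(x))

distance <- party_distance(members)  # larger values = parties farther apart
spread <- pooled_sd(members)         # smaller values = more uniform parties
```

In practice, these two quantities would be computed once per Congress and the resulting series passed through `standardize()` before plotting, which is what puts polarization and the top 1% income share on the same axis.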

4. Democratic breakdown

The rising polarization, inequality, and authoritarianism reached a breaking point in 2016, when American democracy began deteriorating more quickly than at any point in its history.¹³ While political scientists have ideas on how to turn this around, getting these reforms passed is a puzzle that has yet to be solved.¹⁴ The US has one of the most entrenched political systems in the world; policy changes and constitutional amendments are harder to pass and more elite-run than in other countries.¹⁵ Many democracies and just over half of the US states involve the public when writing and amending their constitutions, which prevents corruption.¹⁶

Figure 1.4: Democracy metrics and constitutional revisions around the world, 1789-2021. This app lets users explore the relationships between constitutional revisions and democracy with a variety of indices. Data sources: V-Dem (Coppedge et al. 2021) and the Comparative Constitutions Project (Elkins, Ginsburg, and Melton 2005). Produced in R with shiny and highcharter.

5. Constitutional design

The US Constitution was one of the first of its kind, and nearly 1,000 national constitutions have been adopted since it was written.¹⁷ Two general constitutional models have emerged: majoritarian and consensus democracies.¹⁸ The table below lays out some of the main characteristics of each. The US is majoritarian, whereas most modern democracies are more consensus-based.¹⁹ The majoritarian model gives rise to two-party systems and often struggles in diverse societies.²⁰ Countries nowadays tend to avoid modeling their constitutions on the American system due to its poor track record outside of the US (and arguably within the US now too).²¹

Table 3.1: Comparison of majoritarian and consensus democracies based on Lijphart’s (2012) typology. Not all countries have all five of the attributes of one of these models, but most generally fit with one or the other. The electoral system is usually the key characteristic that determines whether a country is a majoritarian or consensus democracy.
Characteristic	Majoritarian model	Consensus model
Overall	Two or three parties go back and forth holding power	Power is shared among several parties
Electoral system	Plurality elections (also called winner-take-all elections or first-past-the-post voting): each district has one seat, which goes to the candidate with the most votes	Proportional representation: districts have multiple seats, which are split among the parties based on how many votes they received
Form of government	Presidential: the chief executive is elected by the public or an electoral college	Parliamentary: the chief executive is elected by the legislature, usually the head of the largest party
Party system	Two-party system: two parties dominate, and minor parties have little to no influence	Multiparty system: three or more parties hold substantial seat shares and influence in the legislature
Executive cabinet	Concentrated power: the president appoints the cabinet, so most top officials are of the same party	Shared power: a coalition of parties negotiates over whom to appoint to the cabinet, so top officials are not all of the same party
Interest groups	Pluralist: groups from a wide variety of interests lobby legislators and regulators to influence policy	Corporatist: groups such as unions and business associations are formally included in the policymaking process and negotiate with each other on certain policies

Figure 1.5: This app lets users configure a constitution on the left to maximize the outcomes in the graph on the right. By default, the app is configured to the United States Constitution. Users can pick whether to use linear regression or random forests for predictions. Linear regression produces less accurate predictions, but it makes it easier to see how the bars move upon changing constitutional features. The plot on the second tab shows variable importance measures from the random forests, and the tooltip also displays coefficients from the linear regression models. Predictions are based on a sample of more than 600 constitutions since 1789, although some data were imputed with the MICE algorithm. All data are from the most recent version of the constitution. For example, the data for the US Constitution specifies that the upper house of the legislature is directly elected (even though the Senate was not directly elected until the 17th Amendment) and democracy data for the US is as of 2021. Data sources: CCP (Elkins, Ginsburg, and Melton 2005), QoG (Teorell et al. 2023), and V-Dem (Coppedge et al. 2021). Produced in R with shiny and highcharter.

6. Proportional representation

Consensus democracies are built on proportional representation, an electoral system that started gaining steam around a century after the US Constitution was written.²² These countries tend to have more political parties, less polarization, less economic inequality, better quality governance, and higher public satisfaction with the political system.²³ The US could pass a law adopting proportional representation in the House,²⁴ which would empower more parties and force them to cooperate rather than just compete.²⁵ The Senate and Electoral College are more difficult to change and together hold far more power than the House, so shifting to a more consensus-based model of democracy in the long term would take constitutional reform.
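
To make the idea of splitting a district's seats by vote share concrete, here is a minimal sketch in R. It assumes a largest-remainder rule, which is only one of several allocation formulas used in practice and is chosen here for simplicity, not because it is the method any particular proposal specifies. The function name `allocate_seats` and all vote shares are hypothetical.

```r
# Largest-remainder allocation (an assumed rule, chosen for illustration).
# votes: vote totals or percentages per party; seats: seats in the district.
allocate_seats <- function(votes, seats) {
  quota <- votes * seats / sum(votes)   # each party's exact seat entitlement
  base <- floor(quota)                  # seats won outright
  leftover <- seats - sum(base)         # seats still unassigned
  if (leftover > 0) {
    # Give the remaining seats to the parties with the largest fractional parts
    extra <- order(quota - base, decreasing = TRUE)[seq_len(leftover)]
    base[extra] <- base[extra] + 1
  }
  base
}

# Hypothetical five-seat district with parties at 60%, 20%, and 20%
five_seat <- allocate_seats(c(A = 60, B = 20, C = 20), seats = 5)

# Hypothetical 17-seat statewide district with parties at 40%, 54%, and 6%
statewide <- allocate_seats(c(Dem = 40, Other = 54, Lib = 6), seats = 17)
```

Under these made-up vote shares, the five-seat district splits 3, 1, and 1, and the 17-seat district splits 7, 9, and 1, so a party with only 6% of the vote still wins a seat when the district is large enough.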

Figure 1.6: This app lets users take a closer look at how the performance of the US Constitution stacks up against constitutions of similar countries. The preamble of the United States Constitution lists five objectives for itself: “establish Justice” (fairness and rule of law), “insure domestic Tranquility” (internal peace and harmony), “provide for the common defence” (security and stability), “promote the general Welfare” (high standard of living), and “secure the Blessings of Liberty” (political and economic freedom). Above, users can explore different approaches to measuring outcomes that correspond to each of the Constitution’s objectives. The countries included for comparison are the 38 member states of the OECD. The statement at the top of each graph describing the relationship between the two variables is based on a simple linear regression model. If a variable other than the United States is selected in the “Color by” input, the model controls for that variable. Data sources: Freedom House (2022), WJP (2021), IEP (2022), Fund for Peace (2022), UNDP (2022), OECD (2021), V-Dem (Coppedge et al. 2021), Heritage Foundation (2022), WID (2022), CPDS (2020), IDEA (2018), DPI (2020), and CCP (Elkins, Ginsburg, and Melton 2005). Produced in R with shiny and highcharter.

Notes

For example, Southern states led the nation in regulating the railroad industry to break up monopolies in the 1880s and 90s, which laid the groundwork for federal anti-trust regulations (Link 1946).↩︎
Southerners were usually against labor regulations and union protections, but they sided with Democrats on most other economic issues in the early twentieth century (Farhang and Katznelson 2005). See also: DeJanes (1985), Biles (1994), and Rothstein (2017).↩︎
Southerners started shifting into the Republican Party not because of its policy positions but because the party began using racist dog-whistles and Southern-style rhetoric to attract Southerners. This became known as the Southern Strategy. As Maxwell and Shields (2019) write: “Republicans had to mirror southern white culture by emphasizing an ‘us vs. them’ outlook, preaching absolutes, accusing the media of bias, prioritizing identity over the economy, depicting one’s way of life as under attack, encouraging defensiveness toward social changes, and championing a politics of vengeance.”↩︎
Most political parties release platforms or manifestos describing their ideology each election cycle, which the Manifesto Project collects and turns into a dataset. Researchers then developed algorithms that search for patterns in the text and score each party’s economic and social ideologies (Elff 2013). The dataset includes all major parties and many minor parties in democratic countries since the 1990s and some earlier platforms. Most countries not in the dataset are autocracies. See the caption below the chart for more details.

Because these ideological scores are based on party manifestos, they do not necessarily capture the party leaders’ actual views (just the ones they believe will help them gain votes) or their supporters’ views. Parties can also backtrack on promises and move in a different direction once they are in power. Additionally, party manifestos do not always reflect the culture of the party’s voter bases. Some groups prefer to use social norms rather than laws to enforce certain rules in society, so their parties may look more libertarian even if their bases are more authoritarian outside of politics. The scores shown in the chart are rough estimates of where parties stand relative to each other, but they are not exact or perfect. Nevertheless, party manifestos (and the information that can be gleaned from them) are generally strong indicators of how a party will govern if it comes to power.↩︎
Authoritarian parties support limiting what kinds of personal decisions people can make (regarding marriage, drugs, guns, abortions, voting, privacy, and more) and who has rights and power (by race, ethnicity, gender, sexuality, religion, class, or other traits).

The Republican Party’s 2008 platform is in the top 2% most authoritarian platforms out of all parties from 1990-2010 in the dataset. The party’s extreme score on authoritarianism may seem odd when reading the platform; it repeatedly professes a commitment to human rights and limited government interference in personal decisions. Keep in mind that nearly every party claims to hold these values in its platform as well. What sets the Republican Party apart is (a) its support for increased government regulation of marriage, sexuality, abortion, drug use, immigration, educational materials, gambling, and voting rights; (b) its support for expanding law enforcement, national security, and military operations; and (c) its language with religious, nationalist, and law-and-order buzzwords.

Today, most parties that share Republicans’ views (on paper) are either small fringe movements or part of autocratic regimes. See Golder (2016) for an overview of far-right parties in Europe, most of which are relatively close to the Republican Party in the chart below. These parties have risen in influence in recent years, but only a few far-right parties have won control of their government. The Republican Party’s most powerful neighbor on social issues is United Russia, the ruling party of an autocratic regime (Fish 2018). On paper, United Russia’s 2007 platform appears to be slightly less authoritarian than the Republican Party’s 2008 platform, although in practice it is more authoritarian. For most parties, it wouldn’t make sense to be openly authoritarian in writing, so the scores on the authoritarian-libertarian dimension of the chart below are probably underestimating many parties’ authoritarian tendencies. To be precise, the data in the chart suggest that the Republican Party in 2008 had some of the most openly authoritarian stances in the democratic world, but it was not necessarily the most actively authoritarian institution.

Democrats are also more authoritarian than most other parties. The Democratic Party’s 2008 platform is in the 18th percentile of most authoritarian platforms out of all the parties from 1990-2010 in the dataset. This is likely due to its stances on national security, military expansion, environmental regulations, gun control, and criminal justice. The platform also has some religious and law-and-order buzzwords, but not as many as the Republican platform.↩︎
Research consistently shows that the Republican Party has become more ideologically extreme and more stubbornly partisan than the Democratic Party (Bartels 2008; Grossmann and Hopkins 2016; Hacker and Pierson 2015; Poole and Rosenthal 2017). Public opinion data generally suggest that Republican voters are more partisan than Democratic voters, but not by much. However, Republican politicians and policies have moved significantly farther from the center than Democratic politicians and policies, which can be seen in congressional voting records.

Some critics have argued that these data are not reliable or accurate, but these critiques generally misinterpret the data. This author, for example, seems to be unaware that DW-Nominate scores have two dimensions, each dimension changes meaning over time, and individual scores can easily be taken out of context. Poole and Rosenthal (2017) and others have extensively documented that summary measures from DW-Nominate scores (such as party means, differences between party means, and the index used in Figure 1.3) line up well with intuition and other indicators of partisanship.↩︎
As I argue in Chapter 4, the main reason the Republican Party has moved farther from the center is that its voters happen to live more in states that are overrepresented in the Senate and the Electoral College. A party is overrepresented in a body if it has a greater seat share than its vote share in the last election. For example, in 2019 and 2020, Republicans held 53% of the seats in the Senate despite receiving 44% of the total votes cast in the elections for these seats (in 2014, 2016, and 2018). Similarly, in 2016, President Trump won 56% of the votes in the Electoral College despite receiving only 46% of the popular vote.

When a party is overrepresented, it can cater to a narrower base and still win elections. While it may seem wise for a party to try appealing to more voters anyway, this requires compromising on positions in ways that could alienate its hard-line voters. In practice, overrepresented parties have no incentive to moderate their policies just so they can appeal to an extra 5-10% of voters when they can win without those extra voters. The underrepresented party, on the other hand, usually has to appeal to at least 55% of the voters to win the presidency and control Congress, so it has to be more moderate.↩︎
See Levendusky (2013) and Hetherington (2009) for details on the roots of polarization in the twentieth century and McCarty, Poole, and Rosenthal (2006) for analyses of the feedback loop between polarization and economic inequality.↩︎
A democracy is a country that is run by the people either directly or indirectly through elected representatives. For a country to be considered a democracy, it must have generally free elections, universal adult suffrage, and equality before the law. This is closer to James Madison’s definition of a republic in Federalist 10 (1788), and his definition of democracy is closer to what today is called a direct democracy. Some Americans still use the older definition, but most of the world uses the more modern one.

Comparing levels of polarization across different countries is difficult because polarization looks different in each country, but researchers have offered several approaches. Boxell, Gentzkow, and Shapiro (2022) gathered survey data from 12 democracies and measured affective polarization, the degree to which people dislike other political groups. They found that the US was the most polarized country in this sample and that polarization in the US was rising three times more quickly than any of the other countries. Stanig (2011) used similar survey data but measured polarization slightly differently and found the same result. Studies of online polarization are consistent as well: Urman (2019) analyzed Twitter followings in 16 democracies and found that Twitter users in the US were among the least likely to follow politicians from other parties.

Income inequality is more straightforward when comparing countries. According to the OECD, the US has the 5th highest Gini coefficient (a common measure of inequality) out of the 38 OECD nations as of 2021. The only developed democracies with more unequal economies are Bulgaria, Turkey, Mexico, and Costa Rica. Globally, the income distribution of the US is right around average, as the World Inequality Database shows. When looking at the income share of the top 1% of earners, the US has roughly the same level of income inequality as South Africa, India, most of the Middle East, and much of Latin America.↩︎
Piketty (2014) found that wealth naturally accumulates more and more at the top in capitalist environments without carefully planned policies for keeping the economy competitive. On top of that, a wide variety of economic policies need periodic updates to keep up with changing times. Minimum wages and interest rates have to be updated regularly in response to inflation, tax codes and campaign finance laws need continuous attention to close loopholes, and many regulations lose relevance with new technological and social changes. These updates are blocked more often as polarization rises because the parties become gridlocked more frequently (Barber and McCarty 2015).↩︎
For a bill to pass the Senate, it generally needs support from 60% of senators (rather than just over 50% in the House) because bills can be blocked by only 40% of senators with a process known as filibustering. On top of that, each state has the same number of senators regardless of population, so senators representing even less than 40% of the population can block policies from passing. Enns et al. (2014) found that these features have fueled economic inequality in the last half century. See also: Lee and Oppenheimer (1999).↩︎
The standard measure of polarization in a chamber of Congress is the difference between the mean or median DW-Nominate scores of the parties’ members. This only captures one dimension of polarization, which is typically defined as parties moving farther apart and each party becoming more ideologically uniform. The measure shown in the chart combines both of these dimensions. In my master’s thesis (2021), I showed that this measure is a robust and strong predictor of future polarization and income inequality.

Gridlock on economic policy may be part of the reason economic inequality rose so closely behind polarization in the Senate, but for the most part polarization and income inequality both rose in response to a different trend: the Republican Party was becoming larger and more economically conservative. This affects polarization more quickly than economic inequality because a party’s rightward shift immediately increases the distance between the parties but does not have noticeable effects on income distributions for several years.↩︎
Between 2015 and 2020, the US dropped more than 10% on V-Dem’s Electoral Democracy Index and Liberal Democracy Index, which are widely used indicators of the performance of each country’s democratic institutions. A similar dip can be found in the US’s Polity score, another common democracy index. Before 2016, these metrics had never fallen this sharply in the nation’s history.↩︎
For example, an open letter to Congress signed by more than 200 political scientists calls for adopting multi-member districts elected with proportional representation in the US House.↩︎
In a study of 32 democracies, Lutz (1994) measures the difficulty of amending each country’s constitution based on how many institutions’ approval is needed and the vote shares needed to pass in each body. He reports that the US has the second most difficult process for constitutional amendments in the sample. (The data in the appendix do not show any countries with a higher difficulty score, so it’s not clear why he didn’t call the US Constitution the single most difficult one to amend.) Ginsburg and Melton (2015) note that the comparative difficulty of amending the US Constitution is widely accepted by experts.

The US is also one of the only democracies that has no direct participation in its constitutional amendment process. Requiring amendments to pass a public vote or letting the public submit amendments are common nowadays (Qvortrup 2017). The US Constitution can only be changed by conventions of political elites or reinterpreted by a small court of unelected judges.↩︎
Around half of the countries in the world use public referendums at some level for amendments, and most of the ones that do are generally democratic (Anckar 2014). A study of 62 countries that adopted a new constitution between 1974 and 2011 found that more public participation in the drafting process led to significantly higher levels of democracy in the following years (Eisenstadt, LeVan, and Maboudi 2015).↩︎
Constitutions vary greatly from country to country. Elkins, Ginsburg, and Melton (2009, 49) define a constitution as a document or set of documents that fit one or more of the following criteria: “(1) are identified explicitly as the Constitution, Fundamental Law, or Basic Law of a country; OR (2) contain explicit provisions that establish the documents as the highest law, either through entrenchment or limits on future law; OR (3) define the basic pattern of authority by establishing or suspending an executive branch of government.” These same researchers compiled a full dataset of national constitutions that fit this description from 1789 through the present (Elkins, Ginsburg, and Melton 2005).↩︎
Consensus democracies are also known as consociational or consensual democracies. The majoritarian model is also known as the Westminster model. Most majoritarian democracies are presidential systems and most consensus democracies are parliamentary systems, but these concepts do not overlap entirely. For example, the UK is parliamentary but has a majoritarian system overall, and Brazil is presidential but has a consensus system.↩︎
According to the Electoral Reform Society, more than 100 countries around the world use some form of proportional representation, the key characteristic of consensus democracy (discussed more in the next section). Fewer than 50 use plurality elections, the key characteristic of majoritarian democracy.↩︎
The tendency for majoritarian elections to produce two-party systems is known as Duverger’s Law (see Riker 1982). Note that it is less of a “law” and more of a general trend. Some scholars such as Dunleavy and Diwakar (2011) are critical of Duverger’s Law because of two major exceptions, the UK and India. However, these countries are each controlled by a large majority party, which is not usually found in countries with proportional representation. At the very least, nearly every country with majoritarian elections is dominated by one or two major parties, so their legislatures operate in fundamentally different ways than full multiparty systems.

Regardless, the trend is clear when looking at a larger sample of countries. As Figure 1.6 shows, no majoritarian democracy in the OECD effectively has more than three parties, and all OECD countries with more than three parties use proportional representation for at least some elections.

Lijphart (1977) finds that “majority rule and democracy are incompatible” in societies that have deep social and ethnic divisions. Majoritarian systems concentrate too much power in the hands of a single group, whereas consensus systems encourage groups to cooperate and share power.↩︎
As the chart below shows when checking and unchecking the “Parliamentary system” box, parliamentary systems on average have fairer elections, stronger protections of civil liberties, higher voter turnout rates, greater stability, higher GDPs per capita, less income inequality, longer life expectancies, and higher standards of living than presidential systems.

The most famous critique of American-style presidential systems comes from Linz (1990), who argues that gridlock between the legislature and executive leads to presidencies gaining too much power, and that parliamentary systems bring more stability and more effective checks on power. Lijphart’s work (1977, 1984, 2017) mostly corroborates Linz’s argument. Some scholars disagree, arguing that the presidential model only appears to be less stable because most of the countries that have implemented it (especially in Latin America) were already unstable before adopting it (Cheibub, Elkins, and Ginsburg 2011; Mainwaring and Shugart 1997).

The declining influence of the US Constitution over other constitutions is well-documented. Klug (2000) shows that while constitutional courts around the world once cited the US Constitution and Supreme Court cases as precedents, they now tend to mention these more to distance themselves from the American model. Law and Versteeg (2012) document the growing differences between the US Constitution and current mainstream constitutions. They note that a commonly cited reason for the declining influence of the US Constitution is that the US is “increasingly out of sync with an evolving global consensus on issues of human rights” (767). Another possible reason is that the bulk of countries that have democratized in the last several decades were in the former Soviet Bloc, so their constitutions were influenced more by parliamentary countries in Europe than by majoritarian countries in the Americas due to their proximity.↩︎
In majoritarian systems like the US, many people live in districts or states represented in Congress by a party they don’t support. Under proportional representation, each person has not just a representative from their district, but a representative from their district and party. Each district has multiple representatives, not just one, and its seats are divided up based on how many votes each party receives. If a party gets 20% of the votes for a district with five seats, it wins one seat from the district. If another party gets 60% of the votes, it gets three of the district’s seats.

There are two main ways for each party to select its individual candidates. In closed list proportional representation, party leaders release lists of nominees in order from their strongest candidates to their backup candidates. Voters then choose a party rather than an individual candidate. If a party wins two seats for a district, its top two nominees win the seats. Parties could also hold primary elections to choose and order their nominees, but these are not really needed in multiparty systems. Voters have many viable parties they can choose from and still be somewhat satisfied, so party leaders have a stronger incentive to pick good quality candidates. Another option is open list proportional representation, which merges primary elections and general elections into a single election. Voters first select a party or independent candidate (which determines each party’s seat share) and then check off candidates within their party that they like (which determines the individual winners for each party’s seats). If 40% of voters in a five-seat district pick a particular party’s ballot, its two candidates with the most votes are elected.

Independents can run in many forms of proportional representation. Continuing the previous example, if just 20% of the voters in a five-seat district select a particular independent candidate instead of a party, that candidate wins a seat. Many countries with proportional representation have more independent politicians in office than the US because they only need to appeal to a small fraction of their district, not a majority (Brancati 2008).↩︎
These are general trends with plenty of exceptions. Furthermore, many of these studies establish only correlation, not causation, but they lay out a clear reason why proportional representation would lead to better outcomes: it prevents any party from gaining too much power, whereas majoritarian systems can easily be controlled by a single party. See Riker (1982), Bernaerts, Blanckaert, and Caluwaerts (2022), Birchfield and Crepaz (1998), Lijphart (2017), and Anderson and Guillory (1997).↩︎
One way proportional representation could be implemented in the US is by eliminating congressional districts and letting House members represent their entire state at-large. In Pennsylvania, which has 17 representatives, each party would nominate up to 17 candidates. When Pennsylvanian voters go to the polls, they would get a ballot for the party they want to vote for (similar to primary elections, which would no longer be needed) and would then check off the nominees they like. If 40% of people select the Democratic ballot, around 40% of the seats (7 seats) would go to Democrats. The 7 Democratic nominees with the most votes would win those seats. If the Libertarian Party gets at least 6% of the ballots (roughly one seat’s worth, since 1/17 is about 5.9%), its top candidate would also become a representative.
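
The Pennsylvania figures can be checked with a short Python sketch. It assumes the D'Hondt highest-averages formula, which the text does not specify, and a hypothetical Republican share of 54% to fill out the remaining ballots. The function name `dhondt` is illustrative only.

```python
from fractions import Fraction


def dhondt(votes, seats):
    """Award seats one at a time to the party with the highest quotient."""
    won = {party: 0 for party in votes}
    for _ in range(seats):
        # Quotient = votes / (seats already won + 1), kept exact with Fraction.
        leader = max(votes, key=lambda p: Fraction(votes[p], won[p] + 1))
        won[leader] += 1
    return won


# Ballot shares in percent; the Republican figure is an assumption.
ballots = {"Democratic": 40, "Republican": 54, "Libertarian": 6}
print(dhondt(ballots, 17))
# {'Democratic': 7, 'Republican': 9, 'Libertarian': 1}
```

Under these assumptions, 40% of the ballots yields 7 of the 17 seats and 6% yields one. A different formula or a different split among the other parties could shift a seat in either direction.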

The state could also be split into, say, three large districts with five to six members each instead of one large district with 17 members. Alternatively, the House could keep its current districts but add more representatives to each. If each district had five representatives, the overall size of the House would go from 435 members to 2,175 members. This may sound large, but the US already has very few representatives per capita compared to most countries, and many experts recommend greatly increasing the size of the House anyway.↩︎
Proportional representation creates an environment where many small parties can hold seats in the legislature, which makes it harder for a single party to be large enough to hold a majority in either chamber. To get anything done, parties have to form alliances (called coalitions) in which they compromise with one another to achieve common goals (Colomer and Negretto 2005). One could argue that parties have to do this internally anyway in two-party systems, but the leaders of a majority party can consolidate power much more easily than the leaders of a similarly sized coalition of parties.↩︎