gianlucaliehner. Blog/Mapping-Zip-Codes-To-Shapefiles-In-R-With-Ggplot-G/ (2024)

What do we want to achieve?

Say you have a dataset that includes the zip-codes for each observation. Now you want to display your observations on a map for each zip-code area. Imagine we conducted a survey in which we asked citizens of Germany about their current level of happiness (on a scale from 1 to 10). We also queried their ZIP-Code to be able to display the happiness level on the map. The survey results —when not considering metadata— contain three variables:

  • Response ID (id)
  • ZIP-Code (zip)
  • Hapinsess Level (happiness)

Packages we need:

library(sf)library(tidyverse)library(ggplot2)library(ggiraph)library(colorspace)library(plotly)

Let’s load the dataset…

survey <- read.csv('happiness_germany.csv')head(survey)
 zip ID happiness BL.Name1 14532 112 6.40 Brandenburg2 16837 330 10.00 Brandenburg3 17495 593 6.40 Meckelnburg-Vorpommern4 18196 500 3.70 Meckelnburg-Vorpommern5 21228 565 10.00 Niedersachsen6 21481 188 5.95 Schleswig-Holstein

Now we need to load the geographic data. For this we will use shapefiles. Shapefiles can be found online for free for either world maps or specific countries. In this case, I downloaded a shapefile for Germany. Shapefiles usually consists of different files, and for them to be readably by R they need to be within a folder. I called the folder shapefiles_germany. Within the folder, we can find the following files.

  • gadm36_DEU_1.cpg
  • gadm36_DEU_1.dbf
  • gadm36_DEU_1.prj
  • gadm36_DEU_1.shp
  • gadm36_DEU_1.shx

Let’s load the shapefiles into R

shapefile <- sf::read_sf("shapefile_germany/")

Note that we are loading the complete folder and not only a single file!

Mapping ZIP-Code to states

Here we have a problem. The shapefile contains only state names and does not have any ZIP codes. Hence it will be difficult to map the ZIP-Codes to our shapefile. What the shapefiles do have however is the state names. So, what we need to look for, is a data table that maps ZIP codes to states. Luckily, that’s also something that can be found online pretty easily.

Let’s load the file which looks something like this:

mapping_zip_state <- read.csv('zip_code_mappings.csv')str(mapping_zip_state)
'data.frame': 8180 obs. of 2 variables: $ PLZ_Zahl: int 1067 1069 1097 1099 1108 1109 1127 1129 1139 1156 ... $ BL.Name : chr "Sachsen" "Sachsen" "Sachsen" "Sachsen" ..

Now we need to merge our happiness data and ZIP code mappings. First we only selected the variables we need:

  • ZIP code (PLZ_Zahl)
  • State name (BL.Name)

Then we merge the two datasets with merge()

mapping_zip_state %>% select(PLZ_Zahl, BL.Name) -> mapping_zip_statemerge(survey, # left dataset mapping_zip_state, #right dataset by.x = "zip", #common variable in left dataset by.y = "PLZ_Zahl", #common variable in right dataset all.x = TRUE # type of join (here left join) ) -> survey

Before we continue, let’s drop the columns that contain NAs.

Be careful: d Here rows are only dropped for demonstrating purposes. On real data sets, you should have a strategy for handling missing values!

survey %>% drop_na() -> survey

Aggregating data

Before we can match our happiness scores to the shapefile, we need to aggregate the data per state. Here I will just compute the average happiness score for each German state. I am first grouping by state (BL.Name), then summarising.

survey %>% group_by(BL.Name) %>% summarise(happiness = mean(happiness)) -> happiness_scoreshappiness_scores

I notice that 2 states are missing so I add them to the dataset.

happiness_scores <- rbind(happiness_scores, c("Bremen", NA), c("Hamburg", NA))

Matching to shapefile

Now I am ready to merge to the shapefile. Here you need to look through your shapefile and find the right matching variable. In my case this is NAME_1.

plot_data <- merge(shapefile, happiness_scores, by.x="NAME_1", by.y = "BL.Name", all.x=TRUE)

Plotting the map

We are now ready to plot our first map with ggplot. For this we use our plot data plot_dataand pipe it into ggplot.

# plot data use intention plot_data %>% ggplot + aes(fill=as.numeric(happiness)) + # Fill with color by score geom_sf(size=.1) + # Create the map from shapefile scale_fill_continuous_sequential(palette = "RedOr", begin= 0, end=1, limits=c(1,10), na.value = "grey 97")+ # Set fill color with the package 'colorspace' theme_minimal()+ labs(fill="Happiness level", title = "Level of happiness in Germany by state") # Add some labels

The final map looks something like this:

gianlucaliehner. Blog/Mapping-Zip-Codes-To-Shapefiles-In-R-With-Ggplot-G/ (1)

Interactive Map

If you want to make the map interactive, try something like this:

plot_data$happiness <- round(as.numeric(plot_data$happiness), 2)# plot data use intention plot_data %>% ggplot + aes(fill=as.numeric(happiness)) + # Fill with color by score geom_sf_interactive(size=.1, aes(tooltip=c(happiness))) + # Create the map from shapefile scale_fill_continuous_sequential(palette = "RedOr", begin= 0, end=1, limits=c(1,10), na.value = "grey 97")+ # Set fill color with the package 'colorspace' theme_minimal()+ labs(fill="Happiness level", title = "Level of happiness in Germany by state") -> happiness_plotggiraph::girafe( ggobj = happiness_plot, width_svg = 10, height_svg = 6, options = list( opts_tooltip(opacity = .9, css = "background-color:gray;color:white;padding:2px;border-radius:2px;"), opts_hover(css = "fill: #00868B;"), opts_hover_inv(css = "opacity: 0.3;") ))

The important bit here is geom_sf_interactive(size=.1, aes(tooltip=c(happiness))). This comment makes the shapefile interactive and allows you to add interactive tooltips. The ggiraph::girafe()command takes care of the styling.

Cheers!

gianlucaliehner. Blog/Mapping-Zip-Codes-To-Shapefiles-In-R-With-Ggplot-G/ (2024)

References

Top Articles
Latest Posts
Article information

Author: Carmelo Roob

Last Updated:

Views: 5315

Rating: 4.4 / 5 (45 voted)

Reviews: 92% of readers found this page helpful

Author information

Name: Carmelo Roob

Birthday: 1995-01-09

Address: Apt. 915 481 Sipes Cliff, New Gonzalobury, CO 80176

Phone: +6773780339780

Job: Sales Executive

Hobby: Gaming, Jogging, Rugby, Video gaming, Handball, Ice skating, Web surfing

Introduction: My name is Carmelo Roob, I am a modern, handsome, delightful, comfortable, attractive, vast, good person who loves writing and wants to share my knowledge and understanding with you.