Geocoding US Address Data with zipcode Package & Visualize it (2024)

One amazing thing about R is that more than 13,000 packages (as of this writing, 2/8/2019) are available from the official repository, CRAN (the Comprehensive R Archive Network), and many more from other places like GitHub.

Those packages do many things, from data wrangling and visualization to modeling, and some of them also include data that is super useful.

Today, I want to introduce a package called ‘zipcode’ from Jeffrey Breen that provides US zip code data, including zip code, city name, state name, longitude, and latitude.

This is super useful when you have US address data that contains zip codes and you want to obtain geocodes (longitude and latitude).

You can import the US zip code data from this package and join it with your data, using the zip code as the key.

Yes, by using the zip code you can ‘geocode’ your US address data and visualize it on a map!

This can be a good enough ‘geocoding’ solution, especially when you don’t want to run costly and slow geocoding operations against third-party web services like Google’s geocoding API. Keep in mind that every address is placed at the coordinates of its zip code, not at its street address.

Importing data from an R package is actually super simple in Exploratory.

Let’s take a look at how to import the data and use it to visualize US address data.

To demonstrate, I’m going to use this US hospital rating data from The Centers for Medicare & Medicaid Services, for which I want to geocode all the hospital locations.

First, let’s install the ‘zipcode’ R package.

Select ‘Manage R Packages’ from the project dropdown menu.

Type ‘zipcode’ and click the ‘Install’ button under the ‘Install New Packages’ tab.

Make sure that the ‘zipcode’ package is installed and shows up under the ‘Installed Package’ tab.

Let’s import data from the ‘zipcode’ package.

Select ‘R script’ under the Data Frames dropdown menu.

Type the following in the code editor area.

library(zipcode)
data(zipcode)
zipcode

If you are not familiar with R, the first line loads the ‘zipcode’ package into the current R session. The second line uses the ‘data’ function to load the ‘zipcode’ data set from the package as a data frame called ‘zipcode’. The last line evaluates the data frame so that it is returned as the result of the script.

If you want to know more about what the ‘zipcode’ package offers, take a look at its reference doc.

Click the ‘Run’ button to get the data, then click the ‘Save’ button to create a data frame inside Exploratory.

Now you have the zip code data imported from the ‘zipcode’ R package.

It’s that simple!

Here is the hospital data, a list of hospitals whose service quality was surveyed by patients.

This data can be downloaded from here.

And we want to visualize the hospital locations on a map.

There are a few data problems I need to address first.

First, there are multiple rows per hospital because there are different survey questions/answers for each hospital. I want to keep only unique rows, one row per hospital.

Second, the zip code column in the hospital data has a numeric data type, and this is a problem.

US zip codes are always 5 digits. For example, the zip codes for the hospitals in Massachusetts show only 4 digits, because the leading zero is dropped when the values are stored as numbers. So 1040 should be 01040.

We can take two steps to address this problem. First, we’ll convert the column to the Character data type, then pad it with zeros at the beginning.

Third, we want to join this data with the zip code data that we previously imported from the ‘zipcode’ package.

There is a hospital id column and we can use this to keep only the unique hospital rows.

This will remove all the duplicated rows and keep only the unique rows based on the hospital id.
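
Outside Exploratory, roughly the same step can be sketched with dplyr. The data frame below is a made-up stand-in for the survey data, and `Provider ID` is only an assumed name for the hospital id column.

```r
library(dplyr)

# Made-up stand-in for the survey data: several rows per hospital.
# `Provider ID` is an assumed name for the hospital id column.
hospital_survey <- tibble(
  `Provider ID`   = c(1L, 1L, 2L, 2L, 3L),
  `Hospital Name` = c("A", "A", "B", "B", "C"),
  `ZIP Code`      = c(1040, 1040, 2115, 2115, 94110),
  Question        = c("Q1", "Q2", "Q1", "Q2", "Q1")
)

# Keep the first row seen for each hospital id and retain all other columns.
hospitals <- hospital_survey %>%
  distinct(`Provider ID`, .keep_all = TRUE)

nrow(hospitals)
```

With `.keep_all = TRUE`, the other columns come from the first row of each hospital, which is fine here because the address columns repeat on every survey row.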

Convert to Character Type

We can change the data type of the ZIP Code column by selecting

Change Data Type -> Convert to Character

from the column header menu.

Notice that the data type of the ZIP Code column is now shown as Character.
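
The conversion changes only the type, not the content, so the missing zero does not come back by itself. A minimal base R sketch with made-up values shows what the column looks like at this point:

```r
# Zip codes stored as numbers have already lost their leading zeros.
zip_numeric <- c(1040, 2115, 94110)

# Converting to character turns them into text but does not restore the zero.
zip_character <- as.character(zip_numeric)

zip_character         # "1040" "2115" "94110"
nchar(zip_character)  # 4 4 5
```

The values that are only 4 characters wide are the ones that still need padding.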

Fill (Pad) 0 at the beginning

We can select

Work with Text -> Pad Text

from the column header menu.

This will bring up the Mutate dialog with the ‘str_pad’ function pre-populated.

We can edit it to look something like this.

str_pad(`ZIP Code`, pad="0", side="left", width=5)

Once you run it, you can see that ‘0’ is added on the left-hand side of each value so that every zip code is 5 characters wide.
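
To see what the expression does on its own, here is a small sketch that calls stringr's `str_pad` on a plain character vector. The values are made up, and in Exploratory the same call runs against the `ZIP Code` column.

```r
library(stringr)

# Made-up zip codes, already converted to character.
zip_character <- c("1040", "2115", "94110")

# Pad on the left with "0" until each value is 5 characters wide.
zip_padded <- str_pad(zip_character, width = 5, side = "left", pad = "0")

zip_padded  # "01040" "02115" "94110"
```

Values that are already 5 characters wide are left unchanged, so the step is safe to apply to the whole column.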

Now it’s ready to join with the zip code data!

We’ll use the ‘ZIP Code’ column as the join key to join with the zip code data frame.

Select ‘Join (Add Columns)’ from the column header menu.

In the dialog, select the data frame that has the zip code and longitude/latitude information.

In my case, that is ‘zipcode_data’. Then select the ‘zip’ column as the key column of the target data frame.

Once that’s done, you’ll see the new columns added at the end.

The columns with the orange bar at the top are the ones from the zip code data frame.
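
The join can be sketched with dplyr as below. This assumes a left join, which keeps every hospital row; both data frames are tiny made-up stand-ins, and the coordinates are rough values for illustration only.

```r
library(dplyr)

# Made-up hospitals with padded zip codes; the last one has no match on purpose.
hospitals <- tibble(
  `Hospital Name` = c("A", "B", "C"),
  `ZIP Code`      = c("01040", "02115", "00000")
)

# Stand-in for the zip code data frame (rough, illustrative coordinates).
zipcode_data <- tibble(
  zip       = c("01040", "02115"),
  city      = c("Holyoke", "Boston"),
  state     = c("MA", "MA"),
  latitude  = c(42.2, 42.3),
  longitude = c(-72.6, -71.1)
)

# Add the zip code columns to each hospital row, matching `ZIP Code` to `zip`.
hospitals_geo <- hospitals %>%
  left_join(zipcode_data, by = c("ZIP Code" = "zip"))

# Hospitals whose zip code found no match end up with NA coordinates.
unmatched <- hospitals_geo %>% filter(is.na(latitude))
nrow(unmatched)
```

Checking for unmatched rows is a quick way to catch zip codes that are still unpadded or mistyped before plotting.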

We can quickly visualize the hospital locations by plotting the longitude and latitude columns on a map.

Under the Chart view, select Map — Long/Lat as the chart type.

You can zoom in if you like.
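
If you want a quick sanity check of the coordinates outside Exploratory, a plain scatter plot of longitude against latitude is enough. This base R sketch is not the interactive map, and the data frame and its values are made up.

```r
# Made-up geocoded hospitals (rough, illustrative coordinates).
hospitals_geo <- data.frame(
  name      = c("A", "B", "C"),
  longitude = c(-72.6, -71.1, -122.4),
  latitude  = c(42.2, 42.3, 37.7)
)

# Longitude goes on the x axis and latitude on the y axis,
# so the points fall roughly where they would on a map.
plot(
  hospitals_geo$longitude, hospitals_geo$latitude,
  pch = 19,
  xlab = "Longitude", ylab = "Latitude",
  main = "Hospital locations (zip code level)"
)
```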

That’s it!

If you don’t have Exploratory Desktop, you can sign up from the website!

Author: Dan Stracke