Mapping County Demographic Data in R



Ari Lamstein, a technology consultant and author of the free email course Learn to Map Census Data in R, provides an introduction to mapping US demographic data using the open source software R.

Today I will demonstrate how to map US county demographic data in R. Esri recently announced that it is adding additional support for R, which has led to increased interest in R from the GIS community. While R is not a full-fledged GIS program, its ability to import, manipulate and visualize data is phenomenal. Additionally, its packaging system makes it easy for users to create, package and share additional functionality.

We will use the choroplethr package to map our data. The name “choroplethr” is a play on the words “choropleth” and “R”. In addition to facilitating the creation of choropleth maps, choroplethr ships with demographic statistics from the US Census Bureau.

If you are new to R, you might want to take a quick primer (such as here or here) before continuing.

Step 1: Install and Load the Packages

As I mentioned above, we will be using the choroplethr package to generate our maps. We will also need the “choroplethrMaps” package. From the R command line, type the following commands. This will install and load the packages:

install.packages(c("choroplethr", "choroplethrMaps"))
library(choroplethr)
library(choroplethrMaps)

Step 2: Create a Simple Map

The choroplethr package comes with a data frame containing 2012 US county population estimates. The data frame is called df_pop_county. We can load it and see the first few rows like this:

data(df_pop_county)
head(df_pop_county)

##   region  value
## 1   1001  54590
## 2   1003 183226
## 3   1005  27469
## 4   1007  22769
## 5   1009  57466
## 6   1011  10779

Note that one column is named region and the other is named value. The regions are county FIPS codes.

The function we will use to create county choropleth maps is called county_choropleth. It requires you to pass it a data frame with one column named region and one column named value.
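To make the input requirement concrete, here is a minimal sketch that checks a data frame for the two required columns before plotting. The helper has_choropleth_columns is a hypothetical name introduced for illustration and is not part of choroplethr. The three hand-built rows reuse values from the output printed in Step 2, and the plotting call is guarded so the snippet still runs if the packages are not installed.

```r
# Hypothetical helper (not part of choroplethr): checks that a data frame
# has the two columns county_choropleth expects.
has_choropleth_columns <- function(df) {
  all(c("region", "value") %in% colnames(df))
}

# Three rows copied from the head(df_pop_county) output in Step 2.
my_counties <- data.frame(
  region = c(1001, 1003, 1005),       # county FIPS codes
  value  = c(54590, 183226, 27469)    # 2012 population estimates
)
stopifnot(has_choropleth_columns(my_counties))

# Draw the basic map only if both packages are available.
if (requireNamespace("choroplethr", quietly = TRUE) &&
    requireNamespace("choroplethrMaps", quietly = TRUE)) {
  library(choroplethr)
  data(df_pop_county)
  stopifnot(has_choropleth_columns(df_pop_county))
  print(county_choropleth(df_pop_county))
}
```

Checking the column names first gives a clearer failure than passing a badly shaped data frame straight to the mapping function.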



Adding a title and legend is as simple as adding parameters to county_choropleth:

county_choropleth(df_pop_county,
                  title = "2012 County Population Estimates",
                  legend = "Population")


Step 3: Experiment with the Colors

By default, county_choropleth bins the values into seven quantiles. That is, seven colors are used, and each color covers roughly the same number of regions. The number of quantiles can be changed with the num_colors parameter. For example, num_colors = 2 will show which counties are above and below the median:


county_choropleth(df_pop_county,
                  title = "2012 County Population Estimates",
                  legend = "Population", num_colors = 2)
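To see what quantile binning means in practice, the sketch below splits a handful of values into equal-sized groups using base R only. It illustrates the idea and is not choroplethr's internal code; quantile_bins is a hypothetical helper, and the six values are the populations printed in Step 2.

```r
# Hypothetical helper: assign each value to one of n quantile bins, so that
# each bin holds roughly the same number of values.
# Assumes the quantile breaks are distinct (cut() fails on duplicate breaks).
quantile_bins <- function(values, n) {
  breaks <- quantile(values, probs = seq(0, 1, length.out = n + 1))
  cut(values, breaks = breaks, include.lowest = TRUE, labels = FALSE)
}

# The six county populations printed in Step 2.
pop <- c(54590, 183226, 27469, 22769, 57466, 10779)

# n = 2 splits at the median: bin 1 is below it, bin 2 is above it.
bins <- quantile_bins(pop, 2)
print(data.frame(value = pop, bin = bins))
print(table(bins))
```

With two bins, three of the six sample counties fall below the median and three above it, which is the same above/below split the two-color map shows for the whole country.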


Setting num_colors = 1 uses a continuous color scale instead of quantiles. This is useful for seeing outliers in the data:

county_choropleth(df_pop_county,
                  title = "2012 County Population Estimates",
                  legend = "Population", num_colors = 1)


Los Angeles County (FIPS code 6037) has a population of almost 10 million, which is far larger than any other county in the US.
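An outlier like this can also be confirmed directly from the data frame. The sketch below uses a hypothetical helper, largest_region, to pull out the row with the highest value. It is demonstrated on the six rows printed in Step 2, and the lookup on the full df_pop_county data is guarded in case choroplethr is not installed.

```r
# Hypothetical helper: return the row with the largest value.
largest_region <- function(df) {
  df[which.max(df$value), ]
}

# The six rows printed in Step 2.
sample_pop <- data.frame(
  region = c(1001, 1003, 1005, 1007, 1009, 1011),
  value  = c(54590, 183226, 27469, 22769, 57466, 10779)
)
print(largest_region(sample_pop))  # region 1003 is the largest of these six

# On the full data set, this should return the county discussed above.
if (requireNamespace("choroplethr", quietly = TRUE)) {
  data(df_pop_county, package = "choroplethr")
  print(largest_region(df_pop_county))
}
```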

Step 4: More Demographics

Eight demographic statistics from 2013 are available in the data frame df_county_demographics:

data(df_county_demographics)
colnames(df_county_demographics)

## [1] "region"            "total_population"  "percent_white"
## [4] "percent_black"     "percent_asian"     "percent_hispanic"
## [7] "per_capita_income" "median_rent"       "median_age"

We can map any of them by creating a new column in the data frame called “value”, and setting it equal to the value we want to map:

df_county_demographics$value = df_county_demographics$percent_white
county_choropleth(df_county_demographics,
                  title = "2013 County Demographics\nPercent White",
                  legend = "Percent White")
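The same pattern works for any of the other columns. As a sketch, the hypothetical helper with_value_column below copies a named column into value and returns the modified data frame. The toy data frame contains made-up placeholder numbers that only show the mechanics, and the call on the real data is guarded in case the packages are not installed.

```r
# Hypothetical helper: copy the chosen column into "value" so the data
# frame has the shape county_choropleth expects.
with_value_column <- function(df, column) {
  stopifnot(column %in% colnames(df))
  df$value <- df[[column]]
  df
}

# Made-up placeholder numbers, used only to show the mechanics.
toy <- data.frame(region = c(1, 2), median_age = c(30, 40))
toy <- with_value_column(toy, "median_age")
print(toy)

# With the real data, map median age instead of percent white.
if (requireNamespace("choroplethr", quietly = TRUE) &&
    requireNamespace("choroplethrMaps", quietly = TRUE)) {
  library(choroplethr)
  data(df_county_demographics)
  df_age <- with_value_column(df_county_demographics, "median_age")
  print(county_choropleth(df_age,
                          title  = "2013 County Demographics\nMedian Age",
                          legend = "Median Age"))
}
```

Returning a copy rather than modifying df_county_demographics in place makes it easier to map several statistics in one session without losing track of which column value currently holds.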



I hope that you have enjoyed this introduction to mapping county demographics in R. Similar functionality exists for mapping state demographics; see ?state_choropleth for details.

