This note shows a quick way to draw convex hulls, using the dplyr and ggplot2 packages in R.
Our example data is a dataset of European parliamentary constituencies, some of which have been successfully geocoded with the help of the ggmap package. The package taps into Google Maps to find approximate coordinates for addresses, which worked well for most constituencies after some light tweaking of their names.
You can get the data by running this script.
Assuming that you have loaded ggplot2, the data can be represented as a set of (sometimes duplicated) coordinates within each country. Some of the scatterplots below should be familiar to European readers, especially those for France and Italy:
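As a rough sketch of that kind of plot, the snippet below facets a scatterplot by country. The toy data frame and its column names (`country`, `lon`, `lat`) are assumptions made for illustration; the real constituency data may be named differently.

```r
library(ggplot2)

# Hypothetical toy data standing in for the geocoded constituencies:
# two "countries", each with five coordinate pairs.
d <- data.frame(
  country = rep(c("A", "B"), each = 5),
  lon = c(0, 2, 1, 2, 0, 10, 11, 12, 12, 10),
  lat = c(0, 0, 1, 2, 2, 0, 1, 0, 2, 2)
)

# One panel per country; free scales let each panel zoom in on its own points.
p <- ggplot(d, aes(x = lon, y = lat)) +
  geom_point() +
  facet_wrap(~ country, scales = "free")

print(p)
```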
Let's now draw lines around the points of each country, i.e. convex hulls. R comes with a convex hull function, chull, that returns an ordered vector of row numbers; the coordinates located on these rows are the vertices of the convex hull.
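To see what the convex hull function returns, here is a minimal sketch using base R's `chull` on a handful of made-up points (the coordinates are hypothetical, not taken from the constituency data):

```r
# Toy coordinates: four corners of a square, plus one interior point on row 3.
pts <- data.frame(
  lon = c(0, 2, 1, 2, 0),
  lat = c(0, 0, 1, 2, 2)
)

# chull() returns the row numbers of the hull vertices, in clockwise order.
idx <- chull(pts$lon, pts$lat)

# Subsetting by idx keeps only the hull vertices, already in drawing order.
hull_pts <- pts[idx, ]

print(idx)
print(hull_pts)
```

Row 3, the interior point, should not appear in `idx`, since it is not a vertex of the hull.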
For every country, let's number the rows from 1 to n, the total number of rows. Let's then encode these numbers as a factor, while setting the levels of that factor to the results of the convex hull function. Last, let's order the data based on this new variable.
The dplyr package offers a simple way to perform all these operations:
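A minimal sketch of these steps might look as follows. It runs on toy data with assumed column names (`country`, `lon`, `lat`), and it converts the factor to its integer codes, which is a choice made here so that the column combines cleanly across countries. Treat it as one possible rendering of the recipe rather than the exact original code.

```r
library(dplyr)

# Hypothetical toy data: two "countries", each a square with one interior point.
d <- data.frame(
  country = rep(c("A", "B"), each = 5),
  lon = c(0, 2, 1, 2, 0, 10, 11, 12, 12, 10),
  lat = c(0, 0, 1, 2, 2, 0, 1, 0, 2, 2)
)

h <- d %>%
  group_by(country) %>%
  # Number the rows 1..n, encode them as a factor whose levels are the
  # chull() row numbers, then keep the integer codes: the position of the
  # row in the hull, or NA when the row is not a hull vertex.
  mutate(
    hull = as.integer(factor(seq_len(n()), levels = chull(lon, lat)))
  ) %>%
  ungroup() %>%
  # Order by country, then by hull position; NA rows sort last.
  arrange(country, hull)

print(h)
```

The expression `match(seq_len(n()), chull(lon, lat))` would give the same positions without going through a factor.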
The hull variable now contains either missing values on rows that are not in the convex hull, or numbers corresponding to the position of the row in the convex hull. Let's send the rows that have a position to a polygon geometry, which will draw the convex hulls of each country, and overlay the full set of coordinates:
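The plotting step could be sketched as below, again on toy data with assumed column names (`country`, `lon`, `lat`, `hull`). The hull positions are computed inline so that the snippet runs on its own.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: two "countries", each a square with one interior point.
d <- data.frame(
  country = rep(c("A", "B"), each = 5),
  lon = c(0, 2, 1, 2, 0, 10, 11, 12, 12, 10),
  lat = c(0, 0, 1, 2, 2, 0, 1, 0, 2, 2)
)

# Position of each row in its country's hull (NA if not a vertex), then sort.
h <- d %>%
  group_by(country) %>%
  mutate(hull = match(seq_len(n()), chull(lon, lat))) %>%
  ungroup() %>%
  arrange(country, hull)

# Rows with a hull position are the polygon vertices, already in drawing order.
hull_rows <- filter(h, !is.na(hull))

p <- ggplot(h, aes(x = lon, y = lat)) +
  geom_polygon(data = hull_rows, alpha = 0.5) +  # the convex hulls
  geom_point() +                                 # all coordinates on top
  facet_wrap(~ country, scales = "free")

print(p)
```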
The result is correct only because we took the precaution of ordering the data according to the row numbers returned by the convex hull function. Try plotting the unordered data, and you will get a messy set of polygons that will not reflect the correct boundaries of the convex hulls.
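One way to check the effect of ordering without plotting is to compare the area enclosed by the same vertices taken in two different orders. The sketch below uses a made-up square whose rows are stored in a zigzag order, and a small helper, `shoelace`, that is introduced here purely for illustration.

```r
# Shoelace formula: area enclosed by vertices taken in the given order.
shoelace <- function(x, y) {
  abs(sum(x * c(y[-1], y[1]) - c(x[-1], x[1]) * y)) / 2
}

# Hypothetical toy square whose corners are stored in a zigzag row order.
x <- c(0, 2, 2, 0)
y <- c(0, 2, 0, 2)

# Row numbers of the hull vertices, in clockwise order.
idx <- chull(x, y)

area_rows <- shoelace(x, y)            # row order: a self-crossing "bowtie"
area_hull <- shoelace(x[idx], y[idx])  # hull order: the actual square

print(c(rows = area_rows, hull = area_hull))
```

In row order the two halves of the bowtie cancel out and the formula yields 0, whereas the hull order recovers the full area of 4 for this 2-by-2 square.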
The hulls of some countries, such as France, Italy, or Portugal, include some constituencies that are located overseas. In the case of Portugal, those constituencies are the Autonomous Regions of the Azores and Madeira:
The code for this note appears in this Gist, along with the data, which might still contain some mistakes. Please leave a comment on the Gist if you find an error in the constituency geocodes.
- First published on January 11th, 2016