Unit tests for R packages

Karl Broman has written a nice blog post recommending unit tests for R packages. Here are a few more pointers on how to write these tests.

I was introduced to unit tests by Barret Schloerke, who asked me to write some for the network plotting functions that I have contributed to the GGally package. His request mentioned two packages that I had never used before:

the testthat package, which provides the core unit test functions; and
the covr package, which is useful to find untested lines in the code.

Once both packages are installed, the basic logic of unit testing with testthat can be learnt from the “Testing” chapter in Hadley Wickham's R Packages book. Reasons for writing unit tests in the first place are listed in Wickham's article about testthat in the R Journal, and are well illustrated in Broman's blog post.

How many tests lie ahead

At the very least, writing unit tests for a function means trying out every argument of that function, as well as every situation where the function sends back an error, message or warning. As a consequence, functions with many arguments and/or conditional statements will require many tests.
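As a rough illustration of what "trying out every argument and every condition" can look like with testthat, here is a minimal sketch. The function scale_values and its arguments are hypothetical, made up for this example; only test_that and the expect_* functions come from testthat.

```r
library(testthat)

# Hypothetical function, written only for this illustration:
# two optional arguments, plus one error, one warning and one message.
scale_values <- function(x, center = TRUE, verbose = FALSE) {
  if (!is.numeric(x)) {
    stop("'x' must be numeric")
  }
  if (anyNA(x)) {
    warning("missing values removed")
    x <- x[!is.na(x)]
  }
  if (verbose) {
    message("scaling ", length(x), " values")
  }
  if (center) x - mean(x) else x
}

test_that("scale_values() is tested for each argument and condition", {
  # default behaviour
  expect_equal(scale_values(c(1, 2, 3)), c(-1, 0, 1))
  # the 'center' argument
  expect_equal(scale_values(c(1, 2, 3), center = FALSE), c(1, 2, 3))
  # the error branch
  expect_error(scale_values("a"), "must be numeric")
  # the warning branch
  expect_warning(scale_values(c(1, NA, 3)), "missing values removed")
  # the 'verbose' argument and its message
  expect_message(scale_values(c(1, 2, 3), verbose = TRUE), "scaling 3 values")
})
```

Even this small function needs five expectations: one for the default behaviour, and one more for each argument and each condition it can signal.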

Furthermore, the first thing that you will take away from writing these tests is a proper idea of the many mistakes that users might make while trying to use your function. Writing tests will therefore lead you to add control flow statements to your code, which will in turn lead to more tests.

Also note that it is not always possible to write unit tests for every aspect of a function:

For instance, if your function tries to download something and knows what to do if a network error occurs, the only way to test that part of the code is by provoking that error programmatically, which might or might not be doable.
Similarly, if your function tests for the presence of a package and returns an error if that package is not installed, the only way to test that part of the code would be to uninstall that package, run the test, and then re-install the package before resuming testing.
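When provoking the error programmatically is doable, one possible approach is to let the function take its downloader as an argument, so that a test can pass in a function that always fails. The sketch below assumes that you are free to change the function's signature; fetch_data, its downloader argument and failing_downloader are hypothetical names invented for this example.

```r
library(testthat)

# Hypothetical function: the downloader is an argument (defaulting to
# utils::download.file), so a test can swap in one that fails.
fetch_data <- function(url, downloader = utils::download.file) {
  dest <- tempfile()
  tryCatch(
    {
      downloader(url, dest, quiet = TRUE)
      dest
    },
    error = function(e) {
      # what the function "knows to do" on a network error:
      # warn the user and return NULL instead of a file path
      warning("download failed: ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

# Stand-in downloader that simulates a network error (hypothetical).
failing_downloader <- function(url, destfile, ...) {
  stop("network unreachable")
}

test_that("fetch_data() recovers from a network error", {
  expect_warning(
    out <- fetch_data("http://example.invalid/file.csv", failing_downloader),
    "download failed"
  )
  expect_null(out)
})
```

No network access is needed to run this test, since the real downloader is never called. The same idea does not carry over as easily to the missing-package case, which is why that one remains harder to cover.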

The remarks above suggest that, if your code is made of lengthy functions whose many optional arguments are handled through "if/else" conditional statements, then you are looking at code that will require an extensive battery of unit tests to reach high coverage.

Computing test coverage

A simple way to get started with unit tests is to use the examples of a function as the first set of tests, and then to calculate the percentage of code that gets tested – or "covered" – through these examples.

The covr package makes it very easy to calculate this percentage, thanks to its function_coverage and package_coverage functions. The former is well illustrated in Barret's tests request, and the latter, when called without any argument from the package directory, computes the coverage percentage at the level of the entire package.
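Here is a minimal sketch of function_coverage applied to a single function. The function clamp is hypothetical and exists only for this example; the keep.source option is set because covr locates code through source references, which scripts run non-interactively may otherwise drop.

```r
library(covr)

# Keep source references, which covr uses to locate lines of code
# (assumption: only needed when running outside an interactive session).
options(keep.source = TRUE)

# Hypothetical function with two early returns.
clamp <- function(x, lower = 0, upper = 1) {
  if (x < lower) {
    return(lower)
  }
  if (x > upper) {
    return(upper)
  }
  x
}

# Run a single call, as an example in the documentation might,
# and record which parts of clamp() it exercises.
cov <- function_coverage(clamp, code = clamp(0.5))

# Coverage as a percentage: the two early returns are never reached
# by this call, so the figure stays below 100.
percent_coverage(cov)

# For a whole package, run from the package directory instead:
# package_coverage()                   # coverage from the tests
# package_coverage(type = "examples")  # coverage from the examples only
```

Adding calls such as clamp(-1) and clamp(2) to the code argument would exercise the two remaining branches.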

The function_coverage and package_coverage functions produce objects that can be passed to the zero_coverage function, which will list the lines of your code that are not getting tested. Even better, if you use zero_coverage from within RStudio, the lines will be shown using markers similar to those used by code diagnostics.
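A short sketch of that workflow follows, again on a hypothetical function (sign_label) that is not part of any package. It is written for a plain R session, where zero_coverage returns its result as a data frame.

```r
library(covr)

# Keep source references so that covr can report line numbers
# (assumption: only needed when running outside an interactive session).
options(keep.source = TRUE)

# Hypothetical function with two branches.
sign_label <- function(x) {
  if (x < 0) {
    "negative"
  } else {
    "non-negative"
  }
}

# Only the 'else' branch is exercised by this call.
cov <- function_coverage(sign_label, code = sign_label(1))

# List the parts of the code that were never run.
untested <- zero_coverage(cov)
print(untested)
```

Each row of the result points to a piece of code that no test or example has reached yet, which makes it a convenient to-do list for the next tests to write.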

The coverage functions above are meant to help you add tests until every function in the package gets as close to full (100%) coverage as possible.

First published on December 8th, 2015
