Unit testing & continuous integration
How to find bugs as soon as you create them
David A. Selby
R-thritis Computing Group
19 November 2021

Typically, we test code to ensure its output meets our expectations.

Great Expectations

is_plausible <- function(value, minimum = -Inf, maximum = Inf)
  value >= minimum & value <= maximum

What do you expect the output to be?

is_plausible(c(10, 37, -999), min = 0, max = 112)

# [1]  TRUE  TRUE FALSE

is_plausible("2", min = 0, max = 10)

# [1] FALSE

Why? Comparing a string with a number coerces the number to a string, and "10" < "2" in lexical order
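The coercion can be checked one comparison at a time. This is a minimal base R sketch; the variable names (lower_ok, upper_ok and so on) are only illustrative and do not come from the slides.

```r
# Comparing a string with a number coerces the number to a string,
# so each bound is compared lexically rather than numerically.
lower_ok <- "2" >= 0    # same comparison as "2" >= "0"
upper_ok <- "2" <= 10   # same comparison as "2" <= "10"

# Both bounds must hold for the value to count as plausible.
both_ok <- lower_ok & upper_ok

# Converting the value to numeric first gives the numeric comparison.
numeric_ok <- as.numeric("2") >= 0 & as.numeric("2") <= 10

print(c(lower = lower_ok, upper = upper_ok, both = both_ok, numeric = numeric_ok))
```

The upper bound is the one that fails: "2" sorts after "10" because the comparison starts with the first character.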


Should is_plausible.character:

always return NA?
throw an error?
try to coerce to numeric?
always return FALSE? (But is it really implausible?)

As an exercise, try sorting the vector c(2, 3, "a", 11, "b", "+4")

Unit testing

Unit testing in R

Unit testing is the process in which the smallest testable parts of source code are tested individually and independently, usually in an automated way.

Formalises the testing of code
Makes it easier to identify bugs when they are introduced
Helps ensure that the code meets all necessary criteria
Reassures the user that the code works correctly


Unit testing in R

testthat is an R package created by Hadley Wickham for the purpose of writing unit tests for R code. It is available on CRAN.

Other unit testing packages are available (but not covered here), e.g. RUnit, testrmd.


Unit testing with testthat

The testthat framework comprises three parts:

Expectations: the core functions. They all have prefix expect_
Tests: a series of expectations about one feature, wrapped in test_that()
Files: containing a set of tests of related functionality


Unit testing with testthat

Expectations in testthat compare an object with a reference value or property. If they do not match, an error is thrown.

(If they do match, the tested object is returned, invisibly.)

expect_equal
expect_is
expect_length
expect_true
expect_error
...
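A short sketch of both behaviours, assuming testthat is installed. The vector x and the names returned, same_object and failure are made up for illustration. A passing expectation hands back the tested object invisibly; a failing one signals an error, caught here with try() so the script keeps running.

```r
library(testthat)

x <- c(1.5, 2.5, 3.5)

# A passing expectation returns the tested object (invisibly),
# so the result can be captured and compared with the input.
returned <- expect_length(x, 3)
same_object <- identical(returned, x)

# A failing expectation throws an error; try() catches it here
# so that the rest of the script can continue.
failure <- try(expect_length(x, 4), silent = TRUE)
threw_error <- inherits(failure, "try-error")

print(c(same_object = same_object, threw_error = threw_error))
```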

Unit testing with testthat

expect_equal compares an object with a reference value, allowing a small numerical tolerance

hyp <- 3^2 + 4^2
expect_equal(hyp, 25) # Runs without error
expect_equal(hyp, 26) # Throws an error:

# Error: `hyp` not equal to 26.
# 1/1 mismatches
# [1] 25 - 26 == -1

expect_equal(sqrt(2), 1.41) # Throws an error:

# Error: sqrt(2) not equal to 1.41.
# 1/1 mismatches
# [1] 1.41 - 1.41 == 0.00421

expect_equal(sqrt(2), 1.41, tolerance = .01) # No error thrown


Unit testing with testthat

expect_is checks the class of an object (deprecated in testthat's 3rd edition in favour of expect_type, expect_s3_class and expect_s4_class)

val <- 43.7
expect_is(val, 'numeric')   # Runs without error
expect_is(val, 'character') # Produces an error:

# Error: `val` inherits from `'numeric'` not `'character'`.

expect_length checks the number of elements in a vector

expect_length(letters, 26) # Runs without error
expect_length(letters, 25) # Throws an error:

# Error: `letters` has length 26, not length 25.
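For code written against testthat's 3rd edition, the same kind of check might be written with the more specific functions. This is a sketch rather than a drop-in replacement: expect_type compares against typeof(), so a number such as 43.7 has type "double" rather than "numeric". The data frame df is a hypothetical example object.

```r
library(testthat)

val <- 43.7

# expect_type() checks the base type reported by typeof()
expect_type(val, "double")    # Runs without error
expect_type(1L, "integer")    # Runs without error

# expect_s3_class() checks the S3 class of an object
df <- data.frame(x = 1:3)     # hypothetical example object
expect_s3_class(df, "data.frame")  # Runs without error
```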


Unit testing with testthat

Some expectation functions check against a (fixed) condition, rather than a custom reference value.

codes <- c(no = FALSE, yes = TRUE)
expect_true(codes['yes'])    # Runs without error
expect_true('yes')           # Throws an error:

# Error: "yes" is not TRUE
# 
# `actual` is a character vector ('yes')
# `expected` is a logical vector (TRUE)

expect_named(codes)          # Runs without error
expect_false(as.logical(0))  # Runs without error


Unit testing with testthat

Other such functions include expect_error, which checks that a function call throws an error when evaluated.

Analogously, there are expect_warning, expect_message, etc.

expect_error(3.14 + 'hello')  # Runs without error
expect_error(3.14 < 'hello')  # Throws an error:

# Error: `3.14 < "hello"` did not throw an error.

Back to our example

expect_true( is_plausible(5, min = 0, max = 10) )   # No error
expect_false( is_plausible(-1, min = 0, max = 1) )  # No error
expect_true( is_plausible('2', min = 0, max = 10) ) # Error:

# Error: is_plausible("2", min = 0, max = 10) is not TRUE
# 
# `actual`:   FALSE
# `expected`: TRUE

Now decide:

Fix the code just enough to satisfy the expectation,
Or: have a rethink?

Back to our example

Fix to pass the test (not recommended here; as.numeric() warns and returns NA for strings such as 'abc'):

is_plausible <- function(value, minimum = -Inf, maximum = Inf) {
  value <- as.numeric(value)
  value >= minimum & value <= maximum
}
expect_true( is_plausible('2', min = 0, max = 10) ) # No error

Or rethink, by expecting stricter handling of inputs:

is_plausible <- function(value, minimum = -Inf, maximum = Inf) {
  stopifnot(is.numeric(value))
  value >= minimum & value <= maximum
}
expect_error( is_plausible('2', min = 0, max = 10) ) # No error


Writing tests

test_that('Scalar integers', {
  expect_true(is_plausible(5, min = 0, max = 10))
  expect_false(is_plausible(-1, min = 0, max = 1))
})

# Test passed

test_that('Integer vectors', {
  expect_equal(is_plausible(0:10, min = 0, max = 10),
               rep(TRUE, 11))
  expect_equal(is_plausible(-1:2, min = 0, max = 1),
               c(FALSE, TRUE, TRUE, FALSE))
})

# Test passed


Writing tests

test_that('Handle unusual or missing inputs', {
  expect_error(is_plausible('2', min = 0, max = 10))
  expect_error(is_plausible(min = 0, max = 10))
  expect_error(is_plausible(2, min = "0", max = 10))
  expect_error(is_plausible(2, min = 0, max = "10"))
})

# -- Failure (<text>:4:3): Handle unusual or missing inputs ----------------------
# `is_plausible(2, min = "0", max = 10)` did not throw an error.
# 
# -- Failure (<text>:5:3): Handle unusual or missing inputs ----------------------
# `is_plausible(2, min = 0, max = "10")` did not throw an error.
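The two failures arise because only value is validated. One possible way to make all four expectations pass, offered as a sketch and not necessarily the fix the author had in mind, is to apply the same check to minimum and maximum:

```r
library(testthat)

# Validate all three arguments, not just `value` (an assumed fix).
is_plausible <- function(value, minimum = -Inf, maximum = Inf) {
  stopifnot(is.numeric(value), is.numeric(minimum), is.numeric(maximum))
  value >= minimum & value <= maximum
}

test_that('Handle unusual or missing inputs', {
  expect_error(is_plausible('2', min = 0, max = 10))
  expect_error(is_plausible(min = 0, max = 10))
  expect_error(is_plausible(2, min = "0", max = 10))
  expect_error(is_plausible(2, min = 0, max = "10"))
})
```

Valid numeric calls such as is_plausible(5, min = 0, max = 10) behave as before.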


Unit testing workflows

Add a tests/ folder to an R package:
- run tests with <Ctrl> + <Shift> + T (in RStudio)
- tests will also run during R CMD check
Or run tests locally in an analysis folder:
- test_file() / test_dir()
- re-test with every edit: auto_test()
Or use R Markdown 'test chunks' with error=TRUE, or testrmd

Note: most documentation for testthat assumes you are writing a package. Future talk: how to do analysis as an R package...
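A rough sketch of the local, non-package workflow using test_file(), assuming testthat is installed. The folder and file names are hypothetical, and everything is written to a temporary directory so the example stands alone. To keep it self-contained the function is defined inside the test file; in a real analysis it would more likely be sourced from a separate script.

```r
library(testthat)

# Hypothetical analysis folder, created in a temporary directory.
analysis_dir <- file.path(tempdir(), "analysis")
dir.create(analysis_dir, showWarnings = FALSE)

# Write a small test file. File names starting with "test" are the
# ones test_dir() looks for.
test_path <- file.path(analysis_dir, "test-is_plausible.R")
writeLines(c(
  "is_plausible <- function(value, minimum = -Inf, maximum = Inf) {",
  "  stopifnot(is.numeric(value))",
  "  value >= minimum & value <= maximum",
  "}",
  "test_that('Scalar inputs', {",
  "  expect_true(is_plausible(5, min = 0, max = 10))",
  "  expect_false(is_plausible(-1, min = 0, max = 1))",
  "})"
), test_path)

# Run the file and summarise the results as a data frame.
results <- as.data.frame(test_file(test_path, reporter = "silent"))
n_failed <- sum(results$failed)
print(results[, c("test", "nb", "failed")])
```

Running test_dir(analysis_dir) instead would pick up every test file in the folder.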


Test-driven development

1. Write a test before any other code.
2. Check that the test fails.
3. Write enough code to make it pass.
4. Add another test and iterate steps 1–3.
5. Refactor the code, while ensuring it passes all tests.
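One pass through the first three steps might look like the sketch below, assuming testthat is installed. The new requirement (minimum must not exceed maximum) and the helper new_test() are invented for illustration and are not part of the original talk. try() is used only so that the deliberate failure does not stop the script.

```r
library(testthat)

# Existing code from the earlier slides: only `value` is validated.
is_plausible <- function(value, minimum = -Inf, maximum = Inf) {
  stopifnot(is.numeric(value))
  value >= minimum & value <= maximum
}

# Step 1: write the test first (hypothetical new requirement:
# a minimum greater than the maximum should be rejected).
new_test <- function() {
  expect_error(is_plausible(5, minimum = 10, maximum = 0))
}

# Step 2: check that the test fails against the current code.
first_run <- try(new_test(), silent = TRUE)
failed_before <- inherits(first_run, "try-error")

# Step 3: write just enough code to make it pass.
is_plausible <- function(value, minimum = -Inf, maximum = Inf) {
  stopifnot(is.numeric(value), minimum <= maximum)
  value >= minimum & value <= maximum
}
second_run <- try(new_test(), silent = TRUE)
passed_after <- !inherits(second_run, "try-error")

print(c(failed_before = failed_before, passed_after = passed_after))
```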


In a way, R CMD check could be considered test-driven development, because the tests your package has to pass were written long before the package.

Continuous integration

Continuous integration

In software engineering, continuous integration (CI) is the practice of merging all changes to a central repository, and automatically rebuilding & testing the code after every change.

Use version control to track/manage changes
Use CI software to automatically rerun/retest code
Identify bugs/conflicts as soon as they're created

Tools: ~~Travis CI~~, GitHub Actions, Bitbucket Pipelines, GitLab CI/CD, Jenkins


People used to recommend Travis CI, but the company was sold and stopped offering a free service. When browsing the web, you might still find old articles describing an R + GitHub + Travis workflow.

Continuous integration

Make life easier for yourself:

Project metadata in a (dummy) DESCRIPTION file
R code in R/ subfolder
Tests in a tests/testthat/ subfolder
Run tests quickly: test_local() / <Ctrl>+<Shift>+T
Use Git for version control & CI for automatic testing

Minimal working DESCRIPTION file:

Package: blah
Version: 0.1

Example repo: https://github.com/Selbosh/unittesting
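The layout described above can be set up with a few lines of base R. This sketch builds it in a temporary directory. The project name blah comes from the DESCRIPTION example; the file names is_plausible.R and test-is_plausible.R are assumptions chosen for illustration.

```r
# Hypothetical project root, created in a temporary directory.
project <- file.path(tempdir(), "blah")
dir.create(file.path(project, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project, "tests", "testthat"), recursive = TRUE,
           showWarnings = FALSE)

# Minimal DESCRIPTION file, as shown on the slide.
writeLines(c("Package: blah", "Version: 0.1"),
           file.path(project, "DESCRIPTION"))

# R code goes in the R/ subfolder.
writeLines(c(
  "is_plausible <- function(value, minimum = -Inf, maximum = Inf) {",
  "  stopifnot(is.numeric(value))",
  "  value >= minimum & value <= maximum",
  "}"
), file.path(project, "R", "is_plausible.R"))

# Tests go in the tests/testthat/ subfolder.
writeLines(c(
  "test_that('Scalar inputs', {",
  "  expect_true(is_plausible(5, min = 0, max = 10))",
  "})"
), file.path(project, "tests", "testthat", "test-is_plausible.R"))

# Read the metadata back and list the resulting layout.
desc <- read.dcf(file.path(project, "DESCRIPTION"))
layout <- list.files(project, recursive = TRUE)
print(desc)
print(layout)
```

From the project root, test_local() (or the keyboard shortcut) would then run the tests, and a CI service can be configured to do the same after every push.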


Thanks!

Based on a talk by Lewis Rendell
at Warwick R User Group, 2017.


Next meeting

Friday 3 December

Advent of Code discussion

https://adventofcode.com/

Attempt days 1–2 and share your approach
