Testing an R package against a Julia package on GitHub Actions
Something I have learned the hard way: every single line of code that I write should be accompanied by a unit test. Unit tests are not only useful for me as a developer to spot bugs, but a thorough sequence of unit tests should also increase others’ confidence in the quality of my code.
For statistical methods and algorithms, one great way to test if everything is working as intended is to test one’s own implementation of an algorithm against someone else’s.
This idea is expressed in rOpenSci's statistical software standards:
- G5.4 Correctness tests to test that statistical algorithms produce expected results to some fixed test data sets (potentially through comparisons using binding frameworks such as RStata).
  - G5.4b For new implementations of existing methods, correctness tests should include tests against previous implementations. Such testing may explicitly call those implementations in testing, preferably from fixed-versions of other software, or use stored outputs from those where that is not possible.
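In practice, such a correctness test can be as simple as comparing a freshly computed statistic against a stored output from a fixed version of another implementation. A minimal testthat sketch (the function and file names are hypothetical):

# compare a newly computed p-value against a stored reference value,
# e.g. one computed once with a fixed version of another implementation
library(testthat)

test_that("new implementation reproduces stored reference output", {
  reference_p_val <- readRDS(test_path("fixtures", "reference_p_val.rds"))
  new_p_val <- my_new_implementation(fixed_test_data)  # hypothetical function
  expect_equal(new_p_val, reference_p_val, tolerance = 1e-6)
})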
Preferably, these unit tests should be automated via continuous integration tools such as GitHub Actions. CI is super useful. First, a CI workflow can be triggered to run all tests at every commit, so I definitely won’t forget to run them. Second, CI via GitHub Actions communicates to my package’s users that I have actually and successfully run all unit tests on the latest development version. Last, all tests run “in the cloud”, so I am not blocking my own computer for (potentially) multiple minutes.1
As expressed in rOpenSci’s guideline G5.4b, in scientific computing the same algorithm might be implemented only once, or, in the lucky case that another implementation exists, it might be written in another language.
This applies to the “fast” cluster bootstrap algorithm implemented in fwildclusterboot. To my knowledge, there exist an R implementation (fwildclusterboot), the original Stata version (boottest) and, recently, a Julia package written by David Roodman, WildBootTests.jl.
Up until now, fwildclusterboot's test suite has been divided into two separate parts. Part one’s job was to test whether the methods for different regression packages in R produce consistent results (“internal consistency”). These tests could easily be run on CRAN and GitHub Actions, as they only require fwildclusterboot and a working R installation.
Part two of the test suite checked fwildclusterboot's “external validity” by running the bootstrap in both R and Stata (via boottest and the RStata package). Before a CRAN submission, I would run all of these tests on my local machine. Of course, the unit tests involving Stata could not be run on CRAN, and I did not think (and am now uncertain) that these tests could be automated via GitHub Actions.2
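For illustration, such an R-vs-Stata test might have looked roughly like the sketch below (the data set, model and Stata path are placeholders; RStata's stata() function ships a data frame to Stata and executes the commands):

# hypothetical sketch: running boottest in Stata from R via RStata
library(RStata)
options("RStata.StataPath" = "stata-mp",  # path to the local Stata binary
        "RStata.StataVersion" = 16)

# run the same regression and wild cluster bootstrap in Stata;
# 'voters' is a placeholder data set with outcome y, regressor x, cluster g
stata(c("regress y x, cluster(g)",
        "boottest x, reps(9999) nograph"),
      data.in = voters)
# boottest's printed p-value is then compared against the p-value
# returned by fwildclusterboot::boottest()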
As it happened, my Stata student license expired the other week, and when I checked StataCorp's website for renewal fees, I learned that a one-year Stata license for non-academics carries an 840 € price tag. To me, this seemed a hefty price for running a few unit tests for fwildclusterboot, and it was a good incentive to update my unit tests.
Luckily, WildBootTests.jl is now around, and as Julia is open source, I was fairly certain that I could transfer my “external” tests from Stata to Julia and deploy all tests on GitHub Actions.
How to set up a workflow that tests an R package against a Julia package
So, what does a workflow that tests an R package against a Julia package look like?
First of all, I have used the skeleton of fwildclusterboot to produce a wrapper around WildBootTests.jl: wildboottestjlr. All data pre-processing is handled by functions copied from fwildclusterboot. To run the bootstrap, R sends all required objects to Julia (via the excellent JuliaConnectoR package, on which I will blog in the future). In a second step, I have added wildboottestjlr to fwildclusterboot's dependencies and initiated a GitHub Actions workflow via usethis::use_github_actions(). (Note that it is in fact not necessary to write an entire R wrapper package around the Julia algorithm you want to test against. wildboottestjlr exists because WildBootTests.jl is an order of magnitude faster than fwildclusterboot and because it has many additional features, e.g. a wild cluster bootstrap for instrumental variables.)
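As a rough illustration of how the R-Julia bridge works (this is not wildboottestjlr's actual source, and the final call is an assumption about WildBootTests.jl's interface), JuliaConnectoR can import the Julia package and call into it directly from R:

# minimal sketch: calling WildBootTests.jl from R via JuliaConnectoR
library(JuliaConnectoR)

# quick check that R finds a Julia installation (requires JULIA_BINDIR to be set)
juliaEval("1 + 1")

# import the Julia package; its exported functions become callable via $
WildBootTests <- juliaImport("WildBootTests")

# a bootstrap call would then look roughly like this (argument names are
# simplified assumptions; see WildBootTests.jl's documentation):
# res <- WildBootTests$wildboottest(R, r, resp = y, predexog = X, clustid = g)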
The generated workflow YAML looks like this:
# For help debugging build failures open an issue on the RStudio community with the 'github-actions' tag.
# https://community.rstudio.com/new-topic?category=Package%20development&tags=github-actions
on:
  push:
    branches:
      - main
      - master
  pull_request:
    branches:
      - main
      - master

name: R-CMD-check

jobs:
  R-CMD-check:
    runs-on: ${{ matrix.config.os }}

    name: ${{ matrix.config.os }} (${{ matrix.config.r }})

    strategy:
      fail-fast: false
      matrix:
        config:
          - {os: windows-latest, r: 'release'}
          - {os: macOS-latest, r: 'release'}
          - {os: ubuntu-20.04, r: 'release', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}

    env:
      R_REMOTES_NO_ERRORS_FROM_WARNINGS: true
      RSPM: ${{ matrix.config.rspm }}
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}

    steps:
      - uses: actions/checkout@v2

      - uses: r-lib/actions/setup-r@v2
        with:
          r-version: ${{ matrix.config.r }}

      - uses: r-lib/actions/setup-pandoc@v2

      - name: Query dependencies
        run: |
          install.packages('remotes')
          saveRDS(remotes::dev_package_deps(dependencies = TRUE), ".github/depends.Rds", version = 2)
          writeLines(sprintf("R-%i.%i", getRversion()$major, getRversion()$minor), ".github/R-version")
        shell: Rscript {0}

      - name: Cache R packages
        if: runner.os != 'Windows'
        uses: actions/cache@v2
        with:
          path: ${{ env.R_LIBS_USER }}
          key: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1-${{ hashFiles('.github/depends.Rds') }}
          restore-keys: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1-

      - name: Install system dependencies
        if: runner.os == 'Linux'
        run: |
          while read -r cmd
          do
            eval sudo $cmd
          done < <(Rscript -e 'writeLines(remotes::system_requirements("ubuntu", "20.04"))')

      - name: Install dependencies
        run: |
          remotes::install_deps(dependencies = TRUE)
          remotes::install_cran("rcmdcheck")
        shell: Rscript {0}

      # install julia
      - uses: julia-actions/setup-julia@v1

      # add julia to renviron
      - name: Create and populate .Renviron file
        run: echo "JULIA_BINDIR=${{ env.juliaLocation }}" >> ~/.Renviron
        shell: bash

      # install WildBootTests.jl
      - name: install WildBootTests.jl
        run: julia -e 'using Pkg; Pkg.add("WildBootTests")'
        # use shell bash to ensure consistent behavior across OS
        shell: bash

      - name: Check
        env:
          _R_CHECK_CRAN_INCOMING_REMOTE_: false
        run: rcmdcheck::rcmdcheck(args = c("--no-manual", "--as-cran"), error_on = "warning", check_dir = "check")
        shell: Rscript {0}

      - name: Upload check results
        if: failure()
        uses: actions/upload-artifact@main
        with:
          name: ${{ runner.os }}-r${{ matrix.config.r }}-results
          path: check
In a nutshell, this workflow installs R and fwildclusterboot in clean virtual environments (Windows, macOS and Ubuntu), conducts an R CMD check and thereby initiates all of fwildclusterboot's unit tests.
To enable testing against Julia and WildBootTests.jl, it further adds three ‘steps’ to the workflow: install Julia, install WildBootTests.jl (and all of its Julia dependencies), and link R and Julia so that JuliaConnectoR can communicate between the two languages. This is easily achieved by adding the lines below to the YAML file created by usethis::use_github_actions():
# install julia
- uses: julia-actions/setup-julia@v1

# add julia to renviron
- name: Create and populate .Renviron file
  run: echo "JULIA_BINDIR=${{ env.juliaLocation }}" >> ~/.Renviron
  shell: bash

# install WildBootTests.jl
- name: install WildBootTests.jl
  run: julia -e 'using Pkg; Pkg.add("WildBootTests")'
  # use shell bash to ensure consistent behavior across OS
  shell: bash
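To verify locally that the linking step works, a quick sanity check from R suffices (this is not part of the workflow itself):

# does JuliaConnectoR find the Julia installation?
Sys.getenv("JULIA_BINDIR")            # should point to Julia's bin directory
JuliaConnectoR::juliaEval("VERSION")  # returns the Julia version if the link works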
Placing these few lines in the workflow YAML before the R CMD check is triggered completes the workflow. It now downloads R and Julia, installs fwildclusterboot, wildboottestjlr and WildBootTests.jl, links R and Julia, and runs all of fwildclusterboot's unit tests. The only remaining step is to push to GitHub, and all unit tests run!
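For completeness, one of the new cross-implementation tests might look roughly like the sketch below. The data set and tolerance are hypothetical; boottest() is fwildclusterboot's main function, and wildboottestjlr is assumed to mirror its interface (it was built from fwildclusterboot's skeleton):

# hypothetical sketch of an R-vs-Julia correctness test
library(testthat)

test_that("fwildclusterboot and WildBootTests.jl produce equal p-values", {
  skip_on_cran()  # requires a local Julia installation

  # placeholder data: outcome y, regressor x, cluster variable g
  lm_fit <- lm(y ~ x, data = voters)

  res_r  <- fwildclusterboot::boottest(lm_fit, param = "x", clustid = "g", B = 9999)
  res_jl <- wildboottestjlr::boottest(lm_fit, param = "x", clustid = "g", B = 9999)

  # p-values should agree up to bootstrap sampling noise
  expect_equal(res_r$p_val, res_jl$p_val, tolerance = 0.01)
})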
The unit tests for fwildclusterboot currently run for about 30 minutes.↩︎
I have not implemented tests that compare fwildclusterboot against other R implementations of the wild cluster bootstrap, mainly for performance reasons. As resampling methods are usually time-intensive and fwildclusterboot is much faster than other R implementations, running unit tests against them would simply take a very long time.↩︎