Learning Objectives

Package Checking

Common issues

Continuous Integration

  • You can set up GitHub Actions to run R CMD check on multiple virtual machines (Windows, macOS, and Ubuntu) each time you push. This catches platform-specific problems early and keeps your package constantly checked.

  • Automatic checking each time you make a change is called continuous integration.

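  • In a package, run the following (recent usethis releases deprecate this helper in favor of usethis::use_github_action("check-standard")):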

    usethis::use_github_action_check_standard()
  • Running this will create a new file in a hidden folder at the path “.github/workflows/R-CMD-check.yaml”. This YAML file contains instructions for setting up a virtual machine, installing R and your dependencies, and running R CMD check.

  • To use it, simply commit your files and push to GitHub, then wait for the checks to run. You can see their progress by clicking on the “Actions” tab on the GitHub page of your package.

  • You don’t need to understand everything in that file, but there are some parts that you may need to edit.

  • You may comment out one of the operating systems for the check if you know that a failure there is spurious (caused by the virtual machine rather than by your package). Use # for comments in a YAML file. Below, I comment out macOS.

    strategy:
      fail-fast: false
      matrix:
        config:
          - {os: windows-latest, r: 'release'}
          # - {os: macOS-latest, r: 'release'}
          - {os: ubuntu-20.04, r: 'release', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}
          - {os: ubuntu-20.04, r: 'devel', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}
  • Sometimes (but rarely) you need to fix the install code for the dependencies. One time, {remotes} failed to install the Bioconductor packages I needed, so I had to edit the step this way:

    - name: Install dependencies
      run: |
        remotes::install_deps(dependencies = TRUE)
        remotes::install_cran("rcmdcheck")
        install.packages("BiocManager") # new line
        BiocManager::install("VariantAnnotation") # new line
      shell: Rscript {0}
  • You can see a variety of other YAML files at https://github.com/r-lib/actions/tree/v1/examples

Exercise

Recall the simreg() example from the Testing lecture. Use the edit-check workflow to further develop your package with the following capabilities:

  1. In simreg(), instead of simulating \(x\) from a standard normal, give the user the ability to choose the variance of \(x\), which we will call \(\tau^2\).

  2. It is probably difficult for the user to specify both \(\sigma^2\) (the residual variance) and \(\tau^2\) (the variance of the predictor). A better option would allow the user to provide more intuitive inputs. One possible input would be the proportion of variance explained (PVE), which we will define as \[ \text{PVE} = \frac{\beta_1^2\tau^2}{\beta_1^2\tau^2 + \sigma^2}. \] This follows from \[ \operatorname{var}(y_i) = \operatorname{var}(\beta_0 + \beta_1x_i + \epsilon_i) = \beta_1^2\tau^2 + \sigma^2, \] which holds when \(x_i\) and \(\epsilon_i\) are independent, and so \(\beta_1^2\tau^2\) is how much of the variance in \(y\) is explained by the predictor.

    Allow the user to set the PVE, the residual variance (\(\sigma^2\)), and the regression coefficient (\(\beta_1\)). To do this, you should create a new function called tau_from_pve() which will calculate the proper \(\tau^2\) given the PVE, the residual variance, and the regression coefficient. Then you can just use that \(\tau^2\) to simulate \(x\).

  3. It would probably be better to offer several ways to generate \(x\). Create a new function called simx() that will generate \(x\) values under different conditions:

    1. From \(N(0,\tau^2)\) after specifying \(\tau\)
    2. From sample(x = c(a, b), size = n, replace = TRUE) for different numeric values of a and b.
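
As a starting point for parts 2 and 3, here is a minimal sketch, not a reference solution. Solving the PVE definition above for \(\tau^2\) gives \[ \tau^2 = \frac{\text{PVE}\,\sigma^2}{\beta_1^2(1 - \text{PVE})}, \] which needs \(0 \le \text{PVE} < 1\) and \(\beta_1 \neq 0\). The argument names (pve, sigma2, beta1, tau2, type) and the "normal"/"binary" labels are my own assumptions for illustration; the exercise does not prescribe them, and your simreg() may call for a different interface.

```r
# Hypothetical helper: solve PVE = b1^2 tau^2 / (b1^2 tau^2 + sigma^2) for tau^2.
# Argument names are assumptions, not part of the exercise statement.
tau_from_pve <- function(pve, sigma2, beta1) {
  stopifnot(
    is.numeric(pve), length(pve) == 1, pve >= 0, pve < 1,
    is.numeric(sigma2), length(sigma2) == 1, sigma2 > 0,
    is.numeric(beta1), length(beta1) == 1, beta1 != 0
  )
  pve * sigma2 / (beta1^2 * (1 - pve))
}

# Hypothetical generator for x; `type` and its labels are illustrative only.
simx <- function(n, type = c("normal", "binary"), tau2 = 1, a = 0, b = 1) {
  type <- match.arg(type)
  if (type == "normal") {
    # rnorm() takes a standard deviation, so convert the variance tau^2.
    stats::rnorm(n = n, mean = 0, sd = sqrt(tau2))
  } else {
    # Draw each value from {a, b} with equal probability.
    sample(x = c(a, b), size = n, replace = TRUE)
  }
}

# Usage: choose tau^2 so that x explains half of the variance of y.
tau2 <- tau_from_pve(pve = 0.5, sigma2 = 2, beta1 = 1)  # 0.5 * 2 / (1 * 0.5) = 2
x <- simx(n = 100, type = "normal", tau2 = tau2)
```

Taking the variance rather than the standard deviation as the input keeps simx() consistent with the \(\tau^2\) notation above; if you prefer to pass \(\tau\) directly, drop the sqrt(). Either way, add testthat checks for the new functions before rerunning the check.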