R Packages

Author

David Gerard

Published

June 3, 2025

Learning Objectives

  • Structure of an R package.
  • Documenting R packages.
  • Workflow for building R packages.
  • Required: Chapters 1–10 and 13–14 from R Packages.
  • Resource: Writing R Extensions

Prereqs

  • Make sure you have the following packages installed.

    pkgvec <- c("usethis", "devtools", "roxygen2", "testthat", "knitr", "covr")
    for (pkg in pkgvec) {
      if (!requireNamespace(pkg, quietly = TRUE)) 
        install.packages(pkg)  
    }
  • The {usethis} and {devtools} packages automate many of the tedious tasks of package development, allowing you to focus on writing R code. These are the packages we will mostly use.

Motivation

  • Why build an R package?
  1. Share your code/methods with others.

  2. Re-use functions for yourself.

Package States

  • The same R package is in a different format/state at different points of development.

    • Source -> bundled -> binary -> installed -> in-memory.
  • Source package: A directory of files (R scripts, documentation files, test scripts, etc) with a specific structure. This lecture is about developing source packages.

  • Bundled package: A source package that has been compressed into a single file (along with a few other operations). These usually end in “.tar.gz”. We use the following to create a bundled package from a source package:

    devtools::build()

    You typically only do this when you are about to submit to CRAN.

  • Binary package: A ready-to-install version for folks who do not have R development tools. You typically don’t need to worry about this. If you submit to CRAN, then they will create binaries for you.

  • Installed package: Installing a package decompresses/places your package in the library directory. This makes it so that you can use library() to load a package.

    • Terminology: A package is a collection of functions, along with documentation, in a specific format. A library is a directory (folder) on your computer that contains installed packages.

    • Confusingly, you use the library() function to load a package.

    • You can see/control your active libraries with

      .libPaths()
      [1] "/Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library"
    • For example, these are some of the packages in /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library

      head(list.files(path = .libPaths()[[1]]))
      [1] "abind"             "adbcdrivermanager" "airway"           
      [4] "annotate"          "AnnotationDbi"     "AnnotationHub"    
  • Ways to install a package:

    • From CRAN: install.packages().
    • From Bioconductor: BiocManager::install().
    • From source package: devtools::install().
    • From GitHub: devtools::install_github().
  • In-memory package: makes functions in a package available for use.

    • Use library() to place an installed package in memory.
    • Use devtools::load_all() to place a source package in memory. You typically do this during your workflow when you are building your package.

Package Structure

  • A typical package will have this directory/file structure

    .
    ├── DESCRIPTION
    ├── .git
    ├── .gitignore
    ├── LICENSE
    ├── LICENSE.md
    ├── man
    │   ├── f1.Rd
    │   └── f2.Rd
    ├── NAMESPACE
    ├── R
    │   └── rcode.R
    ├── .Rbuildignore
    ├── README.md
    ├── README.Rmd
    ├── src
    │   └── cppcode.cpp
    └── tests
        ├── testthat
        │   └── test-file.R
        └── testthat.R
  • Most of these files/folders will be generated by {devtools} and {usethis}, but you should still know what they are.

  • .git is a hidden directory that git uses to store your version control history. Don’t touch this.

  • .gitignore is a hidden file used to tell git what files/folders to not place under version control. See the Pro Git Book.

  • LICENSE and LICENSE.md contain the license that your code is distributed under. Typical open-source licenses are MIT and GPL-3.

  • The man (for “manual”) folder contains files that hold your package’s documentation. E.g. whenever you use help() it uses information from a file in the man folder. This package has two documented functions, f1() and f2().

  • NAMESPACE is a file that determines

    1. What functions are available to the user of your package (versus what functions are for internal use only), and
    2. What functions from other packages you are importing.
  • The R folder contains R script files (ending in “.R”) that hold all of your R code.

    • R code only goes in R scripts (ending in “.R”), not R Markdown Files (ending in “.Rmd”).
  • .Rbuildignore is a hidden file which tells R which files/folders to exclude from the package bundle. You use regular expressions to determine which files to ignore.

    • E.g. if you have a website in your package folder, then you can exclude it from the bundle by placing that folder’s name in .Rbuildignore.
    • You typically just use usethis::use_build_ignore() to add files/folders to .Rbuildignore.
  • README.md is the file that other developers typically first look at, and it is the front page of your package’s GitHub website. README.Rmd is an R markdown file that generates README.md.

  • src is a folder that contains source files for compiled code, such as C++ files (ending in “.cpp”).

  • tests is a folder that contains R code for unit-tests, which are automatic checks that you write to determine if your R package works as you intend.
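
  • To make the layout concrete, here is a minimal sketch that builds a bare-bones source package by hand in a temporary directory using only base R. The package name minipkg and the function hello() are hypothetical, and the NAMESPACE file is written by hand only for illustration. In practice, {usethis} and {roxygen2} generate these files for you.

```r
# Build a bare-bones source package by hand in a temporary directory.
# "minipkg" and hello() are hypothetical names used only for illustration.
pkg_dir <- file.path(tempdir(), "minipkg")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)

# DESCRIPTION: package metadata, one "Field: value" pair per line.
writeLines(
  c(
    "Package: minipkg",
    "Title: A Minimal Example Package",
    "Version: 0.0.1",
    "Description: Shows the smallest useful source package layout.",
    "License: GPL-3"
  ),
  file.path(pkg_dir, "DESCRIPTION")
)

# NAMESPACE: normally generated by {roxygen2}; hand-written only in this sketch.
writeLines("export(hello)", file.path(pkg_dir, "NAMESPACE"))

# R/: function definitions only.
writeLines(
  "hello <- function() \"Hello from minipkg\"",
  file.path(pkg_dir, "R", "hello.R")
)

# List the files that now make up the source package.
pkg_files <- list.files(pkg_dir, recursive = TRUE)
pkg_files
```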

Create a package skeleton

  • You can create a package skeleton with usethis::create_package().

  • Before running this, change your working directory to where you want to create your R package with “Session > Set Working Directory > Choose Directory…”.

  • This will be the “source” state of the package, so you can choose it to be almost anywhere on your computer.

  • Choose a location that is not inside an RStudio project, another R package, another git repo, or inside an R library.

  • Then just type

    usethis::create_package(path = ".")
  • I don’t like RStudio projects, so I typically run

    usethis::create_package(path = ".", rstudio = FALSE, open = FALSE)

    You can use RStudio Projects if you want. But I won’t help with any issues you have with RStudio projects.

  • Example: For this lecture, we will create a simple R package called forloop that reproduces some Base R functions using for-loops. Create a folder called “forloop”, set the working directory to this folder, and run

    usethis::create_package(path = ".", rstudio = FALSE, open = FALSE)

The R folder

  • Here, we will discuss how programming in a package is a little different from working in an R script or an R Markdown file in interactive mode.

  • All R code in a package should be a function definition (with very few exceptions).

    fname <- function(arg1 = val1, arg2 = val2, ...) {
      ## code here
      return(retval)
    }
  • Don’t have R code outside of a function definition in your package until you really understand the benefits of exceptions to this rule.

  • All R code should go in R scripts (ending in “.R”), not R Markdown files (ending in “.Rmd”).

  • Use informative file names. Put only related R functions into the same file (e.g. a main function and some helpers).

  • As you add or modify function definitions, you should test them interactively. That is, iteratively:

    1. Use devtools::load_all() to load a source package into memory.
    2. Play with the function you are working on, and edit it.
    3. Repeat 1 and 2 until you are happy with the function.
  • In a typical R script (outside of an R package), code is run when you run it. In an R package, code is run when the package is built. So, for example, if you include the following line of code in your package.

    x <- Sys.time()
    x
    [1] "2025-06-03 10:28:59 EDT"

    Then x will be the time of the package build. If you want the time that a user runs some code, include this statement in a function.

    ftime <- function() {
      return(Sys.time())
    }
    ftime()
    [1] "2025-06-03 10:28:59 EDT"
  • When you alias a function from another package, don’t do

    foo <- pkg::bar

    instead, do

    foo <- function(...) pkg::bar(...)

    This is because foo is defined as pkg::bar at the build time of your package. So if the {pkg} maintainers fix an issue in bar(), your aliased function will still be the incorrect version of bar() until a user rebuilds your package.
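
  • The difference between the two aliasing styles can be imitated in a plain R session. In this sketch, bar() is a hypothetical stand-in for pkg::bar, and redefining it stands in for an upstream fix. It is only an analogy for what happens at package build time, not a reproduction of it.

```r
# bar() is a hypothetical stand-in for a function from another package.
bar <- function() "old behavior"

# Style 1: copy the function object right now.
foo_alias <- bar

# Style 2: look bar() up every time foo_wrapper() is called.
foo_wrapper <- function(...) bar(...)

# Simulate the other package shipping a fix.
bar <- function() "fixed behavior"

alias_result <- foo_alias()      # still uses the copy made earlier
wrapper_result <- foo_wrapper()  # picks up the new definition
```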

  • Don’t modify a user’s R landscape (the global settings and the behavior of functions/objects outside of your package). With rare exceptions, here are some things to not do:

    • Never use setwd().
    • Never use library() or require().
      • See below for using other packages in your package.
    • Never use source().
      • Use devtools::load_all() while developing a package (but never have devtools::load_all() in your package).
    • Never change global options via options() or graphical parameters via par().
    • Never use set.seed() to alter the random number generation for a user.
      • Except possibly in examples, vignettes, and unit tests. But never in anything in the /R folder.
    • Never use Sys.setenv() or Sys.setlocale().
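  • In the rare case where a function really does need to change a global setting, a common pattern is to restore the old value with on.exit() so the user's session is left as it was found. A minimal sketch, where format_short() is a hypothetical function name:

```r
# format_short() is a hypothetical example function.
format_short <- function(x) {
  old <- options(digits = 3)         # options() returns the previous values
  on.exit(options(old), add = TRUE)  # restore them however the function exits
  format(x)
}

digits_before <- getOption("digits")
out <- format_short(pi)
digits_after <- getOption("digits")  # same as before the call
```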
  • Example: Let’s work together to build a function called col_means() that will take as input a data frame and return a vector of column means. We will not use the colMeans() function.
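
  • One possible version of col_means() is sketched below. It is only one way to write it, assuming every column is numeric and contains no missing values. The version built in lecture may differ.

```r
# A for-loop version of column means (assumes all columns are numeric).
col_means <- function(df) {
  stopifnot(is.data.frame(df))
  out <- numeric(ncol(df))
  names(out) <- names(df)
  for (j in seq_along(df)) {
    total <- 0
    for (i in seq_len(nrow(df))) {
      total <- total + df[[j]][[i]]
    }
    out[[j]] <- total / nrow(df)
  }
  out
}

means <- col_means(data.frame(a = 1:4, b = c(2, 4, 6, 8)))
means
```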

  • Exercise: Create an R script file in your package called “sum.R” via

    usethis::use_r(name = "sum")

    In this file, create a function called sum2() that takes as input a numeric vector x and returns its sum. Use a for-loop to calculate the sum (not the sum() function).

  • Exercise: Include an na.rm argument that defaults to FALSE. It removes NA’s if TRUE and does not if FALSE.

  • Exercise: Create a function called count_na() that will use a for-loop to count how many NA’s there are in a vector.

  • Exercise: There are a couple edge cases you should worry about. If the length of x is 0, then you should return NA_real_. If all values of x are NA, then you should return NA_real_ (use count_na() to check for this). Edit your function to make these changes now. Test it out on

    sum2(c(NA, NA, 1), na.rm = TRUE) ## should be 1
    sum2(c(), na.rm = TRUE) ## should be NA
    sum2(c(NA, NA, NA), na.rm = TRUE) ## should be NA

Documentation

  • Documentation describes:

    1. What a function does.
    2. What the inputs of the function are.
    3. What the outputs of the function are.
    4. Example usage of the function.
  • Documentation is vital for

    1. Maintaining packages (you will forget what your functions do)
    2. Having other folks use your package (they need a way to learn the functions).
  • You should be writing documentation while you are writing R code

    • Not only after the code is “done”.
  • Documentation in an R package is in “.Rd” files in the “man” folder. This is rather esoteric, so we’ll use {roxygen2} to generate them automatically.

  • {roxygen2} documentation is provided by comments above a function, where each line begins with #'.

    #'
    #' Documentation goes here
    #'
    fn <- function() {
      ## Function code here.
    }
  • After you write some documentation, you can run

    devtools::document()

    and {roxygen2} will automatically update your documentation.

  • You can then look at your documentation by using ? or help().

  • {roxygen2} comments are formatted as tag-value pairs, where tags begin with an at sign @.

  • Values of a tag extend from the tag to the next tag.

  • A typical {roxygen2} documentation looks like this

    #' @title One line description of what the function does.
    #' 
    #' @description One paragraph description of what the function does
    #'
    #' @details
    #' Long documentation, detailing exactly what the function does
    #' 
    #' @param arg1 What is arg1?
    #' @param arg2 What is arg2?
    #' 
    #' @return What is returned?
    #' 
    #' @author Your name
    #' 
    #' @examples
    #' ## Some example code goes here
    fn <- function(arg1, arg2) {
      ## Function code here
    }
  • @param: Each argument should be documented. You should state

    1. What is the format of the argument (character vector? data frame? numeric matrix?)
    2. What effect the argument has on the function’s behavior.
  • @examples: Include a few lines of example R code. Do not use @example, as this expects the path to a file that contains example code rather than the code itself.

  • @return: What does your function return (numeric vector, character matrix, etc.)? Describe not just its type but what it is (posterior probabilities, summation, geometric means, etc.).

  • Use @inheritParams to use the parameter documentation from a function in a different function.

  • The following will use fn1()’s documentation for a in fn2().

    #' @param a An argument
    #'
    fn1 <- function(a) {
    }
    
    #' @inheritParams fn1
    #' @param b Another argument
    #' 
    fn2 <- function(a, b) {
    }
  • If you want to document your function, but do not want {roxygen2} to create a man file for it, then add the @noRd tag.

  • Documenting is very good, but having a manual page means that you need to maintain it for other users, so I usually document all of my functions but only have man pages for my exported functions.

Formatting

  • Within {roxygen2} blocks, you can format your documentation with LaTeX-style code:

  • \emph{italics}: italics.

  • \strong{bold}: bold.

  • \code{fn()}: formatted in-line code

  • \link{fn}: Link to the documentation of fn(). Do not include parentheses inside \link{}.

  • I often do \code{\link{fn}()} when I link to a function so that it is linked, code-formatted, and followed by parentheses.

  • To link to functions in other packages, use \link[pkg]{fn}

  • \url{}: Include a URL.

  • \href{www}{word}: Provide a hyperlink.

  • \email{}: Provide an email.

  • \doi{}: Provide the DOI of a paper (with a link to that paper).

  • You can make an itemized list with

    #' \itemize{
    #'   \item Item 1
    #'   \item Item 2
    #' }
  • You can make an enumerated list with

    #' \enumerate{
    #'   \item Item 1
    #'   \item Item 2
    #' }
  • You can make a named list with

    #' \describe{
    #'   \item{Name 1}{Item 1}
    #'   \item{Name 2}{Item 2}
    #' }
  • Example: Let’s work together to document our col_means() function.
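
  • As a sketch of where this could end up, here is a col_means() with a {roxygen2} block above it. The function body is one hypothetical implementation (assuming numeric columns without missing values), and the wording of each tag is only a suggestion.

```r
#' @title Column means of a data frame
#'
#' @description Computes the mean of every column with explicit for-loops
#'   instead of \code{\link[base]{colMeans}()}.
#'
#' @details Every column is assumed to be numeric and free of missing values.
#'
#' @param df A data frame whose columns are all numeric.
#'
#' @return A named numeric vector with one mean per column of \code{df}.
#'
#' @author Your name
#'
#' @examples
#' col_means(data.frame(a = 1:4, b = c(2, 4, 6, 8)))
col_means <- function(df) {
  stopifnot(is.data.frame(df))
  out <- numeric(ncol(df))
  names(out) <- names(df)
  for (j in seq_along(df)) {
    total <- 0
    for (value in df[[j]]) {
      total <- total + value
    }
    out[[j]] <- total / nrow(df)
  }
  out
}

documented_means <- col_means(data.frame(a = 1:4, b = c(2, 4, 6, 8)))
```

    The roxygen lines are ordinary R comments, so the file still runs as plain R code. They only become help pages after devtools::document() is run inside a package.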

  • Exercise: Document sum2() and count_na(). Make sure to include the following tags @title, @description, @details, @param, @return, @author, and @examples.

  • Exercise: In the @seealso tag, provide a link to each function. Also provide a link to base::sum(). Use an itemized list.

Namespace

  • A namespace tells R what functions come from what packages.

  • Each package has a namespace. You use :: to tell R which package to use a function from. Otherwise, it would not know how to distinguish between, e.g.

    dplyr::lag()
    stats::lag()
  • Your package namespace will determine

    1. From what packages are you importing which functions?
    2. What functions are you exporting to users?

Exporting

  • Include the following tag in the documentation of any function that you want a user to have access to.

    @export
  • This will add the function to the “NAMESPACE” file (which you should not edit by hand).

  • Note devtools::load_all() will attach all functions from your package so you can test them out. But if a user installs your package and uses library(), they will only have access to exported functions.

  • You should only export functions you want others to use.

  • Exporting a function means that you have to be very wary about changing it in future versions since that might break other folks’ code.

  • Exercise: Look at the “NAMESPACE” file in {forloop}. Now export all of your functions in {forloop}. Look at the “NAMESPACE” file again.
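
  • The difference between exported and internal functions can be inspected in any installed package. The sketch below uses {stats}, which ships with R. It assumes that Pillai() is one of its internal (non-exported) helpers, which is the case in the R versions I am aware of but is not guaranteed.

```r
# Names that {stats} makes available to users.
exports <- getNamespaceExports("stats")

# The namespace environment holds exported and internal objects alike.
ns <- asNamespace("stats")

sd_exported <- "sd" %in% exports                                 # a user-facing function
pillai_exists <- exists("Pillai", envir = ns, inherits = FALSE)  # defined in the namespace
pillai_exported <- "Pillai" %in% exports                         # but not offered to users
```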

Importing

  • Never use library() or require() in an R package.

  • Package dependencies go in the DESCRIPTION file. You can tell R that your package depends on another package by running:

    usethis::use_package()
  • This will make it so that the package is available when your package is installed.

  • Then, you call functions from other packages via package::function(), where package is the name of the package and function() is the function name from package.

  • You can suggest (but not require) a package to be installed. This is usually done if the functions from the suggested package are not vital, or the suggested package is hard to install (or has a lot of dependencies itself). To do so, also use usethis::use_package() with type = "Suggests".

  • If you suggest a package, whenever you use a function from that package you should first check to see if it is installed with requireNamespace("pkg", quietly = TRUE). E.g.

    if (requireNamespace("pkg", quietly = TRUE)) {
      pkg::fun()
    }
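  • A slightly fuller sketch of the same check, with a base R fallback when the suggested package is missing. Here detect_pattern() is a hypothetical function and {stringr} plays the role of the suggested package. The two branches use different regular expression engines, so they are only interchangeable for simple patterns.

```r
# detect_pattern() is a hypothetical function in a package that Suggests {stringr}.
detect_pattern <- function(x, pattern) {
  if (requireNamespace("stringr", quietly = TRUE)) {
    # Use the suggested package when it is installed.
    stringr::str_detect(x, pattern)
  } else {
    # Fall back to base R otherwise.
    grepl(pattern, x)
  }
}

found <- detect_pattern(c("apple", "pear"), "pp")
```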

Technical notes on importing

  • Importing functions from a package is different than including a package in the “Imports” field of the DESCRIPTION file.

    • Importing a function makes it available inside your package so that you do not need to use :: in your package code.
    • Including a package in the Imports field makes sure the package is installed.
  • The importing part of a namespace is important to avoid ambiguity.

  • E.g. many packages use c(). We can (rather foolishly) redefine it via

    ## Don't run this
    c <- function(x) {
      return(NA)
    }

    and no package will be affected because they all import c() from the {base} R package.

  • Search Path: The list of packages that R will automatically search for functions. See your search path with search().

    search()
    [1] ".GlobalEnv"        "package:stats"     "package:graphics" 
    [4] "package:grDevices" "package:utils"     "package:datasets" 
    [7] "package:methods"   "Autoloads"         "package:base"     
  • Loading a package: Put a package in memory, but do not put it in the search path. You can then use the function by using :: as in dplyr::mutate(). If you call a function via :: then it will load that package automatically.

  • Attaching a package: Both load a package and place it in the search path, so you don’t need to use :: to call a function from that package. This is what library() does.

  • If you import a package (via the DESCRIPTION file), then R loads it but does not attach it, so you need to use ::.

  • If you import a function (via the namespace), then it becomes available to the code inside your package, so you do not need to use :: there.

  • Generally, I do not recommend importing functions, but you can do it by using @importFrom anywhere in your package.

  • E.g. adding this line anywhere in your package will make mutate() from {dplyr} available to all of the code in your package without ::.

    #' @importFrom dplyr mutate
  • Why do I recommend rarely using @importFrom? Because it hides where a function comes from, which makes your package code harder to read and debug.

    • E.g. if you import dplyr::lag(), then every call to lag() inside your package uses the {dplyr} version and not the {stats} version, which is easy to forget and can silently change results.
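
  • The distinction between loading and attaching can be checked directly in an R session. This sketch uses {tools}, which ships with R and is usually not attached by default. Whether it is attached in your session is an assumption, so no particular value is claimed for that check.

```r
pkg <- "tools"  # ships with R

# Load the namespace without attaching it to the search path.
loadNamespace(pkg)

loaded <- isNamespaceLoaded(pkg)                   # the package is in memory
attached <- paste0("package:", pkg) %in% search()  # is it on the search path?

# A loaded package can always be used through ::.
ext <- tools::file_ext("report.Rmd")
```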

Practical suggestions

  • You should try to have as few dependencies as possible. When packages change, that can affect (or break) your package, which means more work on your part to maintain your package.

  • Try to avoid dependencies on new packages or on packages from folks who do not have a history of maintaining their packages.

  • Try to avoid dependencies on packages from the tidyverse (dplyr, tidyr, etc). These packages have changed frequently in the past. The maintainers are great about notifying folks about breaking changes, but it still means more work on your part.

    • E.g., if you only use string manipulation in one spot in your package, try using grepl() instead of stringr::str_detect().
  • Here is a list of nice essays on limiting dependencies: https://www.tinyverse.org/

  • If you do import functions from other packages, put all of those {roxygen2} tags in the same location in one file.

  • Example: Together, let’s modify our col_means() function to one called col_stats() that also allows for calculating the standard deviation. However, sd() comes from the {stats} package, and so we need to make sure to tell R which package it is from.

  • Exercise: Instead of using count_na(), you decide to use the n_na() function from the {na.tools} package. Make these edits to your package now.

  • Exercise: Now import n_na() and remove your use of :: from the previous exercise.

DESCRIPTION file

  • The file called “DESCRIPTION” (with no filename extension) contains “meta” information about your package (like the authors, the dependencies, the license, etc.).

  • usethis::create_package() gives you a template for DESCRIPTION which you can fill in. Most of the options are self-explanatory.

    Package: mypackage
    Title: What The Package Does (one line, title case required)
    Version: 0.1
    Authors@R: person("First", "Last", email = "first.last@example.com",
                      role = c("aut", "cre"))
    Description: What the package does (one paragraph)
    Depends: R (>= 3.1.0)
    License: What license is it under?
    LazyData: true
  • Each line consists of a field and a value, separated by a colon. For example, above the “Package” field has the value “mypackage”.

  • “Package”: Choose a package name that is

    1. Only letters,
    2. All lower-case,
    3. Relatively short,
    4. Unique (e.g. don’t choose “google” as your package name).
  • “Title”: In fewer than 65 characters, describe what your package does. It must be in title case and cannot end in a period.

  • “Description”: A paragraph describing what the package does. Each line should be less than 80 characters long, and each continuation line should be indented by 4 spaces.

  • “Authors@R:” This field contains executable R code. See the help file for person().

    • The first two arguments (given and family) are the first and last names of the person.

    • The email should be the email of the individual.

    • The role should be a vector containing possible values of "aut" for author, "cre" for creator/maintainer (only one person should be "cre"), and "ctb" for contributor (only providing minor edits). There are other roles that are possible.

    • The comment argument is a named character vector with additional notes. The most common value of comment is your ORCID number with comment = c(ORCID = "number here").

    • If you have more than one author, put the person() calls in a vector.

      Authors@R: c(person("Jane", "Doe", email = "janedoe@american.edu", role = c("aut", "cre")),
                   person("John", "Doe", email = "johndoe@american.edu", role = c("aut")))
  • “License”: What license should you distribute this package under? Don’t edit this by hand. You should use either (in decreasing order of restrictiveness)

    • Proprietary license: No one can use your package (CRAN won’t accept this).

      usethis::use_proprietary_license()
    • GPL-3 license: If other folks make derivatives of your package, they have to also place it as open-source under a GPL-3 license:

      usethis::use_gpl3_license()
    • MIT license: Folks can use your stuff as long as they distribute the license with your code.

      usethis::use_mit_license()
    • CC0 license: You place your stuff in the public domain, and anyone can use it for any reason.

      usethis::use_cc0_license()
  • “Version”: At least two integers separated by dots (.) or dashes (-) like 1.0.2 or 1.0-11. You should usually only have three integers (at most).

    • I would recommend just sticking to exactly three integers, separated by dots.
    • The first integer is the “major release”.
    • The second integer is the “minor release”.
    • The third integer is the “bug fixes”.
    • You change a version number when you feel that you have a completed version of the package.
    • Only increment version numbers up (that is, if you are on version 1.0.1, don’t change it to 0.9.0).
  • “Imports”: What packages does your package depend on? If you run usethis::use_package(), then {usethis} will edit the DESCRIPTION file for you. But typically the imports field looks something like this

    Imports:
      pkg1,
      pkg2
  • “Suggests”: What packages do you suggest installing to use your package (but are not required)? Again, use usethis::use_package(), but this time with the type = "Suggests" argument. It will look like this in the DESCRIPTION file:

    Suggests:
      pkg1,
      pkg2
  • Exercise: Edit the “Package”, “Title”, “Authors@R” and “Description” sections of your “DESCRIPTION” file.

  • Exercise: Use a GPL-3 license.
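
  • To see how R itself reads these fields, the sketch below writes a small DESCRIPTION to a temporary file, parses it with read.dcf(), evaluates the Authors@R code, and compares version numbers with package_version(). All field values are made up for illustration.

```r
# A made-up DESCRIPTION; continuation lines are indented by 4 spaces.
desc_lines <- c(
  "Package: forloop",
  "Title: Base R Functions Rewritten with For-Loops",
  "Version: 0.1.0",
  "Authors@R: person(\"Jane\", \"Doe\", email = \"jane.doe@example.com\",",
  "    role = c(\"aut\", \"cre\"))",
  "Description: Reimplements a few base R summaries with explicit for-loops.",
  "License: GPL-3"
)
desc_file <- tempfile()
writeLines(desc_lines, desc_file)

# read.dcf() returns a character matrix with one column per field.
desc <- read.dcf(desc_file)
pkg_name <- unname(desc[1, "Package"])

# Authors@R holds R code, so it can be parsed and evaluated.
author <- eval(parse(text = desc[1, "Authors@R"]))
author_family <- author$family
is_maintainer <- "cre" %in% author$role

# Version numbers compare field by field, not as strings.
version <- package_version(unname(desc[1, "Version"]))
newer_than_009 <- version > package_version("0.0.9")
dash_equals_dot <- package_version("1.0-11") == package_version("1.0.11")
```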

Workflow

  • Keep your working directory at the top level of your R package at all times.

  • Iterate the following until done:

    1. Tweak a function
    2. Update documentation (if necessary)
    3. Run devtools::document() (if you’ve made any changes that impact help files or NAMESPACE)
      • Or run devtools::load_all() if you haven’t made those changes.
    4. Run some examples interactively.
    5. Run devtools::test()
    6. Run devtools::check()
  • load_all() is how we can load a source package into memory. This will load all functions (both exported and non-exported).

Including Datasets

External data

  • External data is available to the user. For example, the mpg dataset from the {ggplot2} package is available to us by running

    data("mpg", package = "ggplot2")
    str(mpg)
    tibble [234 × 11] (S3: tbl_df/tbl/data.frame)
     $ manufacturer: chr [1:234] "audi" "audi" "audi" "audi" ...
     $ model       : chr [1:234] "a4" "a4" "a4" "a4" ...
     $ displ       : num [1:234] 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
     $ year        : int [1:234] 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
     $ cyl         : int [1:234] 4 4 4 4 6 6 6 4 4 4 ...
     $ trans       : chr [1:234] "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
     $ drv         : chr [1:234] "f" "f" "f" "f" ...
     $ cty         : int [1:234] 18 21 20 21 16 18 18 18 16 20 ...
     $ hwy         : int [1:234] 29 29 31 30 26 26 27 26 25 28 ...
     $ fl          : chr [1:234] "p" "p" "p" "p" ...
     $ class       : chr [1:234] "compact" "compact" "compact" "compact" ...
  • To include data in a package, simply add it, in the format of an RData file, in a directory called “data”.

  • You can use usethis::use_data() to save a dataset in the “data” directory. The first argument is the data you want to save.

  • You should document your dataset, using roxygen, in a separate R file in the “R” directory. Typically, folks document all of their data in “R/data.R”.

  • Instead of a function declaration, you just type the name of the dataset. E.g., from my {updog} R package I have the following documentation for the snpdat tibble.

    #' @title GBS data from Shirasawa et al (2017)
    #'
    #' @description Contains counts of reference alleles and total read counts 
    #' from the GBS data of Shirasawa et al (2017) for
    #' the three SNPs used as examples in Gerard et. al. (2018).
    #'
    #' @format A \code{tibble} with 419 rows and 4 columns:
    #' \describe{
    #'     \item{id}{The identification label of the individuals.}
    #'     \item{snp}{The SNP label.}
    #'     \item{counts}{The number of read-counts that support the reference allele.}
    #'     \item{size}{The total number of read-counts at a given SNP.}
    #' }
    #'
    #' @source \doi{10.1038/srep44207}
    #'
    #' @references
    #' \itemize{
    #'   \item{Shirasawa, Kenta, Masaru Tanaka, Yasuhiro Takahata, Daifu Ma, Qinghe Cao, Qingchang Liu, Hong Zhai, Sang-Soo Kwak, Jae Cheol Jeong, Ung-Han Yoon, Hyeong-Un Lee, Hideki Hirakawa, and Sahiko Isobe "A high-density SNP genetic map consisting of a complete set of homologous groups in autohexaploid sweetpotato (Ipomoea batatas)." \emph{Scientific Reports 7} (2017). \doi{10.1038/srep44207}}
    #'   \item{Gerard, D., Ferrão, L. F. V., Garcia, A. A. F., & Stephens, M. (2018). Genotyping Polyploids from Messy Sequencing Data. \emph{Genetics}, 210(3), 789-807. \doi{10.1534/genetics.118.301468}.}
    #' }
    #'
    "snpdat"
  • Never export a dataset (via the @export tag).

  • The @format tag is useful for describing how the data are structured.

  • The @source tag is useful to describe the URL/papers/collection process for the data.
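
  • Under the hood, a file in “data” is just a saved R object. The sketch below imitates this by hand with base R save() and load() in a temporary directory. The directory name datapkg and the dataset scores are hypothetical, and this is only an approximation of what usethis::use_data() automates.

```r
# A hypothetical package directory with a data/ folder.
pkg_dir <- file.path(tempdir(), "datapkg")
dir.create(file.path(pkg_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# A made-up dataset to ship with the package.
scores <- data.frame(id = 1:3, score = c(90, 85, 77))

# Save it as data/scores.rda, one object per file, named after the object.
rda_file <- file.path(pkg_dir, "data", "scores.rda")
save(scores, file = rda_file)

# Load it back into a fresh environment to confirm what the file contains.
restored <- new.env()
loaded_names <- load(rda_file, envir = restored)
```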

Internal Data

  • To use pre-computed data, you place all internal data in the “R/sysdata.rda” file.

  • usethis::use_data() will do this automatically if you use the internal = TRUE argument.

  • E.g. the following will put x and y in “R/sysdata.rda”

    x <- c(1, 10, 100)
    y <- data.frame(hello = c("a", "b", "c"), goodbye = 1:3)
    usethis::use_data(x, y, internal = TRUE)
  • You can use internal data in a package as you normally would use an object that is loaded into memory.

  • Exercise: Create a function called fib() that takes as input n and returns the nth Fibonacci number. Recall that the sequence is \[ 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, \ldots \] where the next number is the sum of the previous two numbers. Put this function in a new R script file (called “fib.R”) and make sure your function is well documented.

  • Exercise: Save the first 1000 Fibonacci numbers as a vector for internal data. Then create a function called fib1000() that just looks up the nth Fibonacci number from this internal vector.

Documenting a Package

  • It is pretty standard to have a help page for your package called “<package>-package”. E.g., for our {forloop} package we would call it forloop-package. Then, if a user wanted to see more about the package, they could look at that help file via ?`forloop-package`.

  • This help file is also where I typically put function imports.

  • Here is my documentation for my {ldsep} package

    #' Linkage Disequilibrium Shrinkage Estimation for Polyploids
    #'
    #' Estimate haplotypic or composite pairwise linkage disequilibrium
    #' (LD) in polyploids, using either genotypes or genotype likelihoods. Support is
    #' provided to estimate the popular measures of LD: the LD coefficient D,
    #' the standardized LD coefficient D', and the Pearson correlation
    #' coefficient r. All estimates are returned with corresponding
    #' standard errors. These estimates and standard errors can then be used
    #' for shrinkage estimation.
    #'
    #' @section Functions:
    #'
    #' The main functions are:
    #' \describe{
    #'   \item{\code{\link{ldfast}()}}{Fast, moment-based, bias-corrected LD
    #'       LD estimates from marginal posterior distributions.}
    #'   \item{\code{\link{ldest}()}}{Estimates pairwise LD.}
    #'   \item{\code{\link{mldest}()}}{Iteratively apply \code{\link{ldest}()}
    #'       across many pairs of SNPs.}
    #'   \item{\code{\link{sldest}()}}{Iteratively apply \code{\link{ldest}()}
    #'       along a sliding window of fixed length.}
    #'   \item{\code{\link{plot.lddf}()}}{Plot method for the output of
    #'       \code{\link{mldest}()} and \code{\link{sldest}()}.}
    #'   \item{\code{\link{format_lddf}()}}{Format the output of
    #'       \code{\link{mldest}()} and \code{\link{sldest}()} into a matrix.}
    #'   \item{\code{\link{ldshrink}()}}{Shrink correlation estimates
    #'       using adaptive shrinkage (Stephens, 2017; Dey and Stephens, 2018).}
    #' }
    #'
    #' @section Citation:
    #' If you find the methods in this package useful, please run the following
    #' in R for citation information: \code{citation("ldsep")}
    #'
    #'
    #' @importFrom stats var
    #' @importFrom foreach %dopar%
    #' @useDynLib ldsep, .registration = TRUE
    #' @importFrom Rcpp sourceCpp
    #'
    #' @docType package
    #' @name ldsep-package
    #' @aliases ldsep
    #'
    #' @author David Gerard
    NULL
  • I use the @section tag to make custom sections for (i) the important functions and (ii) how to cite the package. This is not required, but it is a good standard.

    • The format for this tag is @section Name:. Make sure you end the name of the section with a colon.
  • @docType package: Should be included to show that this is not a function.

  • @name ldsep-package: Makes it so that if a user types ?`ldsep-package`, then they will reach this help file.

  • @aliases ldsep: Makes it so that if a user types ?ldsep, then they will reach this help file as well.

  • Just include NULL below the documentation so that {roxygen2} knows to make a help file for it.