Style Guide
Learning Objectives
Style Guides
Most organizations have a style guide that specifies how code should be formatted, and you should adhere to it.
When everyone on a project uses a consistent style, it makes code easier to read and understand, and it makes collaboration easier and faster.
There are lots of style guides (see links in the Learning Objectives). This document contains the style guide for our class.
This style guide is obviously opinionated, and others have their own thoughts (which is perfectly fine!). The important thing is consistency among collaborators.
We will mostly follow the tidyverse style guide. Below I emphasize some points and note where we differ.
I expect you to follow this style guide in all homework and assignments.
File Names
File and folder names should only have
- Letters
- Numbers
- Underscores (_)
- Possibly dashes (-), but these are discouraged.
In particular, never use spaces in a file name, and never use periods other than the one before the file extension.
Capital letters are discouraged. You should work almost entirely with lower-case letters.
Always begin a file name with a lower-case letter.
Exceptions to this are:
- Hidden files/folders, which begin with a period (.).
- Standard/required files, such as NAMESPACE, README.md, etc.
R scripts should end in .R (not .r). R Markdown files should end in .Rmd.
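The naming rules above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper is_valid_file_name() (not part of any package). It treats capital letters in the base name as disallowed, which is stricter than "discouraged", and it ignores the listed exceptions (hidden files and standard files such as NAMESPACE and README.md).

```r
# Hypothetical helper: TRUE for names that start with a lower-case letter,
# contain only lower-case letters, digits, underscores, or dashes, and have
# at most one period, which introduces the file extension.
is_valid_file_name <- function(file_name) {
  pattern <- "^[a-z][a-z0-9_-]*(\\.[A-Za-z0-9]+)?$"
  return(grepl(pattern, file_name, perl = TRUE))
}

file_names <- c(
  "clean_data.R",  # follows the rules
  "Clean Data.R",  # capital first letter and a space
  "clean.data.R",  # extra period
  "01_clean.R"     # starts with a digit
)
is_valid <- is_valid_file_name(file_names)
```

Only the first name passes. A check like this could be run over list.files() in a project folder.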
Syntax
Names
Only use lower-case snake_case.
Good:
red_apple
Bad:
Red_apple
red.apple
redApple
RedApple
Variables should be nouns and functions should be verbs
Never use single letters as variables/functions
Good:
num_sim <- 10
Bad:
simulate <- 10 ## verb
x <- 10 ## single letter
Exceptions: Some letters are standard, such as n for the sample size in rnorm(), runif(), etc.
Commas
- Always put a space after a comma, not before (like English).
Good:
mat[1, ]
Bad:
mat[1 ,]
mat[1 , ]
mat[1,]
Parentheses
- Don’t put a space in or around parentheses for functions.
Good:
mean(x)
Bad:
mean (x)
mean(x )
- Put spaces around parentheses for if statements, and for and while loops.
Good:
if (x) {
}
Bad:
if(x){
}
- Put a space only after () for function creations.
Good:
sim <- function(x) {
}
Bad:
sim <- function (x) {
}
sim <- function(x){
}
Curly Braces
- Whenever you use curly braces {}, the opening brace should be the last character on a line, and the closing brace should be the first character on a line.
Good:
if (condition) {
  dostuff()
}
Bad:
if (condition) {dostuff()
}
if (condition) {
  dostuff() }
if-else
else statements should be on the same line as a closing brace.
Good:
if (condition) {
} else if (condition2) {
} else {
}
Only use ifelse() where vectorization is important. If condition should be length 1, then use full if-else statements.
In an if statement, use || or &&, not | or &, since the latter two are vectorized and return one result per element.
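To illustrate the distinction, here is a small sketch with made-up values. ifelse() and & work element by element on whole vectors, while if with && expects a single TRUE or FALSE.

```r
# Vectorized: one result per element of `scores`.
scores <- c(45, 72, 90)
grades <- ifelse(scores >= 60, "pass", "fail")

# Scalar: the condition has length 1, so use a full if-else with `&&`.
num_sim <- 10
is_verbose <- TRUE
if (num_sim > 0 && is_verbose) {
  status <- "running"
} else {
  status <- "skipped"
}

# `&` compares element by element and returns a vector, which is useful for
# subsetting but is not what an if statement needs.
is_mid_range <- scores >= 60 & scores < 80
```

Here grades has one entry per score, status is a single string, and is_mid_range is a logical vector of length 3.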
Infix Characters
An infix operator is a function whose name is written between its two arguments. The alternative is prefix notation, where the function name comes first. Compare
5 + 10 ## infix notation
[1] 15
`+`(5, 10) ## prefix notation
[1] 15
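User-defined infix operators are written between percent signs and are ordinary functions, so they can also be called in prefix form. A minimal sketch, where %+% is a hypothetical operator defined only for this illustration:

```r
# Hypothetical infix operator that pastes two strings together.
`%+%` <- function(lhs, rhs) {
  return(paste0(lhs, rhs))
}

infix_result <- "red" %+% "_apple"        # infix notation
prefix_result <- `%+%`("red", "_apple")   # prefix notation
is_same <- identical(infix_result, prefix_result)
```

Both calls run the same function, so the two results are identical.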
Put spaces around all infix characters ==, +, -, *, /, ^, |>, etc.
Good:
x + 10
Bad:
x +10
x+ 10
x+10
Exceptions: ::, :::, $, @, [, [[, unary -, unary +, :, and ?.
- E.g. do ggplot2::qplot() or -1, not ggplot2 :: qplot() or - 1
Code Length
No line should be longer than 80 characters.
To get a vertical line displaying the code length, in RStudio go to “Tools > Global Options… > Code > Display”. Make sure “Show margin” is checked with “80” in the text box.
If a function call/definition is too long, put each argument on its own line.
this <- is_a_very_long_function_call(
  that = "is",
  broken = "up",
  into = "many",
  indented = "lines",
  that = "are",
  easier = "to",
  read = NULL
)
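Line lengths can also be checked programmatically. The following is a minimal sketch that uses a made-up character vector in place of a real script. In practice the lines would come from readLines() on your own file.

```r
# Made-up lines of code: the second is deliberately far too long.
code_lines <- c(
  "num_sim <- 10",
  paste0("long_vector <- c(", strrep("1, ", 30), "1)")
)

# Count characters per line and report which lines exceed the limit.
line_widths <- nchar(code_lines)
too_long <- which(line_widths > 80)
```

Here too_long contains the index of the second line only.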
Other things
- Always use <- for assignment, not =.
- Always use " for strings, not '.
- Always use TRUE or FALSE, not T or F.
  - T and F are ordinary variables that default to TRUE and FALSE, so they may be overwritten by the user, which is scary. TRUE and FALSE are reserved words and cannot be reassigned.
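The following sketch shows why T is risky. It deliberately breaks the rule above for demonstration and then cleans up after itself.

```r
# T is an ordinary variable, so it can be reassigned (do not do this).
T <- 0
is_t_still_true <- isTRUE(T)

# TRUE is a reserved word: trying to assign to it is an error.
assign_result <- tryCatch(
  eval(parse(text = "TRUE <- 0")),
  error = function(err) {
    return("error")
  }
)

# Remove our copy so that T refers to the base definition again.
rm(T)
```

After T <- 0, any code that relies on T silently gets 0 instead of TRUE, whereas the attempt to overwrite TRUE fails immediately.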
Don’t include non-ASCII characters in your code.
ASCII characters are lower case letters (a through z), upper case letters (A through Z), digits (0 through 9), and common punctuation.
Including non-ASCII characters will give you a CRAN note.
Non-ASCII characters usually show up when you copy and paste from the web. E.g. the following look normal but are non-ASCII (and are all different):
- En Dash: “–”
- Em Dash: “—”
- Horizontal Bar: “―”
- En Quad: “ ”
- Em Quad: “ ”
- En Space: “ ”
- Em Space: “ ”
If you accidentally include such characters, you can find them with
tools::showNonASCIIfile()
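As an illustration, here is a small sketch that flags non-ASCII lines in a made-up character vector. It uses tools::showNonASCII(), the character-vector counterpart of the function above, together with a manual check based on code points. The en dash is written as the escape \u2013 so that the example itself stays ASCII.

```r
# Two lines of "code": the second contains an en dash instead of a minus.
code_lines <- c("total <- 10 - 1", "total <- 10 \u2013 1")

# Print the offending lines with the non-ASCII bytes made visible.
tools::showNonASCII(code_lines)

# Manual check: flag lines containing any code point above 127.
has_non_ascii <- vapply(
  code_lines,
  function(code_line) {
    return(any(utf8ToInt(code_line) > 127L))
  },
  logical(1),
  USE.NAMES = FALSE
)
bad_lines <- which(has_non_ascii)
```

Only the second line is flagged, even though both lines look almost the same on screen.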
Functions
Function Argument Length
If you have a lot of arguments, indent the arguments on new lines.
run_me <- function(this,
                   is,
                   a = "lot",
                   of = "arguments",
                   that = "are longer than 80 characters") {
}
Function Length
- You should break up your functions into discrete tasks.
- Reduces duplicating code, so less prone to bugs.
- Allows you to think more modularly about tasks, which makes code easier to reason about.
- Makes it easier to combine code in new ways.
- To force you to do this, keep every function under 50 lines. This is what Bioconductor does.
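As a sketch of what breaking a task into discrete functions can look like, the following hypothetical example splits a small simulation study into three short functions. The function names and the simulation itself are made up for illustration.

```r
# Draw one sample of size n and return its mean.
simulate_mean <- function(n) {
  return(mean(rnorm(n)))
}

# Repeat the simulation num_sim times.
simulate_means <- function(num_sim, n) {
  return(replicate(num_sim, simulate_mean(n)))
}

# Summarize a vector of simulated means.
summarize_means <- function(sim_means) {
  return(c(mean = mean(sim_means), sd = sd(sim_means)))
}

set.seed(1)
sim_means <- simulate_means(num_sim = 100, n = 30)
sim_summary <- summarize_means(sim_means)
```

Each piece can be tested on its own, and summarize_means() can be reused on output from a different simulation.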
Explicit returns
In R, the last value evaluated in a function will be implicitly returned. I think this is bad practice since it makes it harder to reason about what R is returning. So always include a return() statement.
Never do
add_two <- function(x, y) {
  x + y
}
Always do
add_two <- function(x, y) {
  return(x + y)
}
Importing
Never use the @import tag in a package to bring all of a package’s exported functions into the NAMESPACE. This creates too much risk for name collision.
In a package, never import functions; always prefix a function with the package it comes from. This makes it easier to reason about namespaces.
Never do
#' @importFrom ggplot2 qplot
plot_red <- function(x, y) {
  qplot(x, y, color = I("red"))
}
Always do
plot_red <- function(x, y) {
  ggplot2::qplot(x, y, color = I("red"))
}
Exceptions:
You will have to import infix functions (surrounded by percent signs), such as
#' @importFrom magrittr %>%
#' @importFrom foreach %dopar%
There is a small performance penalty for using :: (about 5 µs). So import a function if you are calling it about a million times, and each iteration takes on the order of 1 ns.
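If you want to see the size of this penalty on your own machine, a rough timing comparison is enough. The following is a sketch only: the choice of sum() and the iteration count are arbitrary, and timings vary between runs and machines, so no expected numbers are shown.

```r
values <- rnorm(100)
num_iter <- 1e5

# Time repeated calls that look the function up with `::` every time.
time_qualified <- system.time(
  for (iter in seq_len(num_iter)) {
    base::sum(values)
  }
)

# Time the same calls without the `::` lookup.
time_unqualified <- system.time(
  for (iter in seq_len(num_iter)) {
    sum(values)
  }
)

# Rough per-call overhead of `::`, in seconds.
overhead <- (time_qualified[["elapsed"]] - time_unqualified[["elapsed"]]) /
  num_iter
```

For more careful measurements a dedicated benchmarking package is a better tool than system.time().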
Order of Arguments
Always place arguments with defaults after arguments without defaults.
Good:
function(arg1, arg2, arg3 = NULL) { }
Bad:
function(arg1, arg3 = NULL, arg2) { }
lintr
The lintr package will check many coding issues. Try running the following in the top directory of your package.
lintr::lint_package()