Control flow, loops, and functions

Dr. Alexander Fisher

Duke University

January 18, 2023

Control flow

if statements

if (condition) {
  do stuff # when condition is TRUE
}

Examples

x = c(1,2,3)
if (2 %in% x) {
  print("2 is in x!")
}
[1] "2 is in x!"
if (-6) {
  print("Other types are coerced to logical if possible.")
}
[1] "Other types are coerced to logical if possible."
if (5 %in% x) 
  print("5 is in x!")

If the condition is FALSE, the code does not execute.

if is not vectorized

While many operators and functions in R are vectorized,

x = c(1,2,3)
exp(x)
[1]  2.718282  7.389056 20.085537
log(x)
[1] 0.0000000 0.6931472 1.0986123
x + 2
[1] 3 4 5

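if statements are not: the condition must be a single logical value. Since R 4.2.0, a condition of length greater than one is an error rather than the warning shown in this output.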

if (x == 1) {
  print("x is 1!")
}
Warning in if (x == 1) {: the condition has length > 1 and only the first
element will be used
[1] "x is 1!"

Collapsing logical vectors

x = c(1, 2, 3)
x > 0
[1] TRUE TRUE TRUE

any() and all() collapse a logical vector to a single logical value (like “or”/“and” logic)

any(x > 0)
[1] TRUE
all(x > 0)
[1] TRUE
if (any(x > 0)) {
  print("At least one element of x is greater than 0.")
}
[1] "At least one element of x is greater than 0."

else if, else and ifelse

x = 3
if (x < 0) {
  "x is negative"
} else if (x > 0) {
  "x is positive"
} else {
  "x is zero"
}
[1] "x is positive"
x = 0
if (x < 0) {
  "x is negative"
} else if (x > 0) {
  "x is positive"
} else {
  "x is zero"
}
[1] "x is zero"
x = -1
ifelse(x > 0,
       "positive",
       "not positive")
[1] "not positive"
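Unlike if, ifelse() is vectorized: it tests every element of its first argument and returns a vector of the same length. A minimal sketch, where the vector v is made up for illustration and does not appear in the slides:

```r
# v is a hypothetical example vector
v = c(-2, 0, 5)

# Each element is tested separately; the result has the same length as v
signs = ifelse(v > 0, "positive", "not positive")
print(signs)
# [1] "not positive" "not positive" "positive"
```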

Error checking

stop and stopifnot

We often need to validate user input and function arguments. If our validation fails, we want to report the error and stop execution.

ok = FALSE
if (!ok)
  stop("Things are not ok.")
Error in eval(expr, envir, enclos): Things are not ok.
stopifnot(ok)
Error: ok is not TRUE

Note

An error (like the one generated by stop) will prevent a Quarto document from rendering unless #| error: true is set for that code chunk.
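As a rough sketch of how these two tools might be combined to validate a function argument, consider the helper below. The name safe_sqrt is hypothetical, and tryCatch() is used only so that the error message can be inspected without halting the script.

```r
# safe_sqrt is a hypothetical helper used only for illustration
safe_sqrt = function(x) {
  stopifnot(is.numeric(x), length(x) == 1)  # check type and length first
  if (x < 0) {
    stop("x must be non-negative.")         # custom message for a bad value
  }
  sqrt(x)
}

safe_sqrt(16)
# [1] 4

# tryCatch() captures the error so the script keeps running
msg = tryCatch(safe_sqrt(-1), error = function(e) conditionMessage(e))
msg
# [1] "x must be non-negative."
```

stopifnot() is convenient for quick checks, while stop() lets you write a message aimed at the user.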

Placing checkpoints

Always place checkpoints upstream (find errors as quickly as possible).

Bad checkpoint placement

if (condition_one) {
  ##
  ## Do stuff
  ##
} else if (condition_two) {
  ##
  ## Do other stuff
  ##
} else if (condition_error) {
  stop("Condition error occurred")
}

Good checkpoint placement

# Do stuff better
if (condition_error) {
  stop("Condition error occurred")
}
if (condition_one) {
  ##
  ## Do stuff
  ##
} else if (condition_two) {
  ##
  ## Do other stuff
  ##
}
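One way a late checkpoint can go wrong is that an earlier branch matches first, so the error is never raised. The sketch below illustrates this with two hypothetical functions, process_bad and process_good; the thresholds are arbitrary.

```r
# process_bad and process_good are hypothetical functions for illustration
process_bad = function(x) {
  if (x < 10) {
    "small"                          # reached even when x is invalid
  } else if (x < 0) {
    stop("x must be non-negative.")  # never reached: x < 0 implies x < 10
  } else {
    "large"
  }
}

process_good = function(x) {
  if (x < 0) {
    stop("x must be non-negative.")  # checked before anything else
  }
  if (x < 10) "small" else "large"
}

process_bad(-4)   # silently accepts the invalid input
# [1] "small"

result = tryCatch(process_good(-4), error = function(e) "error caught")
result
# [1] "error caught"
```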

Exercise 1

Consider two vectors, x and y, each of length one. Write a set of conditionals that satisfy the following.

  • If x is positive and y is negative or y is positive and x is negative, print “knits”.
  • If x divided by y is positive, print “stink”.
  • Stop execution if x or y is zero.

Test your code with various x and y values. Where did you place the stop execution code?

Loops

Loop types

R supports three types of loops: for, while, and repeat.

for (item in vector) {
##
## Iterate this code
##
}
while (we_have_a_true_condition) {
##
## Iterate this code
##
}
repeat {
##
## Iterate this code
##
}

for loops

for loops are used to iterate over items in a vector. They have the following basic form:

for (item in vector) perform_action
for (nakama in c("Luffy", "Nami", "Zoro")) {
  print(nakama)
}
[1] "Luffy"
[1] "Nami"
[1] "Zoro"
for (i in 1:4) {
  log(i)
}

Automatic printing is turned off in loops.
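To see values computed inside a loop, either call print() explicitly or store the results and inspect them afterwards. A minimal sketch of both options (the vector name logs is chosen here for illustration):

```r
# Option 1: print explicitly inside the loop
for (i in 1:4) {
  print(log(i))
}

# Option 2: store the results and inspect them after the loop
logs = numeric(4)   # pre-allocate a vector of the right length
for (i in 1:4) {
  logs[i] = log(i)
}
logs
```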

while loops

while loops iterate until a condition is false

squares = rep(0, 5)
squares
[1] 0 0 0 0 0
i = 1
while (i < 6) {
  squares[i] = i^2
  i = i + 1
}
squares
[1]  1  4  9 16 25

repeat loops

repeat loops repeatedly iterate code until a break is reached.

i = 1
squares = rep(0, 5)
repeat {
  squares[i] = i^2
  i = i + 1
  if (i > 5) {break}
}
squares
[1]  1  4  9 16 25

loop keywords: next and break

  • next exits the current iteration and advances the looping index
  • break exits the loop
  • both break and next apply only to the innermost of nested loops.
for (i in 1:10) {
  if (i %% 2 == 0) {next}
  print(paste("Number", i, "is odd."))
  if (i %% 7 == 0) {break}
}
[1] "Number 1 is odd."
[1] "Number 3 is odd."
[1] "Number 5 is odd."
[1] "Number 7 is odd."
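The innermost-loop rule can be checked with a small nested loop. In this sketch (the variable visited is introduced only for illustration), break leaves the inner loop over j, but the outer loop over i keeps running:

```r
visited = character(0)   # records which (i, j) pairs were reached
for (i in 1:3) {
  for (j in 1:3) {
    if (j == 2) {break}  # leaves the inner loop only
    visited = c(visited, paste0("i=", i, " j=", j))
  }
}
visited
# [1] "i=1 j=1" "i=2 j=1" "i=3 j=1"
```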

Auxiliary loop functions

You may want to loop over indices of an object as opposed to the object’s values. To do this, consider using one of length(), seq(), seq_along(), and seq_len().

seq_along(x) is preferred to 1:length(x) because it handles empty objects correctly, e.g.

x = list()
length(x)
[1] 0
1:length(x)
[1] 1 0
seq_along(x)
integer(0)
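The difference matters inside a loop. In the sketch below (the counter names are made up for illustration), the loop over 1:length(x) runs twice on an empty list, while the loop over seq_along(x) does not run at all:

```r
x = list()

count_colon = 0
for (i in 1:length(x)) {
  count_colon = count_colon + 1   # runs for i = 1 and i = 0
}

count_along = 0
for (i in seq_along(x)) {
  count_along = count_along + 1   # never runs
}

c(count_colon, count_along)
# [1] 2 0
```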

Many ways to generate sequences…

1:5
[1] 1 2 3 4 5
seq(1:5)
[1] 1 2 3 4 5
seq_len(5)
[1] 1 2 3 4 5

Exercises

Exercise 2

Consider the vector x below.

x = c(3, 4, 12, 19, 23, 49, 100, 63, 70)

Write R code that prints the perfect squares in x.

Exercise 3

Consider z = c(-1, .5, 0, .5, 1). Write R code that prints the smallest non-negative integer k satisfying the inequality

\[ |\cos(k) - z| < 0.001 \]

for each component of z.

Functions

Function composition

A function is composed of arguments (formals) and code (body).

quadraticRoots = function(a, b, c) {
  x1 = (-b + sqrt((b^2) - (4*a*c))) / (2*a)
  x2 = (-b - sqrt((b^2) - (4*a*c))) / (2*a)
  return(c(x1, x2))
}

quadraticRoots(1, -2, -3)
[1]  3 -1
formals(quadraticRoots)
$a


$b


$c
body(quadraticRoots)
{
    x1 = (-b + sqrt((b^2) - (4 * a * c)))/(2 * a)
    x2 = (-b - sqrt((b^2) - (4 * a * c)))/(2 * a)
    return(c(x1, x2))
}

Returns

There are two approaches to returning values from functions in R: explicit and implicit returns.

Explicit - using one or more return function calls

f = function(x) {
  return(x * x)
}
f(2)
[1] 4

Implicit - the value of the last evaluated expression is returned.

g = function(x) {
  x * x
}
g(3)
[1] 9
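The two styles can be mixed. A common pattern, sketched here with a hypothetical helper first_negative, is to use an explicit return to exit early and an implicit return for the fall-through case:

```r
# first_negative is a hypothetical helper used only for illustration
first_negative = function(v) {
  for (value in v) {
    if (value < 0) {
      return(value)   # explicit return exits the function immediately
    }
  }
  NA                  # implicit return when no negative value is found
}

first_negative(c(3, -1, -7))
# [1] -1
first_negative(c(1, 2))
# [1] NA
```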

Invisible returns

Many functions in R make use of an invisible return value: the value is returned but not printed automatically.

  • visible
f = function(x) {
  print(x)
}
y = f(1)
[1] 1
y
[1] 1
  • invisible
g = function(x) {
  invisible(x)
}
g(2)
z = g(2)
z
[1] 2
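An invisible value is still returned; only the automatic printing is suppressed. As a small sketch, wrapping the call in parentheses forces printing, and base R's withVisible() reports the visibility flag:

```r
g = function(x) {
  invisible(x)
}

(g(2))   # parentheses make the result visible again
# [1] 2

# withVisible() returns the value together with its visibility flag
withVisible(g(2))$visible
# [1] FALSE
```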

Arguments

When defining a function we explicitly define names for the arguments, which become variables within the scope of the function.

When calling a function we can use these names to pass arguments in an alternative order.

f = function(x, y, z = 1) { # z defaults to 1
  paste0("x=", x, " y=", y, " z=", z)
}
f(1, 2, 3)
[1] "x=1 y=2 z=3"
f(z = 1, x = 2, y = 3)
[1] "x=2 y=3 z=1"
f(y = 2, 1, 3)
[1] "x=1 y=2 z=3"
f(1)
Error in paste0("x=", x, " y=", y, " z=", z): argument "y" is missing, with no default
f(1, 2)
[1] "x=1 y=2 z=1"
f(1, 2, 3, 4)
Error in f(1, 2, 3, 4): unused argument (4)
f(1 , 2, m = 3)
Error in f(1, 2, m = 3): unused argument (m = 3)

Scope

R has generous scoping rules: if it can’t find a variable in the current scope (e.g. a function’s body), it will look for it in the next higher scope, and so on.

y = 1
f = function(x) {
  x + y
}
f(3)
[1] 4
y = 1
g = function(x) {
  y = 2
  x + y
}
g(3)
[1] 5

Additionally, variables defined within a scope persist only for the duration of that scope and do not overwrite variables in higher scopes.
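A short sketch of this last point, using a hypothetical function h: assigning to y inside the function creates a local variable, and the global y is unchanged after the call.

```r
y = 1

# h is a hypothetical function used only for illustration
h = function() {
  y = 2   # creates a new local y; the global y is untouched
  y
}

h()
# [1] 2
y
# [1] 1
```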

Lazy evaluation

Arguments to R functions are not evaluated until needed.

f = function(a, b, x) {
  print(a)
  print(b ^ 2)
  0 * x
}
f(5, 6)
[1] 5
[1] 36
Error in f(5, 6): argument "x" is missing, with no default
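The flip side is that an argument that is never used is never evaluated, even if evaluating it would fail. A minimal sketch with a hypothetical function lazy_demo:

```r
# lazy_demo is a hypothetical function used only for illustration
lazy_demo = function(a, b) {
  a * 2   # b is never used, so it is never evaluated
}

# The stop() call is passed as b but never runs
lazy_demo(5, stop("this error is never raised"))
# [1] 10
```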

Function forms

Form          Description                     Example(s)
Prefix        name comes before arguments     log(x, base = exp(1))
Infix         name between arguments          +, %>%, %/%
Replacement   replace values by assignment    names(x) <- c("a", "b")
Special       all others not defined above    [[, for, break, (
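Infix, replacement, and special functions can also be called in prefix form by quoting their names with backticks. The sketch below is only meant to show that they are ordinary functions; it is not how you would normally write this code.

```r
`+`(1, 2)                       # same as 1 + 2
# [1] 3

x = c(1, 2)
x = `names<-`(x, c("a", "b"))   # same as names(x) <- c("a", "b")
names(x)
# [1] "a" "b"

`[[`(list(10, 20), 2)           # same as list(10, 20)[[2]]
# [1] 20
```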

Help

To get help on any function, type ?fcn_name in your console, where fcn_name is the function’s name. For infix, replacement, and special functions you will need to surround the function with backticks.

?mean
?`for`
?`+`

For functions not in the base package, you can generally see their implementation by entering the function name without parentheses (or using the body function).

lm |>
  head()
                                                                         
1 function (formula, data, subset, weights, na.action, method = "qr",    
2     model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, 
3     contrasts = NULL, offset, ...)                                     
4 {                                                                      
5     ret.x <- x                                                         
6     ret.y <- y                                                         

Function best practices

  • Write a function when you have copied code more than twice.

  • Try to use a verb for your function’s name.

  • Keep argument names short but descriptive.

  • Add code comments to explain the “why” of your code.

  • Link a family of functions with a common prefix: pnorm(), pbinom(), ppois().

  • Keep data arguments first, then other required arguments, followed by arguments with defaults. The ... argument can be placed last.

A summary of R

To understand computations in R, two slogans are helpful:

Everything that exists is an object.

Everything that happens is a function call.

— John Chambers

John McKinley Chambers is the creator of the S programming language and a core member of the R programming language project. The R programming language is often called a successor to the S programming language.