[1] "logical"
Dr. Alexander Fisher
Duke University
January 20, 2023
R uses NA to represent missing values in its data structures. NA is a logical type. What may not be obvious is that NA may be treated as a different type thanks to coercion.
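A minimal sketch of how NA takes on the type of whatever it is combined with. The variable names are only for illustration.

```r
# On its own, NA is a logical value
na_lgl <- typeof(NA)            # "logical"

# Combined with other types, NA is coerced like any other logical
na_int <- typeof(NA + 1L)       # "integer"
na_dbl <- typeof(NA + 1)        # "double"
na_chr <- typeof(c(NA, "a"))    # "character"

# Typed missing values also exist as constants
typeof(NA_integer_)             # "integer"
typeof(NA_character_)           # "character"
```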
NA stickiness
Because NAs represent missing values it makes sense that any calculation using them should also be missing.
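For instance, a short sketch of a few calculations that propagate a missing value. The vector x is made up for illustration.

```r
x <- c(1, 2, NA)

na_sum  <- 1 + NA      # NA
na_sqrt <- sqrt(NA)    # NA
na_mean <- mean(x)     # NA, because one element is unknown

# Many summary functions can drop missing values on request
ok_mean <- mean(x, na.rm = TRUE)   # 1.5
```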
A useful mental model for NAs is to consider them as an unknown value that could take any of the possible values for that type.
For numbers or characters this isn’t very helpful, but for a logical value we know that the value must either be TRUE or FALSE and we can use that when deciding what value to return.
If the value of NA affects the logical outcome, it is indeterminate and the operation will return NA. If the value of NA does not affect the logical outcome, the operation will return the outcome.
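The four basic cases can be sketched as follows. The variable names are illustrative.

```r
# The unknown value decides the result, so the answer is NA
and_true  <- TRUE  & NA    # NA
or_false  <- FALSE | NA    # NA

# The unknown value cannot change the result, so it is returned
and_false <- FALSE & NA    # FALSE
or_true   <- TRUE  | NA    # TRUE
```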
NA
Because NA could take any value, the result of, for example, 2 != NA or 1 == NA is inconclusive and returns NA.
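A brief sketch of why == cannot be used to find missing values, and of is.na(), the base R function meant for that job. The vector x is made up for illustration.

```r
x <- c(1, NA, 3)

cmp   <- x == NA     # NA NA NA, every comparison is inconclusive
na_eq <- NA == NA    # NA, even NA compared with itself

# is.na() answers the question directly
is_missing <- is.na(x)   # FALSE TRUE FALSE
```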
These are defined as part of the IEEE floating point standard (not unique to R):
NaN - Not a number
Inf - Positive infinity
-Inf - Negative infinity
Inf and NaN
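A small sketch of where these values come from and how base R's predicate functions tell them apart. The vector y is made up for illustration.

```r
pos_inf  <- 1 / 0       # Inf
neg_inf  <- -1 / 0      # -Inf
not_num  <- 0 / 0       # NaN
inf_diff <- Inf - Inf   # NaN

y <- c(1, Inf, -Inf, NaN, NA)

fin <- is.finite(y)     # TRUE  FALSE FALSE FALSE FALSE
inf <- is.infinite(y)   # FALSE TRUE  TRUE  FALSE FALSE
nan <- is.nan(y)        # FALSE FALSE FALSE TRUE  FALSE
mis <- is.na(y)         # FALSE FALSE FALSE TRUE  TRUE
```

Note that is.na() is TRUE for NaN as well as NA, while is.nan() is TRUE only for NaN.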
You can coerce one type to another with the as.*() family of functions, e.g. as.integer() or as.character().
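A few explicit coercions, as a sketch. The input values are arbitrary.

```r
to_int <- as.integer(3.7)             # 3, truncates toward zero
to_chr <- as.character(c(1.5, 2))     # "1.5" "2"
to_lgl <- as.logical(c(0, 1, 5))      # FALSE TRUE TRUE
to_dbl <- as.numeric("2.5")           # 2.5

# Values that cannot be converted become NA (with a warning)
bad <- suppressWarnings(as.integer("abc"))   # NA
```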
Write a function that takes vector input x and returns the smallest and largest non-infinite value. Test your function on
There are two types of vectors in R: atomic vectors (elements are all the same type) and generic vectors, aka lists (heterogeneous collections of elements). For example, a list can contain atomic vectors, functions, other lists, etc.
We can view the contents of a list and a brief description of the contents compactly with the structure function str()
List of 4
$ : chr "A"
$ : num [1:4] 0.5 1 1.5 2
$ :List of 2
..$ : logi TRUE
..$ : num 1
$ :function (x)
..- attr(*, "srcref")= 'srcref' int [1:8] 1 39 1 53 39 53 1 1
.. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x7fd5caeb0d58>
List of 1
$ :List of 1
..$ :List of 1
.. ..$ : list()
List of 3
$ : num 1
$ :List of 1
..$ : num 2
$ :List of 2
..$ : num 3
..$ : num 2
Because of this, lists become the most natural way of representing tree-like structures within R
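As a rough sketch, a small tree can be stored as nested lists and walked recursively. The node layout (a value plus a list of children) and the helper count_nodes() are hypothetical, chosen only for illustration.

```r
# Each node is a list holding a value and a list of child nodes
tree <- list(
  value = 1,
  children = list(
    list(value = 2, children = list()),
    list(value = 3, children = list(
      list(value = 4, children = list())
    ))
  )
)

# Count nodes by recursing into each child
count_nodes <- function(node) {
  1 + sum(vapply(node$children, count_nodes, numeric(1)))
}

n_nodes <- count_nodes(tree)   # 4
```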
By default a vector will be coerced to a list (as a list is more general) if needed
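A minimal sketch: combining atomic values with a list using c() gives a list. The values are arbitrary.

```r
z <- c(1, 2, list(3, "a"))

z_type <- typeof(z)   # "list"
z_len  <- length(z)   # 4, each atomic element became its own list element
str(z)
```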
We can coerce a list into an atomic vector using unlist(); the usual type coercion rules then apply to determine the final type.
as.integer() and similar functions can be used, but only if the list is flat (i.e. no lists inside your base list).
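A sketch contrasting a flat and a nested list. The lists flat and nested are made up for illustration.

```r
flat   <- list(1L, 2L, 3L)
nested <- list(1L, list(2.5, "a"))

u_flat   <- unlist(flat)      # 1 2 3 (integer)
u_nested <- unlist(nested)    # "1" "2.5" "a" (character wins)

i_flat <- as.integer(flat)    # 1 2 3

# On a nested list, as.integer() raises an error; try() captures it here
res <- try(as.integer(nested), silent = TRUE)
failed <- inherits(res, "try-error")   # TRUE
```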
Because of their more complex structure we often want to name the elements of a list (we can also do this with atomic vectors).
This can make accessing list elements more straightforward.
List of 2
$ A: num 1
$ B:List of 2
..$ C: num 2
..$ D: num 3
More complex names need to be quoted.
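A sketch of naming and accessing list elements. The list x mirrors the structure printed above; the names in y are hypothetical examples of names that are not valid R identifiers.

```r
x <- list(A = 1, B = list(C = 2, D = 3))

x_a <- x[["A"]]   # 1
x_d <- x$B$D      # 3

# Names with spaces or a leading digit must be quoted
y <- list("my value" = 1, `2nd` = "b")

y_1 <- y$`my value`   # 1
y_2 <- y[["2nd"]]     # "b"
```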
Represent the following JSON (JavaScript Object Notation) data as a list in R.
NULL values
NULL type
NULL is a special value within R that represents nothing - it always has length zero and type “NULL” and cannot have any attributes.
When combined in a vector, it disappears.
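A quick sketch of these properties. The variable names are illustrative.

```r
null_type <- typeof(NULL)    # "NULL"
null_len  <- length(NULL)    # 0

# NULL contributes nothing when combined with c()
v <- c(1, NULL, 2)           # 1 2
e <- c(NULL, NULL)           # NULL
```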
Previously we saw that in multi-vector operations, short vectors get re-used until the length of the long vector is matched.
0-length length coercion is a special case of length coercion when one of the arguments has length 0. In this case the longer vector’s length is not used and the result will have length 0.
As NULL values always have length 0, this coercion rule will apply (note that type coercion is also occurring here).
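A minimal sketch of the zero-length rule, first with an empty integer vector and then with NULL. The variable names are illustrative.

```r
a <- integer() + 1          # numeric(0)
b <- c(1, 2, 3) + numeric() # numeric(0), the longer length is ignored

# NULL has length 0, so the same rule applies; the type also changes
n_add <- NULL + 1           # numeric(0)
n_cmp <- NULL > 0           # logical(0)
```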
NULL and comparison
Given the previous issue, comparisons and conditionals with NULLs can be problematic.
Error in if (x > 0) print("Hello"): argument is of length zero
Error in if (!is.null(x) & (x > 0)) print("Hello"): argument is of length zero
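One common way around the second error, sketched under the assumption that x is either NULL or a single number: use the short-circuiting && operator, so that x > 0 is never evaluated when x is NULL. The helper say_hello() is hypothetical.

```r
say_hello <- function(x) {
  # && stops at the first FALSE, so x > 0 is skipped for NULL input
  if (!is.null(x) && x > 0) "Hello" else "skipped"
}

r_null <- say_hello(NULL)   # "skipped"
r_pos  <- say_hello(1)      # "Hello"
r_neg  <- say_hello(-1)     # "skipped"
```

By contrast, the vectorized & evaluates both sides, and FALSE & logical(0) has length zero, which is what if() rejects.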
Attributes are named lists that can be attached to objects in R. Attributes contain metadata about an object, e.g. the object’s names, dim, class, levels, etc.
Attributes can be interacted with via the attr and attributes functions.
L M N
1 2 3
$names
[1] "L" "M" "N"
List of 1
$ names: chr [1:3] "L" "M" "N"
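A sketch of reading and setting attributes on a named vector like the one printed above. The "units" attribute is a made-up example, not something R defines.

```r
x <- c(L = 1, M = 2, N = 3)

nm <- attr(x, "names")     # "L" "M" "N"
attributes(x)              # a list with a single element, names

# attr<- attaches arbitrary metadata (hypothetical "units" attribute)
attr(x, "units") <- "cm"
attr_names <- names(attributes(x))   # "names" "units"
```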
Factor objects are how R represents categorical data (e.g. a variable where there are a fixed number of possible outcomes).
[1] Sunny Cloudy Rainy Cloudy Cloudy
Levels: Cloudy Rainy Sunny
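A sketch of how a factor like the one printed above might be created with factor() and what it is made of internally: integer codes plus levels and class attributes. The variable names are illustrative.

```r
f <- factor(c("Sunny", "Cloudy", "Rainy", "Cloudy", "Cloudy"))

f_type  <- typeof(f)       # "integer"
f_lvls  <- levels(f)       # "Cloudy" "Rainy" "Sunny" (sorted by default)
f_codes <- as.integer(f)   # 3 1 2 1 1
attributes(f)              # levels and class
```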
We can build our own factor from scratch using,
y = c(3L, 1L, 2L, 1L, 1L)
attr(y, "levels") = c("Cloudy", "Rainy", "Sunny")
attr(y, "class") = "factor"
y
[1] Sunny Cloudy Rainy Cloudy Cloudy
Levels: Cloudy Rainy Sunny
The approach we just used is a bit clunky - generally the preferred method for constructing an object with attributes from scratch is to use the structure function.
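The same factor can be built in one call with structure(), sketched here with the codes and levels used above.

```r
y <- structure(
  c(3L, 1L, 2L, 1L, 1L),                   # integer codes
  levels = c("Cloudy", "Rainy", "Sunny"),  # labels for codes 1, 2, 3
  class  = "factor"
)

y_is_factor <- is.factor(y)    # TRUE
y_labels    <- as.character(y) # "Sunny" "Cloudy" "Rainy" "Cloudy" "Cloudy"
```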
Create a factor vector based on the vector of airport codes below.
All of the possible levels are