Dr. Alexander Fisher
Duke University
exam 1 date on schedule
teams for lab 03; see announcement on slack
ggplot2 is the package. ggplot() is the main function.
The first argument is the data frame we want to plot from.
The next argument is a list of variables (columns) of our data frame that we want to visualize. These go in the aesthetic function aes().
library(tidyverse)        # ggplot2, dplyr, and the %>% pipe
library(palmerpenguins)   # penguins data

penguins %>%
  ggplot(aes(x = flipper_length_mm, y = bill_depth_mm,
             color = island)) +
  geom_point() +
  labs(x = "Flipper length (mm)", y = "Bill depth (mm)",
       color = "Island",
       title = "Bill depth vs flipper length distribution",
       subtitle = "Penguins from Antarctica",
       caption = "data from palmerpenguins R package") +
  theme_bw()
penguins %>%
  filter(!is.na(sex)) %>%   # drop penguins with unrecorded sex before faceting
  ggplot(aes(x = flipper_length_mm, y = bill_depth_mm,
             color = island)) +
  geom_point() +
  labs(x = "Flipper length (mm)", y = "Bill depth (mm)",
       color = "Island",
       title = "Bill depth vs flipper length distribution",
       subtitle = "Penguins from Antarctica",
       caption = "data from palmerpenguins R package") +
  theme_bw() +
  facet_wrap(~ sex)
ggplot(
  data = [dataframe],
  aes(
    x = [var_x], y = [var_y],
    color = [var_for_color],
    fill = [var_for_fill],
    shape = [var_for_shape],
    size = [var_for_size],
    alpha = [var_for_alpha],
    ... # other aesthetics
  )
) +
  geom_<some_geom>([geom_arguments]) +
  ... # other geoms
  scale_<some_axis>_<some_scale>() +
  facet_<some_facet>([formula]) +
  ... # other options
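As a concrete instance of the template above, here is a minimal sketch using the mpg data that ships with ggplot2. The choice of variables, geoms, scale, and facet is illustrative and not taken from the slides.

```r
library(ggplot2)

# Fill in the template with the built-in mpg data (illustrative choices)
p_template <- ggplot(
  data = mpg,
  aes(x = displ, y = hwy, color = drv)
) +
  geom_point(alpha = 0.6) +                  # geom_<some_geom>()
  geom_smooth(method = "lm", se = FALSE) +   # another geom
  scale_x_log10() +                          # scale_<some_axis>_<some_scale>()
  facet_wrap(~ year) +                       # facet_<some_facet>([formula])
  labs(x = "Engine displacement (L)", y = "Highway mpg", color = "Drive")

p_template
```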
To visualize multivariate relationships we can add variables to our visualization by specifying aesthetics: color, size, shape, linetype, alpha, or fill; we can also add facets based on variable levels.
The name of the argument is mapping because it says how to “map” variables to a visual aesthetic.
When does an aesthetic (visual) go inside the function aes()?
If you want an aesthetic to reflect a variable’s values, it must go inside aes().
If you want to set an aesthetic manually and not have it convey information about a variable, use the aesthetic’s name outside of aes(), e.g. in the geometry, and set it to your desired value.
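The difference between mapping and setting can be sketched with the mpg data from ggplot2. The object names below are only for illustration.

```r
library(ggplot2)

# Mapping: color varies with a variable, so it goes inside aes()
p_mapped <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()

# Setting: color is a fixed value, so it goes outside aes(), in the geom
p_set <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "steelblue")

# Common mistake: a constant inside aes() is treated as a one-level variable,
# so the points get a default palette color and a legend entry "steelblue"
p_mistake <- ggplot(mpg, aes(x = displ, y = hwy, color = "steelblue")) +
  geom_point()
```

Printing p_set and p_mistake side by side makes the distinction visible: only the first is actually drawn in steel blue.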
Aesthetics for continuous and discrete variables are measured on continuous and discrete scales, respectively.
The mpg data from ggplot2, as shown by glimpse(mpg):
Rows: 234
Columns: 11
$ manufacturer <chr> "audi", "audi", "audi", "audi", "audi", "audi", "audi", "…
$ model <chr> "a4", "a4", "a4", "a4", "a4", "a4", "a4", "a4 quattro", "…
$ displ <dbl> 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 1.8, 1.8, 2.0, 2.0, 2.…
$ year <int> 1999, 1999, 2008, 2008, 1999, 1999, 2008, 1999, 1999, 200…
$ cyl <int> 4, 4, 4, 4, 6, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 8, 8, …
$ trans <chr> "auto(l5)", "manual(m5)", "manual(m6)", "auto(av)", "auto…
$ drv <chr> "f", "f", "f", "f", "f", "f", "f", "4", "4", "4", "4", "4…
$ cty <int> 18, 21, 20, 21, 16, 18, 18, 18, 16, 20, 19, 15, 17, 17, 1…
$ hwy <int> 29, 29, 31, 30, 26, 26, 27, 26, 25, 28, 27, 25, 25, 25, 2…
$ fl <chr> "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p…
$ class <chr> "compact", "compact", "compact", "compact", "compact", "c…
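Using the mpg columns listed above, the following sketch shows how the same aesthetic gets a continuous or a discrete scale depending on the variable type. The variable choices are illustrative.

```r
library(ggplot2)

# cty is an integer (continuous) variable: color uses a gradient with a color bar
p_cont <- ggplot(mpg, aes(x = displ, y = hwy, color = cty)) +
  geom_point()

# drv is a character (discrete) variable: color uses distinct hues with a keyed legend
p_disc <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()

# A numeric code such as cyl can be treated as discrete by converting it to a factor
p_factor <- ggplot(mpg, aes(x = displ, y = hwy, color = factor(cyl))) +
  geom_point()
```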
geometry | description |
---|---|
geom_point() | scatter plot |
geom_histogram() | histogram |
geom_boxplot() | box plot |
geom_density() | density plot |
geom_violin() | violin plot |
geom_raster() | heat map |
geom_line() | connect observations in a line |
geom_bar() | bar plot (try with argument position = "fill") |
geom_smooth() | add a smooth trend line (try with argument method = "lm") |
geom_abline() | add a straight line with a given slope and intercept |
See https://ggplot2.tidyverse.org/reference/ for more geometries.
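The two "try with argument" suggestions in the table can be sketched as follows, again with the mpg data. The variables and labels are illustrative.

```r
library(ggplot2)

# Bar plot with position = "fill": each bar is rescaled to height 1,
# showing the proportion of each drive type within a vehicle class
p_bar <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "fill") +
  labs(x = "Class", y = "Proportion", fill = "Drive")

# Scatter plot with a least-squares line added by geom_smooth(method = "lm")
p_lm <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
```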
Some geometries are in additional packages, e.g. geom_density_ridges() in the ggridges package creates ridgeline plots.
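A minimal ridgeline sketch, assuming the ggridges package is installed; the data and variable choices are illustrative.

```r
library(ggplot2)
library(ggridges)  # assumed installed: install.packages("ggridges")

# Ridgeline plot: one density of highway mpg per vehicle class, stacked vertically
p_ridges <- ggplot(mpg, aes(x = hwy, y = class, fill = class)) +
  geom_density_ridges(alpha = 0.7, show.legend = FALSE) +
  labs(x = "Highway mpg", y = "Class")
```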
image credit: tvthemes package by Ryo Nakagawara
See https://ggplot2.tidyverse.org/reference/ggtheme.html for a list of default themes.
stat_function() is a powerful tool: it draws a function of x as a curve, e.g. a theoretical density on top of a histogram.
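As one possible use of stat_function(), the sketch below overlays a normal density on a histogram of simulated data. The data and parameter values are made up for illustration.

```r
library(ggplot2)

set.seed(1)  # simulated data, for illustration only
sim <- data.frame(x = rnorm(500, mean = 2, sd = 1.5))

# Histogram on the density scale with the N(2, 1.5^2) density drawn on top
p_fun <- ggplot(sim, aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 fill = "grey80", color = "white") +
  stat_function(fun = dnorm, args = list(mean = 2, sd = 1.5),
                color = "red", linewidth = 1)
```

The histogram needs y = after_stat(density) so that its bars and the density curve share the same vertical scale.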
ggsave()
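ggsave() writes a plot to a file, with the format inferred from the file extension. A minimal sketch follows; the file name and dimensions are placeholders, and a temporary directory is used so the example leaves nothing behind.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Hypothetical output location; in a project this might be "figures/plot.png"
out_file <- file.path(tempdir(), "mpg-scatter.png")

# width and height are in inches by default; dpi sets the resolution
ggsave(filename = out_file, plot = p, width = 6, height = 4, dpi = 300)
```

Passing plot = p explicitly avoids relying on the default, which saves the last plot displayed.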
library(patchwork)   # provides + and / for combining ggplot objects

p1 = penguins %>%
  ggplot(aes(x = species, y = bill_depth_mm)) +
  geom_violin() +
  labs(x = "Species", y = "Bill depth (mm)",
       title = "Violin plots")
p2 = penguins %>%
  ggplot(aes(x = bill_depth_mm, y = flipper_length_mm, color = island)) +
  geom_point() +
  labs(x = "Bill depth (mm)",
       y = "Flipper length (mm)",
       color = "Island",
       title = "Flipper length vs bill depth")
p3 = penguins %>%
  ggplot(aes(x = body_mass_g)) +
  geom_histogram(fill = "steelblue") +
  labs(x = "Body mass (g)",
       y = "Count",
       title = "Distribution of penguin body mass")
# p1 and p2 side by side, with p3 underneath
(p1 + p2) / p3
ggproto
Encircle the data points that have the minimum x-value
# create ggproto object
StatMin = ggproto("StatMin", Stat,
  compute_group = function(data, scales) {
    xvar = data$x
    yvar = data$y
    data[xvar == min(xvar), , drop = FALSE]
  },
  required_aes = c("x", "y")
)
# create stat function
stat_min = function(mapping = NULL, data = NULL, geom = "point",
                    position = "identity", na.rm = FALSE, show.legend = NA,
                    inherit.aes = TRUE,
                    shape = 21, size = 5, color = "red",
                    alpha = 1, ...) {
  layer(
    stat = StatMin, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(color = color, shape = shape, size = size, alpha = alpha,
                  na.rm = na.rm, ...)
  )
}
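A usage sketch for the stat defined above, on the mpg data. The definitions are repeated in compact form only so that the block runs on its own; the plot itself is an illustrative assumption.

```r
library(ggplot2)

# Compact copy of the stat from the slide, so this block is self-contained
StatMin <- ggproto("StatMin", Stat,
  compute_group = function(data, scales) {
    data[data$x == min(data$x), , drop = FALSE]
  },
  required_aes = c("x", "y")
)

stat_min <- function(mapping = NULL, data = NULL, geom = "point",
                     position = "identity", na.rm = FALSE, show.legend = NA,
                     inherit.aes = TRUE, shape = 21, size = 5, color = "red", ...) {
  layer(
    stat = StatMin, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(color = color, shape = shape, size = size, na.rm = na.rm, ...)
  )
}

# Ordinary points plus a red ring around the point(s) with the smallest x
p_min <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  stat_min()

# Inspect what the stat kept: every row has x equal to the minimum displacement
layer_data(p_min, 2)
```

Because the work happens in compute_group, the minimum is taken within each group: mapping a discrete variable to color, for example, would ring one minimum per group rather than one overall.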
gganimate example
library(gganimate)
library(gapminder)   # gapminder data and the country_colors palette

ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, size = pop, colour = country)) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  scale_x_log10() +
  facet_wrap(~continent) +
  theme_bw(base_size = 16) +
  labs(title = 'Year: {frame_time}', x = 'GDP per capita', y = 'Life expectancy') +
  transition_time(year) +
  ease_aes('linear')
gganimate summary
Core functions
transition_*() defines how the data should be spread out and how it relates to itself across time.
view_*() defines how the positional scales should change along the animation.
shadow_*() defines how data from other points in time should be presented in the given point in time.
enter_*() / exit_*() define how new data should appear and how old data should disappear during the course of the animation.
ease_aes() defines how different aesthetics should be eased during transitions.
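The core function families can be combined in a single animation. The sketch below is one possible combination on the gapminder data, assuming the gganimate and gapminder packages are installed; the specific choices (view_follow(), shadow_wake(), the fade functions, the easing) are illustrative.

```r
library(ggplot2)
library(gganimate)
library(gapminder)

anim <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, group = country)) +
  geom_point(alpha = 0.7) +
  scale_x_log10() +
  transition_time(year) +            # transition_*(): move through the years
  view_follow(fixed_y = TRUE) +      # view_*(): x axis tracks the data, y stays fixed
  shadow_wake(wake_length = 0.1) +   # shadow_*(): short trail from earlier frames
  enter_fade() + exit_fade() +       # enter_*() / exit_*(): only matter when data appear or vanish
  ease_aes("cubic-in-out") +         # ease_aes(): non-linear easing between years
  labs(title = "Year: {frame_time}", x = "GDP per capita", y = "Life expectancy")

# animate(anim) renders the frames; it needs a renderer such as the gifski package
```

Every country is present in every year of gapminder, so the enter and exit functions have no visible effect here; they are included to show where they go in the chain.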
Label variables
theme_bw() is always better.
ggplot2 documentation
ggplot2 extensions: https://exts.ggplot2.tidyverse.org/gallery/
top 50 ggplot2 visualizations with code!
extending ggplot2 with ggproto