ggplot2 is the package; ggplot() is the main function.
The first argument is the data frame we want to plot from.
The next argument is the set of variables (columns) of our data frame that we want to visualize. These go inside the aesthetic mapping, aes().
library(ggplot2)
library(dplyr)           # provides %>% and filter()
library(palmerpenguins)  # provides the penguins data frame

penguins %>%
  ggplot(aes(x = flipper_length_mm, y = bill_depth_mm,
             color = island)) +
  geom_point() +
  labs(x = "Flipper length (mm)", y = "Bill depth (mm)",
       color = "Island",
       title = "Bill depth vs flipper length distribution",
       subtitle = "Penguins from Antarctica",
       caption = "data from palmerpenguins R package")
penguins %>%
  filter(!is.na(sex)) %>%   # drop penguins with unknown sex before faceting
  ggplot(aes(x = flipper_length_mm, y = bill_depth_mm,
             color = island)) +
  geom_point() +
  labs(x = "Flipper length (mm)", y = "Bill depth (mm)",
       color = "Island",
       title = "Bill depth vs flipper length distribution",
       subtitle = "Penguins from Antarctica",
       caption = "data from palmerpenguins R package") +
  theme_bw() +
  facet_wrap(~ sex)
ggplot(data = [dataframe],
       aes(x = [var_x], y = [var_y],
           color = [var_for_color],
           fill = [var_for_fill],
           shape = [var_for_shape],
           size = [var_for_size],
           alpha = [var_for_alpha],
           ... # other aesthetics
       )
) +
  geom_<some_geom>([geom_arguments]) +
  ... # other geoms
  scale_<some_axis>_<some_scale>() +
  facet_<some_facet>([formula]) +
  ... # other options
To visualize multivariate relationships we can add variables to our visualization by specifying aesthetics: color, size, shape, linetype, alpha, or fill; we can also add facets based on variable levels.
The name of the argument is mapping because it says how to “map” variables to a visual aesthetic.
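As a minimal sketch of the mapping argument, the example below names data and mapping explicitly, then adds a third variable through color and a fourth through facets. It assumes the mpg data set that ships with ggplot2; the variables chosen (displ, hwy, class, drv) are only for illustration.

```r
library(ggplot2)

# Name the arguments explicitly: data is the data frame,
# mapping says which variable drives which visual aesthetic.
p = ggplot(data = mpg,
           mapping = aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  facet_wrap(~ drv)   # one panel per level of drv

p
```

Because data and mapping are the first two arguments of ggplot(), the names can be dropped, as in the other examples in these notes.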
When does an aesthetic (visual) go inside the function aes()?
If you want an aesthetic to reflect a variable’s values, it must go inside aes().
If you want to set an aesthetic manually and not have it convey information about a variable, use the aesthetic’s name outside of aes(), e.g. in the geometry, and set it to your desired value.
Aesthetics for continuous and discrete variables are measured on continuous and discrete scales, respectively.
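The rules above can be sketched with the mpg data set from ggplot2. The first pair of plots contrasts mapping an aesthetic with setting it; the second pair shows how the same variable gets a continuous or a discrete color scale depending on its type. The object names (p_mapped, p_set, and so on) are just labels for this example.

```r
library(ggplot2)

# Mapped: color reflects the values of drv, so it goes inside aes()
p_mapped = ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()

# Set: one fixed color for all points, so it goes outside aes(), in the geom
p_set = ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "steelblue")

# cyl is stored as an integer, so it gets a continuous color scale (a gradient)
p_continuous = ggplot(mpg, aes(x = displ, y = hwy, color = cyl)) +
  geom_point()

# Wrapping it in factor() makes it discrete: one distinct color per level
p_discrete = ggplot(mpg, aes(x = displ, y = hwy, color = factor(cyl))) +
  geom_point()
```

Only the mapped versions produce a legend, since a fixed color carries no information about the data.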
Rows: 234
Columns: 11
$ manufacturer <chr> "audi", "audi", "audi", "audi", "audi", "audi", "audi", "…
$ model <chr> "a4", "a4", "a4", "a4", "a4", "a4", "a4", "a4 quattro", "…
$ displ <dbl> 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 1.8, 1.8, 2.0, 2.0, 2.…
$ year <int> 1999, 1999, 2008, 2008, 1999, 1999, 2008, 1999, 1999, 200…
$ cyl <int> 4, 4, 4, 4, 6, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 8, 8, …
$ trans <chr> "auto(l5)", "manual(m5)", "manual(m6)", "auto(av)", "auto…
$ drv <chr> "f", "f", "f", "f", "f", "f", "f", "4", "4", "4", "4", "4…
$ cty <int> 18, 21, 20, 21, 16, 18, 18, 18, 16, 20, 19, 15, 17, 17, 1…
$ hwy <int> 29, 29, 31, 30, 26, 26, 27, 26, 25, 28, 27, 25, 25, 25, 2…
$ fl <chr> "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p…
$ class <chr> "compact", "compact", "compact", "compact", "compact", "c…
geometry | description |
geom_point() | scatter plot |
geom_histogram() | histogram |
geom_boxplot() | box plot |
geom_density() | density plot |
geom_violin() | violin plot |
geom_raster() | heat map |
geom_line() | connect observations in a line |
geom_bar() | bar plot (try with argument position = "fill") |
geom_smooth() | add a smooth trend line (try with argument method = "lm") |
geom_abline() | add an algebraic line |
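The two "try with" hints in the table can be sketched as follows, again assuming the mpg data set from ggplot2; the variable choices are illustrative only.

```r
library(ggplot2)

# Bar plot with position = "fill": each bar is rescaled to height 1,
# so the segments show the proportion of each drv value within a class
p_bar = ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "fill") +
  labs(y = "Proportion")

# Scatter plot plus a straight-line trend fitted with lm()
p_smooth = ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm")
```

Without method = "lm", geom_smooth() picks a smoothing method automatically, which usually gives a curve rather than a straight line.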
patchwork is a powerful tool
p1 = penguins %>%
ggplot(aes(x = species, y = bill_depth_mm)) +
geom_violin() +
labs(x = "Species", y = "Bill depth (mm)",
title = "Violin plots")
p2 = penguins %>%
ggplot(aes(x = bill_depth_mm, y = flipper_length_mm, color = island)) +
geom_point() +
labs(x ="Bill depth (mm)",
y = "Flipper length (mm)",
color = "Island",
title = "Flipper length vs bill depth")
p3 = penguins %>%
ggplot(aes(x = body_mass_g)) +
geom_histogram(fill = "steelblue") +
labs(x = "Body mass (g)",
y = "Count",
title = "Distribution of penguin body mass")
(p1 + p2) / p3
Encircle the data points that have the minimum x-value
# create ggproto object
StatMin = ggproto("StatMin", Stat,
  # called once per group: keep only the rows with the smallest x
  compute_group = function(data, scales) {
    xvar = data$x
    yvar = data$y
    data[xvar == min(xvar), , drop = FALSE]
  },
  required_aes = c("x", "y")
)
# create stat function
stat_min = function(mapping = NULL, data = NULL, geom = "point",
                    position = "identity", na.rm = FALSE, show.legend = NA,
                    inherit.aes = TRUE,
                    shape = 21, size = 5, color = "red",
                    alpha = 1, ...) {
  # build a layer that uses StatMin and draws the result as open red circles
  layer(
    stat = StatMin, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(color = color, shape = shape, size = size, alpha = alpha,
                  na.rm = na.rm, ...)
  )
}
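As a cross-check on what the stat is meant to draw, the sketch below gets a similar picture without a custom Stat, by passing a function as the data argument of a second point layer. This is a different technique from the ggproto approach, not a replacement for it: the function sees the whole data set at once, whereas compute_group() runs once per group. The helper name keep_min_displ and the use of mpg are assumptions made for the example.

```r
library(ggplot2)

# Function applied to the plot data: keep only the rows with the minimum x
# (displ here). Unlike compute_group(), it sees the whole data set at once.
keep_min_displ = function(d) d[d$displ == min(d$displ), , drop = FALSE]

p = ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  # second point layer drawn only for the minimum-x rows, as a red ring
  geom_point(data = keep_min_displ, shape = 21, size = 5, color = "red")

# Data actually drawn by the second layer
circled = layer_data(p, 2)
```

With the stat defined, the second geom_point() call would be replaced by stat_min() to get the per-group behavior.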
Example
library(ggplot2)
library(gganimate)
library(gapminder)   # provides gapminder and country_colors

ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, size = pop, colour = country)) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  scale_x_log10() +
  facet_wrap(~continent) +
  theme_bw(base_size = 16) +
  labs(title = 'Year: {frame_time}', x = 'GDP per capita', y = 'Life expectancy') +
  transition_time(year)
Summary
Core functions
transition_*() defines how the data should be spread out and how it relates to itself across time.
view_*() defines how the positional scales should change along the animation.
shadow_*() defines how data from other points in time should be presented in the given point in time.
enter_*() / exit_*() defines how new data should appear and how old data should disappear during the course of the animation.
ease_aes() defines how different aesthetics should be eased during transitions.
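A minimal sketch of how several of these function families combine, assuming the gganimate package is installed and using the built-in mtcars data; the specific transition, enter/exit, and easing choices are illustrative.

```r
library(ggplot2)
library(gganimate)

# Box plots of mpg by cylinder count, animated across the levels of gear
anim = ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  # transition_*(): show one gear level at a time
  transition_states(gear, transition_length = 2, state_length = 1) +
  # enter_*() / exit_*(): how boxes appear and disappear
  enter_fade() +
  exit_shrink() +
  # ease_aes(): how values are interpolated between states
  ease_aes("sine-in-out")
```

Printing anim renders the animation; animate(anim) does the same while giving control over settings such as the number of frames.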
Label variables
is always better.ggplot2
