A philosophy for data visualisation, based on the foundations formalised in Leland Wilkinson’s seminal book, now implemented in various software packages.
Can you name the following data visualisations?
“The grammar of graphics determines how algebra, geometry, aesthetics, statistics, scales and coordinates interact”
Operations applied before scale transformations (e.g. sin, log)
Let \(A := (x,y,z)\) and \(B := (a,a,b)\).
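The three operators of Wilkinson's algebra (cross, nest, blend) can be sketched on these two variables in base R. This is a rough illustration only: storing a variable as a character vector, and the names `cross`, `nest` and `blend`, are assumptions made for the sketch and do not come from any package.

```r
# Two variables observed on the same three cases
A <- c("x", "y", "z")
B <- c("a", "a", "b")

# Cross (A * B): pair the values case by case -> (x,a), (y,a), (z,b)
cross <- data.frame(A = A, B = B)

# Nest (A / B): the same pairs, but the values of A are read as
# belonging to a level of B
nest <- split(A, B)

# Blend (A + B): stack both variables into a single one
blend <- c(A, B)

print(cross)
print(nest)
print(blend)
```

In ggplot2 terms, crossing loosely corresponds to mapping two variables to different aesthetics or facets, while blending corresponds to stacking columns into long format before plotting.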
“Graphics do not care about the scales on which they are drawn”
After scale & variable transformations
Before coordinate transformations
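The ordering above can be checked in ggplot2 with a minimal sketch. The two-row data frame `df` is made up for the illustration. A scale transformation is applied before the statistic is computed, whereas a coordinate transformation is applied afterwards.

```r
library(ggplot2)

# Toy data (assumed for illustration): one group, two values
df <- data.frame(g = "a", y = c(1, 100))

# Scale transformation: y is logged first, then the mean is taken
p_scale <- ggplot(df, aes(g, y)) +
  stat_summary(fun = mean, geom = "point") +
  scale_y_log10()

# Coordinate transformation: the mean is taken on raw y,
# then the plane is warped for drawing
p_coord <- ggplot(df, aes(g, y)) +
  stat_summary(fun = mean, geom = "point") +
  coord_trans(y = "log10")

mean_scale <- layer_data(p_scale)$y  # mean of log10(1) and log10(100)
mean_coord <- layer_data(p_coord)$y  # mean of 1 and 100

print(c(scale = mean_scale, coord = mean_coord))
```

The first plot draws the point at 10 on the original axis (a mean of 1 on the log scale), the second at 50.5, although both axes look logarithmic. Recent ggplot2 releases also offer `coord_transform()` as the newer name for `coord_trans()`.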
Things we can actually see
Functions | Partitions | Networks |
---|---|---|
point line area interval path schema | contour polygon | edge |
Collision modifiers: stack, dodge, jitter
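A minimal sketch of the three collision modifiers in ggplot2, assuming the built-in `Titanic` and `iris` datasets. The object names (`titanic_df`, `p_stack`, `p_dodge`, `p_jitter`) are chosen only for this illustration.

```r
library(ggplot2)

# The built-in Titanic object is a table, so convert it first
titanic_df <- as.data.frame(Titanic)

base <- ggplot(titanic_df, aes(x = Class, fill = Survived, weight = Freq))

# Stack: overlapping bars are piled on top of each other
p_stack <- base + geom_bar(position = "stack")

# Dodge: overlapping bars are placed side by side
p_dodge <- base + geom_bar(position = "dodge")

# Jitter: random horizontal noise separates overplotted points
p_jitter <- ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_point(position = position_jitter(width = 0.2, height = 0))

print(p_stack)
print(p_dodge)
print(p_jitter)
```

The data and the geometry are identical within each pair of bar charts; only the rule for resolving overlaps changes.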
“Ordinary graphics such as intervals and polygons take on radically different appearances under different planar transformations.”
A new graphic? Or a transformation of an existing one?
Form | Surface | Motion | Sound | Text |
---|---|---|---|---|
position size shape rotation resolution | colour texture blur transparency | direction speed acceleration | tone volume rhythm voice | label |
“Much of the skill in graphic design is knowing what combinations of attributes to avoid.”
Demonstration using R package ggplot2 (Wickham 2010).
Titanic dataset
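Titanic <- as.data.frame(Titanic)  # the built-in Titanic object is a 4-way table; ggplot() needs a data frame
head(Titanic)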
Class Sex Age Survived Freq
1 1st Male Child No 0
2 2nd Male Child No 0
3 3rd Male Child No 35
4 Crew Male Child No 0
5 1st Female Child No 0
6 2nd Female Child No 0
Algebra: Class × Survived × Freq
pie <- ggplot(Titanic) +
  scale_x_discrete() + scale_y_continuous() + scale_fill_discrete() +
  geom_bar(width = 1, position = "fill") +
  aes(x = "", fill = Survived, weight = Freq) +
  theme_titanic()
pie + coord_polar(theta = "y")
pie + coord_polar(theta = "y") +
  facet_grid(~ Class)
gg <- ggplot(Titanic) +
  scale_x_discrete() + scale_y_continuous() + scale_fill_discrete() +
  geom_bar(width = 1, position = 'stack') +
  aes(x = Class, fill = Class, weight = Freq) +
  theme_titanic()
gg + coord_polar(theta = 'x') + ggtitle('Rose')
gg + coord_polar(theta = 'y') + ggtitle('Racetrack???')
gg <- ggplot(troops, aes(x = long, y = lat))
gg <- gg + geom_path(aes(size = survivors,
                         colour = direction,
                         group = group), lineend = "round")
gg <- gg + geom_text(data = cities, aes(label = city),
                     size = 3, family = "serif", fontface = "italic")
Or, with the more familiar default theme_grey():
Graphics Production Language (SPSS)
Gadfly (Julia)
Grammarphone (Spotify API)
source('asciiplot.R')
asciiplot(
  df = iris,
  aes = list(x = 'Petal.Width', y = 'Petal.Length', shape = 'Species'),
  geom = 'point'
)
@ @ @
@
@ @
@ @ @ @
@ @ @ @ @ @
@ @ @ @ @
@ @ @ @
@ X X @ @ @ @ @
X X X X @ @
X X X X @
X X X X
X X X
X X X
X X
X
X
O O
O O O O O O
O O O O
O O
O
Article: Chernoff faces in ggplot2 🌝
Article: Using ggplot2 with Stata
Cheatsheet: Data Visualisation with ggplot2 (by RStudio)