The Grammar of Graphics

A philosophy for data visualisation, based on the foundations formalised in Leland Wilkinson’s seminal book, now implemented in various software packages.

David Selby (https://www.research.manchester.ac.uk/portal/david.selby.html), Centre for Epidemiology Versus Arthritis (https://www.cfe.manchester.ac.uk)
2021-02-12

Presentation

Types of charts

Can you name the following data visualisations?

The Grammar of Graphics

The book

Figure 1: Wilkinson (2006) Figure 2.2

“The grammar of graphics determines how algebra, geometry, aesthetics, statistics, scales and coordinates interact”

Variables

Operations applied before scale transformations

Algebra

Let \(A := (x,y,z)\) and \(B := (a,a,b)\).
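
Wilkinson's algebra combines variables with three operators: cross, nest and blend. As a rough illustration using the A and B above, the base R sketch below mimics each operator on plain vectors. The object names (`cross`, `nest`, `blend`) are illustrative only, and the mapping onto base R functions is an analogy rather than Wilkinson's formal definition.

```r
A <- c("x", "y", "z")
B <- c("a", "a", "b")

# Cross (A * B): pair the values case by case, giving two-dimensional tuples.
cross <- data.frame(A = A, B = B)

# Nest (A / B): the same pairs, read as "values of A within each level of B".
nest <- split(A, B)

# Blend (A + B): pool both sets of values into a single variable.
blend <- c(A, B)

print(cross)
print(nest)
print(blend)
```

Cross and nest contain the same pairs here; the difference lies in how the pairs are interpreted, since nesting treats B as a grouping within which A is defined.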

Scales

“Graphics do not care about the scales on which they are drawn”
Wilkinson (2006)

Statistics

After scale & variable transformations
Before coordinate transformations
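
The ordering matters because a statistic computed on transformed data generally differs from the transformed statistic. A minimal base R sketch, assuming a log10 scale and using made-up illustrative values:

```r
# Illustrative values only (not from any dataset in this document).
x <- c(1, 10, 100)

# Statistic applied after the scale transformation (the order described above):
mean_after_scale <- mean(log10(x))   # mean of 0, 1 and 2

# Transformation applied after the statistic (the other order):
scale_after_mean <- log10(mean(x))   # log10 of 37

print(mean_after_scale)
print(scale_after_mean)
```

In ggplot2 this is the difference between transforming a scale (for example with `scale_y_log10()`), which happens before statistics are computed, and transforming the coordinate system, which happens afterwards.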

Geometry

Things we can actually see

Functions: point, line, area, interval, path, schema
Partitions: contour, polygon
Networks: edge

Collision modifiers: stack, dodge, jitter
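
Collision modifiers decide what happens when several graphical elements would otherwise occupy the same position. The base R sketch below shows, under simplifying assumptions, what each modifier does to the positions of two bars at the same x value. The heights, the bar width and the jitter range are made up for illustration, and this is not how any particular package implements them.

```r
# Two bars that would otherwise be drawn at the same x position.
# Illustrative heights only.
heights <- c(first = 3, second = 2)
x <- 1
width <- 0.9

# Stack: each bar starts where the previous one ended.
stack_ymax <- cumsum(heights)
stack_ymin <- stack_ymax - heights

# Dodge: the bars share the available width and sit side by side.
n <- length(heights)
dodge_width <- width / n
dodge_x <- x - width / 2 + dodge_width * (seq_len(n) - 0.5)

# Jitter: add a small random offset to each position.
set.seed(1)
jitter_x <- x + runif(n, min = -0.2, max = 0.2)

print(data.frame(stack_ymin, stack_ymax, dodge_x, jitter_x))
```

In ggplot2 the corresponding options are `position = "stack"`, `position = "dodge"` and `position = "jitter"`.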

Coordinates

“Ordinary graphics such as intervals and polygons take on radically different appearances under different planar transformations.”

A new graphic? Or a transformation of an existing one?
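
One way to see the second reading: a pie chart can be treated as a single stacked bar drawn in polar coordinates. The base R sketch below, using survival counts from R's built-in Titanic table, computes the stacked intervals and then maps them to angles. The variable names are illustrative, and the wedge order is an assumption that a plotting package may reverse.

```r
# Survival counts from R's built-in Titanic table (a named vector: No, Yes).
counts <- apply(datasets::Titanic, 4, sum)

# Cartesian view: one stacked bar of proportions running from 0 to 1.
ymax <- cumsum(counts) / sum(counts)
ymin <- ymax - counts / sum(counts)

# Polar view: the same intervals, with y mapped to an angle in degrees.
angle_start <- 360 * ymin
angle_end <- 360 * ymax

print(data.frame(ymin, ymax, angle_start, angle_end))
```

Nothing about the data or the geometry changes between the two views; only the coordinate system does.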

Aesthetics

Form: position, size, shape, rotation, resolution
Surface: colour, texture, blur, transparency
Motion: direction, speed, acceleration
Sound: tone, volume, rhythm, voice
Text: label

“Much of the skill in graphic design is knowing what combinations of attributes to avoid.”

Figure 2: Wilkinson (2006) Figure 10.11

Putting it all together

A demonstration using the R package ggplot2 (Wickham 2010).

Let’s make a pie chart

Titanic dataset

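# The built-in Titanic object is a 4-way contingency table; convert it to a long data frame
Titanic <- as.data.frame(Titanic)
head(Titanic)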
  Class    Sex   Age Survived Freq
1   1st   Male Child       No    0
2   2nd   Male Child       No    0
3   3rd   Male Child       No   35
4  Crew   Male Child       No    0
5   1st Female Child       No    0
6   2nd Female Child       No    0

Algebra: Class × Survived × Freq

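library(ggplot2)

# theme_titanic() is not part of ggplot2; it must be defined before running this code
pie <- ggplot(Titanic) +
  scale_x_discrete() + scale_y_continuous() + scale_fill_discrete() +
  geom_bar(width = 1, position = "fill") +
  aes(x = "", fill = Survived, weight = Freq) +
  theme_titanic()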

pie + coord_polar(theta = "y")

pie + coord_polar(theta = "y") +
  facet_grid( ~ Class)
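
Each faceted pie encodes the share of one class who did or did not survive. The numbers behind the wedges can be checked directly from R's built-in table. This is a base R sketch with illustrative variable names (`by_class`, `prop`), not part of the original demonstration.

```r
# Collapse the 4-way table to Class x Survived counts.
by_class <- apply(datasets::Titanic, c(1, 4), sum)

# Row-wise proportions: in effect, what position = "fill" shows within each facet.
prop <- prop.table(by_class, margin = 1)

print(by_class)
print(round(prop, 3))
```

Because each row of `prop` sums to one, every facet fills a complete circle regardless of how many people were in that class.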

gg <- ggplot(Titanic) +
  scale_x_discrete() + scale_y_continuous() + scale_fill_discrete() +
  geom_bar(width = 1, position = 'stack') +
  aes(x = Class, fill = Class, weight = Freq) +
  theme_titanic()

gg + coord_polar(theta = 'x') + ggtitle('Rose')
gg + coord_polar(theta = 'y') + ggtitle('Racetrack???')

Figure 3: Nightingale (1858)

Napoleon’s March

Figure 4: Minard (1869)

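# `troops` and `cities` are data frames of Minard's data and are not defined here
# (Minard.troops and Minard.cities in the HistData package appear to have matching columns).
# In ggplot2 3.4.0 and later, `linewidth` replaces `size` for line widths.
gg <- ggplot(troops, aes(x = long, y = lat))
gg <- gg + geom_path(aes(size = survivors,
                         colour = direction,
                         group = group), lineend = "round")
gg <- gg + geom_text(data = cities, aes(label = city),
                     size = 3, family = "serif", fontface = "italic")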

Or, with the more familiar default theme_grey():

Resources

Implementations

source('asciiplot.R')
asciiplot(
  df = iris,
  aes = list(x = 'Petal.Width', y = 'Petal.Length', shape = 'Species'),
  geom = 'point'
)
                                                    @     @   @      
                                                        @            
                                                @   @                
                                                @ @           @     @
                                          @     @       @ @   @     @
                                    @           @       @ @     @    
                                                  @ @   @     @      
                                      @   X X   @ @ @         @ @    
                              X     X X   X     @   @                
                                  X X X   X @                        
                              X   X X X                              
                        X     X   X                                  
                            X X     X                                
                        X         X                                  
                        X                                            
                            X                                        
                                                                     
                                                                     
                                                                     
                                                                     
  O     O                                                            
O O O   O O   O                                                      
O O O   O                                                            
O O                                                                  
  O                                                                  

Further reading

Minard, Charles Joseph. 1869. “Carte Figurative Des Pertes Successives En Hommes de l’armée Française Dans La Campagne de Russie 1812–1813.”
Nightingale, Florence. 1858. “Notes on Matters Affecting the Health, Efficiency, and Hospital Administration of the British Army.” Florence Nightingale Museum Collection.
Wickham, Hadley. 2010. “A Layered Grammar of Graphics.” Journal of Computational and Graphical Statistics 19 (1): 3–28. https://doi.org/10.1198/jcgs.2009.07098.
Wilkinson, Leland. 2006. The Grammar of Graphics. Springer-Verlag GmbH. https://www.ebook.de/de/product/11607543/leland_wilkinson_the_grammar_of_graphics.html.
