Get familiar with the metadata - Acacia drepanolobium Surveys
ggplot
- Very popular plotting package
- Good plots quickly
- Declarative - describe what you want not how to build it
- Contrasts with imperative - how to build it step by step
Data
- Data on acacia size in an experiment in Africa excluding large herbivores
- Data is tab separated
- The HEIGHT column contains "dead" for dead plants, so read "dead" as NA
acacia <- read.csv("http://www.esapubs.org/archive/ecol/E095/064/ACACIA_DREPANOLOBIUM_SURVEY.txt", sep="\t", na.strings = c("dead"))
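A minimal sketch of what na.strings does here, using a small inline sample instead of the real file. The three rows and their values are made up for illustration; only the column names HEIGHT, CIRC and TREATMENT come from the notes.

```r
# Hypothetical sample in the same tab-separated layout (values are made up)
sample_text <- "TREATMENT\tHEIGHT\tCIRC\nA\t1.2\t15\nB\tdead\t20\nC\t0.9\t12\n"

# Without na.strings, the "dead" entry keeps HEIGHT from being read as numeric
raw <- read.csv(text = sample_text, sep = "\t")
class(raw$HEIGHT)

# With na.strings = c("dead"), HEIGHT is numeric and the dead plant becomes NA
clean <- read.csv(text = sample_text, sep = "\t", na.strings = c("dead"))
class(clean$HEIGHT)
is.na(clean$HEIGHT)
```

Reading the value as NA keeps HEIGHT usable as a number in plots; ggplot2 drops rows with missing positions and warns about it.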
Basics
library(ggplot2)
- ggplot() arguments:
- Default dataset - what data are we working with
- Set of mappings
- ‘Aesthetics’ from variables
- What columns should we use for different aspects of the plot
ggplot(data = acacia, mapping = aes(x = CIRC, y = HEIGHT))
- Add components of figures with layers
- Scatter plot showing branch circumference and height
ggplot(acacia, aes(x = CIRC, y = HEIGHT)) +
geom_point()
- To change things about the layer pass arguments to the geom
ggplot(acacia, aes(x = CIRC, y = HEIGHT)) +
geom_point(size = 3, color = "blue", alpha = 0.5)
- Rescale axes
ggplot(acacia, aes(x = CIRC, y = HEIGHT)) +
geom_point(size = 3, color = "blue", alpha = 0.5) +
scale_y_log10() +
scale_x_log10()
- Not changing the data itself, just the presentation of it
- Add labels (documentation for your graphs!)
ggplot(acacia, aes(x = CIRC, y = HEIGHT)) +
geom_point(size = 3, color = "blue", alpha = 0.5) +
labs(x = "Circumference [cm]", y = "Height [m]",
title = "Acacia Survey at UHURU")
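The point that rescaling axes changes only the presentation can be checked directly. This is a sketch on a made-up data frame (demo and its values are hypothetical stand-ins for acacia); layer_data() returns the values ggplot2 computes for a layer.

```r
library(ggplot2)

# Hypothetical stand-in for the acacia data (values are made up)
demo <- data.frame(CIRC = c(5, 10, 20, 40), HEIGHT = c(0.5, 1, 2, 4))

p <- ggplot(demo, aes(x = CIRC, y = HEIGHT)) +
  geom_point() +
  scale_y_log10()

# Positions used for drawing: log10 of HEIGHT
plotted_y <- layer_data(p)$y
plotted_y

# The data frame itself still holds the original heights
demo$HEIGHT
```

The axis labels still show the original units, which is why scale_y_log10() is usually easier to read than plotting log10(HEIGHT) yourself.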
Do Tasks 1-2 in Mass vs Metabolism.
Grouping
- Group on a single graph
- Look at influence of experimental treatment
ggplot(acacia, aes(x = CIRC, y = HEIGHT, color = TREATMENT)) +
geom_point(size = 3, alpha = 0.5)
- Facet specification
ggplot(acacia, aes(x = CIRC, y = HEIGHT)) +
geom_point(size = 3, alpha = 0.5) +
facet_wrap(~TREATMENT)
- Where are all the acacia in the open plots? (eaten?)
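A small sketch of the two grouping approaches side by side, assuming a made-up data frame with three hypothetical treatment labels (A, B, C). Mapping color keeps one panel and assigns one colour per group; faceting makes one panel per group.

```r
library(ggplot2)

# Hypothetical data: two plants in each of three made-up treatments
demo <- data.frame(
  CIRC = c(5, 10, 20, 8, 16, 32),
  HEIGHT = c(0.5, 1.0, 2.1, 0.7, 1.5, 2.8),
  TREATMENT = rep(c("A", "B", "C"), times = 2)
)

by_color <- ggplot(demo, aes(x = CIRC, y = HEIGHT, color = TREATMENT)) +
  geom_point()

by_facet <- ggplot(demo, aes(x = CIRC, y = HEIGHT)) +
  geom_point() +
  facet_wrap(~TREATMENT)

# Number of distinct colours assigned in the single-panel version
n_colors <- length(unique(layer_data(by_color)$colour))

# Number of panels in the faceted version
n_panels <- length(unique(layer_data(by_facet)$PANEL))
```

Color is handy for comparing groups on shared axes; facets are easier to read when points overlap heavily.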
Do Tasks 3-4 in Mass vs Metabolism.
Assign Tasks 1-4 in Adult vs Newborn Size.
Layers
- We’ve seen that ggplot makes graphs by combining information on
- Data
- Mapping of parts of that data to aspects of the plot
- A geometric object to represent the data
ggplot(acacia, aes(x = CIRC, y = HEIGHT)) +
geom_point()
- Many kinds of geometric object (type geom_ and show completions)
- Usage
- ggplot() sets defaults for layers
- Can combine multiple layers using +
- Order matters
- Combine different kinds of layers
- Add a linear model
ggplot(acacia, aes(x = CIRC, y = HEIGHT)) +
geom_point() +
geom_smooth(method = "lm")
- Do this by treatment
ggplot(acacia, aes(x = CIRC, y = HEIGHT, color = TREATMENT)) +
geom_point() +
geom_smooth(method = "lm")
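To see what the linear model layer draws, here is a sketch comparing the line from geom_smooth(method = "lm") with a model fit directly by lm(). The data frame is made up, and formula = y ~ x is written out explicitly (it is the form geom_smooth() reports using for "lm") only to avoid the informational message.

```r
library(ggplot2)

# Hypothetical data (values are made up)
demo <- data.frame(
  CIRC = c(5, 10, 15, 20, 25, 30),
  HEIGHT = c(0.6, 1.1, 1.4, 2.2, 2.4, 3.1)
)

p <- ggplot(demo, aes(x = CIRC, y = HEIGHT)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Layer 2 is the smooth: x positions and fitted y values along the line
fitted_line <- layer_data(p, 2)

# Fit the same model directly and predict at those x positions
fit <- lm(HEIGHT ~ CIRC, data = demo)
direct <- unname(predict(fit, newdata = data.frame(CIRC = fitted_line$x)))

# Compare the plotted line with the direct predictions
all.equal(as.numeric(fitted_line$y), as.numeric(direct))
```

When color = TREATMENT is mapped in ggplot(), the same fit is done separately for each treatment group.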
Do Task 5 in Adult vs Newborn Size.
Statistical transformations
- Geoms include statistical transformations
- So far we’ve seen
- identity: the raw form of the data or no transformation
- smooth: model line (e.g., loess, lm)
- Transformations also exist to make things like histograms, bar plots, etc.
- Occur as defaults in associated geoms
- To look at the number of acacia in each treatment use a bar plot
ggplot(acacia, aes(x = TREATMENT)) +
geom_bar()
- Uses the transformation stat_count()
- Counts the number of rows for each treatment
- To look at the distribution of circumferences in the dataset use a histogram
ggplot(acacia, aes(x = CIRC)) +
geom_histogram()
- Uses stat_bin() for data transformation
- Splits circumferences into bins and counts rows in each bin
- Uses bins argument to split data into groups
- Defaults to bins = 30
- These can be combined with all of the other ggplot2 features we’ve learned
ggplot(acacia, aes(x = CIRC)) +
geom_histogram(bins = 15) +
scale_x_log10() +
facet_wrap(~TREATMENT) +
labs(x = "Circumference", y = "Number of Individuals")
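The statistical transformations can be inspected with layer_data(), which returns the values a layer computes before drawing. A sketch on made-up data (the treatment labels and circumferences are hypothetical):

```r
library(ggplot2)

# Hypothetical data: 3 plants in A, 2 in B, 2 in C
demo <- data.frame(
  CIRC = c(4, 6, 9, 12, 15, 21, 30),
  TREATMENT = c("A", "A", "A", "B", "B", "C", "C")
)

# geom_bar() counts rows per treatment
bar_plot <- ggplot(demo, aes(x = TREATMENT)) +
  geom_bar()
bar_counts <- layer_data(bar_plot)$count

# Same counts computed by hand
table(demo$TREATMENT)

# geom_histogram() bins CIRC and counts rows per bin
hist_plot <- ggplot(demo, aes(x = CIRC)) +
  geom_histogram(bins = 5)
hist_counts <- layer_data(hist_plot)$count

# Every row lands in exactly one bin
sum(hist_counts)
```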
Do Tasks 1-2 in Sexual Dimorphism Exploration.
Combining different data and aesthetics
- Add tree size data for context
- Layers are plotted in the order they are added
trees <- read.csv("http://www.esapubs.org/archive/ecol/E095/064/TREE_SURVEYS.txt",
sep="\t", na.strings = c("dead", "missing", "MISSING", "NA"))
ggplot() +
geom_point(data = trees, aes(x = CIRC, y = HEIGHT), color = "gray") +
geom_point(data = acacia, aes(x = CIRC, y = HEIGHT), color = "red") +
labs(x = "Circumference [cm]", y = "Height [m]")
- Each layer will default to ggplot() mappings unless modified
- So, we don’t have to specify the arguments that are the same
ggplot(mapping = aes(x = CIRC, y = HEIGHT)) +
geom_point(data = trees, color = "gray") +
geom_point(data = acacia, color = "red") +
labs(x = "Circumference [cm]", y = "Height [m]")
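A sketch confirming that each layer uses its own data while sharing the mapping set in ggplot(). The two small data frames (trees_demo, acacia_demo) are made-up stand-ins for trees and acacia.

```r
library(ggplot2)

# Hypothetical stand-ins for the two datasets (values are made up)
trees_demo <- data.frame(CIRC = c(30, 50, 80, 120), HEIGHT = c(3, 5, 7, 9))
acacia_demo <- data.frame(CIRC = c(10, 20), HEIGHT = c(1, 2))

p <- ggplot(mapping = aes(x = CIRC, y = HEIGHT)) +
  geom_point(data = trees_demo, color = "gray") +
  geom_point(data = acacia_demo, color = "red")

# Layer 1 is drawn first (background), layer 2 on top of it
background <- layer_data(p, 1)
foreground <- layer_data(p, 2)

nrow(background)
nrow(foreground)
```

Both layers pick up x = CIRC and y = HEIGHT from ggplot(), so the column names must match across the two data frames.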
Do Task 3 in Sexual Dimorphism Exploration.
Grammar of graphics
- Geometric object(s)
- Data
- Mapping
- Statistical transformation
- Position
- Coordinates
- Facets
- In combination, these uniquely describe any plot
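As an illustration of the components listed above, a scatter plot can be written with every piece spelled out using ggplot2's layer() function. This is a sketch on made-up data, not how plots are normally written; geom_point() fills in the same stat and position defaults for you.

```r
library(ggplot2)

# Hypothetical data (values are made up)
demo <- data.frame(CIRC = c(5, 10, 20), HEIGHT = c(0.5, 1, 2))

# The usual shorthand
short <- ggplot(demo, aes(x = CIRC, y = HEIGHT)) +
  geom_point()

# The same plot with each grammar component named explicitly
explicit <- ggplot() +
  layer(
    data = demo,                          # data
    mapping = aes(x = CIRC, y = HEIGHT),  # mapping
    geom = "point",                       # geometric object
    stat = "identity",                    # statistical transformation
    position = "identity"                 # position
  ) +
  coord_cartesian() +                     # coordinates
  facet_null()                            # facets (a single panel)

# Both versions compute the same layer values
all.equal(layer_data(short), layer_data(explicit))
```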
Saving plots as new files
ggsave("acacia_by_treatment.jpg")
- Lots of optional arguments
- Location
- Type
- Size
ggsave("figures/acacia_by_treatment.pdf", height = 5, width = 5)
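By default ggsave() saves the last plot displayed. A sketch that instead passes a stored plot explicitly and writes to a temporary file, so it runs without a figures/ folder; the data frame and the file location are made up for illustration.

```r
library(ggplot2)

# Hypothetical data (values are made up)
demo <- data.frame(CIRC = c(5, 10, 20), HEIGHT = c(0.5, 1, 2))
p <- ggplot(demo, aes(x = CIRC, y = HEIGHT)) +
  geom_point()

# The file type is picked from the extension; size is given in inches here
out_file <- tempfile(fileext = ".pdf")
ggsave(out_file, plot = p, width = 5, height = 5, units = "in")

file.exists(out_file)
```

Passing plot = p makes scripts more predictable, since the saved figure no longer depends on which plot was shown most recently.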
Assign the rest of the exercises.