This is a follow up to Mammal Body Size.
Looking at the average mass of extinct and extant species overall is useful, but there are lots of different processes that could cause size-biased extinctions so it’s not as informative as we might like. However, if we see the exact same pattern on each of the different continents that might really tell us something. Repeat the analysis in Mammal Body Size, but this time compare the mean masses within each of the different continents.
Using the dplyr
and tidyr
libraries, group the data by continent and status.
Summarize the average mass for each group. Spread the groups by status and select the statuses extant and extinct. Calculate the difference in average
masses between extant and extinct groups.
Export your results to a csv
file where the first entry on each line is the
continent, the second entry is the average mass of the extant species on that
continent, the third entry is the average mass of the extinct species on that
continent, and the forth entry is the difference between the average extant and
average extinct masses. Call the file continent_mass_differences.csv
. If you
notice anything strange think about what’s going on and present the final data
in the way that makes the most sense to you.
Because you’re working with real data, some things might look strange that are not results but barriers to analysis. These inconsistencies in the data will require some cleaning.
The unknown value used in the dataset is -999
. R assumes your unknown value is
NA
, but "NA"
in the data is the code for North America.
Use the additional arguments stringsAsFactors = FALSE, na.strings = "-999"
in
read.csv()
to get R to keep "NA"
as a string and transform -999
to NA
. You will still need to remove the NA
from the data before doing any averaging.
You might also notice Africa is represented by both "Af"
and "AF"
. Be sure
to chose one and use str_replace_all()
to make the change.