Genomics

Messy Data


Structuring Data in Spreadsheets Solution

Problems include data sets that do not all share the same columns, data split across several separate tables, color used to encode information, inconsistent column names, and spaces in some column names. The Data Carpentry Ecology spreadsheet lesson gives a fuller list of common problems with spreadsheet data; not all of them appear in this example. Discuss with the group what they found.
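Two of these problems, inconsistent column names and tables that do not share the same columns, can also be checked programmatically once the headers are exported. The following is only a minimal sketch using the Python standard library. The table names, the column names, and the helper functions `normalize_header` and `missing_columns` are hypothetical and do not come from the lesson spreadsheet.

```python
from __future__ import annotations

import re


def normalize_header(name: str) -> str:
    """Trim, lowercase, and replace each run of whitespace with one underscore."""
    return re.sub(r"\s+", "_", name.strip().lower())


def missing_columns(tables: dict[str, list[str]]) -> dict[str, set[str]]:
    """For each table, return the normalized columns found elsewhere but absent here."""
    normalized = {
        table: {normalize_header(col) for col in cols}
        for table, cols in tables.items()
    }
    all_columns = set().union(*normalized.values())
    return {table: all_columns - cols for table, cols in normalized.items()}


# Hypothetical headers, invented for illustration only.
tables = {
    "sheet_a": ["Sample ID", "Collection Date", "Volume (ul)"],
    "sheet_b": ["sample_id", "collection date "],
}

report = missing_columns(tables)
for table, missing in report.items():
    print(table, sorted(missing))
```

After normalization, "Sample ID" and "sample_id" are treated as the same column, so the report lists only columns that are actually absent from a table. Problems such as color-coded cells cannot be detected this way and still need to be found by inspecting the spreadsheet.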

Here is a “clean” version of the same spreadsheet (you might need to “right-click” to save the file before opening it):

Cleaned spreadsheet
