Introduction
- Learn how to organize a computational project
- Many benefits to good organization & documentation
- Understand all project components
- Work more efficiently
- Easier for collaboration
- Including with your future self
- Saves a lot of time & frustration in the long run, even though doing it right takes extra time and effort up front
Example project walk-through
Download example project.
- Good file structure
- All project files in one main folder
- Subfolders (data, R, etc.)
- Main folder is R project
- Self-contained project
- Use relative instead of absolute paths
- Good folder & file names
- Descriptive but not too long
- No spaces
- Consistent format
- README
- Everything someone needs to know to understand project
- Raw data
- In separate folder from cleaned data
- Never change the raw data files!
- Each file should have metadata
- Scripts with code
- Relative file paths to read in and create files
- Lots of comments
- Order: libraries, data, user-created functions, everything else
- Good variable & column names
- Two methods to deal with different versions of same file
- Good = consistent naming scheme
- Each copy has date or version number
- Don’t call anything “final”, because no file ever is!
- Better = version control, so there is only one copy of each file
- Running all scripts
- Good = README describes the order in which to run the scripts
- Better = script to run all scripts
- Best = Make
- Combine code, its output, and text in a single document, e.g., with knitr
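
The script conventions from the walk-through (relative paths, separate raw and cleaned data, and the order libraries, data, user-created functions, everything else) can be sketched roughly as below. This is only an illustration under stated assumptions: the folder names `data/raw` and `data/clean`, the file `field_mass.csv`, and the function `drop_missing_mass()` are hypothetical and do not come from the example project. The first few lines build a throwaway project in a temporary folder purely so the sketch runs anywhere.

```r
# Demo setup only: create a throwaway project so this sketch is self-contained.
# In a real project these folders and the raw file already exist.
project <- file.path(tempdir(), "example_project")   # hypothetical project name
dir.create(file.path(project, "data", "raw"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project, "data", "clean"), recursive = TRUE, showWarnings = FALSE)
write.csv(
  data.frame(site = c("A", "B", "C"), mass_g = c(12.5, NA, 9.1)),
  file.path(project, "data", "raw", "field_mass.csv"),
  row.names = FALSE
)
old_wd <- setwd(project)  # stands in for opening the R project in the main folder

# ---- What a script such as R/01_clean_data.R might look like ----

# 1. Libraries
library(utils)  # read.csv()/write.csv(); attached by default, listed to show placement

# 2. Data: relative path from the main project folder, never an absolute path
raw_mass <- read.csv(file.path("data", "raw", "field_mass.csv"))

# 3. User-created functions
# Keep only rows that have a recorded mass.
drop_missing_mass <- function(df) {
  df[!is.na(df$mass_g), , drop = FALSE]
}

# 4. Everything else
clean_mass <- drop_missing_mass(raw_mass)

# Cleaned data goes to its own folder; the raw file is read but never modified.
write.csv(
  clean_mass,
  file.path("data", "clean", "field_mass_clean.csv"),
  row.names = FALSE
)

setwd(old_wd)  # demo cleanup: restore the previous working directory
```

Because every path is relative to the main folder, the same script should keep working if the whole project folder is moved or copied to another computer.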
Goal of good project structure
- Reproducibility = ability to get the same results from the same data & code
- If you give the entire project folder to someone else (or your future self!), they can understand all parts and recreate your results, figures, documents, etc.
- Improving reproducibility makes YOUR life easier!
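
One way to make "recreate your results" a single step is the "script to run all scripts" idea from the walk-through. The following is a minimal sketch, not the example project's actual script: the name `run_all()` and the three script file names are hypothetical, and it assumes the working directory is the main project folder.

```r
# run_all.R: hypothetical "run everything" script kept in the main project folder.
# Assumes the working directory is the project root (e.g., the R project is open).

# Scripts in the order they must run; these file names are illustrative only.
scripts <- file.path("R", c("01_clean_data.R", "02_fit_model.R", "03_make_figures.R"))

run_all <- function(scripts) {
  # Stop before running anything if a listed script is missing.
  missing_scripts <- scripts[!file.exists(scripts)]
  if (length(missing_scripts) > 0) {
    stop("Missing script(s): ", paste(missing_scripts, collapse = ", "))
  }

  for (s in scripts) {
    message("Running ", s)
    # Evaluate each script in its own environment, so objects created by one
    # script are not silently available to the next; results should be passed
    # between scripts through files instead.
    source(s, local = new.env(parent = globalenv()))
  }

  invisible(scripts)
}
```

With the scripts in place, calling `run_all(scripts)` from the main folder runs them in the listed order. The numeric prefixes in the file names are one possible naming scheme that makes the run order visible in the folder itself.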