Exploring data
This unit focuses on data visualization and data wrangling.
Specifically, we cover the fundamentals of data and data visualization, confounding variables, and Simpson’s paradox, as well as tidy data, data import, data cleaning, and data curation.
We end the unit with web scraping and introduce the idea of iteration in preparation for the next unit.
This unit also introduces students to the toolkit: R, RStudio, R Markdown, Git, and GitHub.
Visualising data
Unit 2 - Deck 1: Data and visualisation
Unit 2 - Deck 2: Visualising data with ggplot2
Unit 2 - Deck 3: Visualising numerical data
Unit 2 - Deck 4: Visualising categorical data
Wrangling and tidying data
Unit 2 - Deck 5: Tidy data
Unit 2 - Deck 6: Grammar of data wrangling
Unit 2 - Deck 7: Working with a single data frame
Unit 2 - Deck 8: Working with multiple data frames
Unit 2 - Deck 9: Tidying data
Importing and recoding data
Unit 2 - Deck 10: Data types
Unit 2 - Deck 11: Data classes
Unit 2 - Deck 12: Importing data
Unit 2 - Deck 13: Recoding data
Communicating data science results effectively
Unit 2 - Deck 14: Tips for effective data visualization
Unit 2 - Deck 15: Scientific studies and confounding
Unit 2 - Deck 16: Simpson’s paradox
Unit 2 - Deck 17: Doing data science
Web scraping and programming
Unit 2 - Deck 18: Web scraping
Unit 2 - Deck 19: Scraping top 250 movies on IMDB
Unit 2 - Deck 20: Web scraping considerations
Unit 2 - Deck 21: Functions
Unit 2 - Deck 22: Iteration