Launch Your Career in Data Science. A nine-course introduction to data science, developed and taught by leading professors

 


 

About This Specialization

Ask the right questions, manipulate data sets, and create visualizations to communicate results.

This Specialization covers the concepts and tools you’ll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills learned by building a data product using real-world data. At completion, students will have a portfolio demonstrating their mastery of the material.

Created by:   

Johns Hopkins University


 

 


10 courses

Follow the suggested order or choose your own.


Projects

Designed to help you practice and apply the skills you learn.


Certificates

Highlight your new skills on your resume or LinkedIn

 

  1. COURSE 1

    The Data Scientist’s Toolbox

    Commitment
    1-4 hours/week
    Subtitles
    English, French, Chinese (Simplified), Greek, Italian, Portuguese (Brazilian), Vietnamese, Russian, Turkish, Hebrew

    About the Course

    In this course you will get an introduction to the main tools and ideas in the data scientist’s toolbox. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools that will be used in the program, such as version control, Markdown, Git, GitHub, R, and RStudio.


    WEEK 1
    Week 1
    During Week 1, you’ll learn about the goals and objectives of the Data Science Specialization and each of its components. You’ll also get an overview of the field as well as instructions on how to install R.

     

    Reading · Welcome to the Data Scientist’s Toolbox

     

    Reading · Pre-Course Survey

     

    Reading · Syllabus

     

    Reading · Specialization Textbooks

     

    Video · Specialization Motivation

     

    Reading · The Elements of Data Analytic Style

     

    Video · The Data Scientist’s Toolbox

     

    Video · Getting Help

     

    Video · Finding Answers

     

    Video · R Programming Overview

     

    Video · Getting Data Overview

     

    Video · Exploratory Data Analysis Overview

     

    Video · Reproducible Research Overview

     

    Video · Statistical Inference Overview

     

    Video · Regression Models Overview

     

    Video · Practical Machine Learning Overview

     

    Video · Building Data Products Overview

     

    Video · Installing R on Windows {Roger Peng}

     

    Video · Install R on a Mac {Roger Peng}

     

    Video · Installing Rstudio {Roger Peng}

     

    Video · Installing Outside Software on Mac (OS X Mavericks)

     

    Quiz · Week 1 Quiz

    WEEK 2
    Week 2: Installing the Toolbox
    This is the most lecture-intensive week of the course. The primary goal is to get you set up with R, RStudio, GitHub, and the other tools we will use throughout the Data Science Specialization and your ongoing work as a data scientist.

     

    Video · Tips from Coursera Users – Optional Video

     

    Video · Command Line Interface

     

    Video · Introduction to Git

     

    Video · Introduction to Github

     

    Video · Creating a Github Repository

     

    Video · Basic Git Commands

     

    Video · Basic Markdown

     

    Video · Installing R Packages

     

    Video · Installing Rtools

     

    Quiz · Week 2 Quiz

    WEEK 3
    Week 3: Conceptual Issues
    The Week 3 lectures focus on conceptual issues behind study design and turning data into knowledge. If you have trouble or want to explore issues in more depth, please seek out answers on the forums. They are a great resource! If you happen to be a superstar who already gets it, please take the time to help your classmates by answering their questions as well. This is one of the best ways to practice using and explaining your skills to others. These are two of the key characteristics of excellent data scientists.

     

    Video · Types of Questions

     

    Video · What is Data?

     

    Video · What About Big Data?

     

    Video · Experimental Design

     

    Quiz · Week 3 Quiz

    WEEK 4
    Week 4: Course Project Submission & Evaluation
    In Week 4, we’ll focus on the Course Project. This is your opportunity to install the tools and set up the accounts that you’ll need for the rest of the specialization and for work in data science.

     

    Peer Review · Course Project

     

    Reading · Post-Course Survey

  2. COURSE 2

    R Programming

    Subtitles
    English, French, Japanese, Chinese (Simplified)

    About the Course

    In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure software necessary for a statistical programming environment and describe generic programming language

    WEEK 1
    Week 1: Background, Getting Started, and Nuts & Bolts
    This week covers the basics to get you started with R. The Background Materials lesson contains information about course mechanics and some videos on installing R. The Week 1 videos cover the history of R and S, go over the basic data types in R, and describe the functions for reading and writing data. I recommend that you watch the videos in the listed order, but watching the videos out of order isn’t going to ruin the story.

     

    Reading · Welcome to R Programming

     

    Reading · About the Instructor

     

    Reading · Pre-Course Survey

     

    Reading · Syllabus

     

    Reading · Course Textbook

     

    Reading · Course Supplement: The Art of Data Science

     

    Reading · Data Science Podcast: Not So Standard Deviations

     

    Video · Installing R on a Mac

     

    Video · Installing R on Windows

     

    Video · Installing R Studio (Mac)

     

    Video · Writing Code / Setting Your Working Directory (Windows)

     

    Video · Writing Code / Setting Your Working Directory (Mac)

     

    Reading · Getting Started and R Nuts and Bolts

     

    Video · Introduction

     

    Video · Overview and History of R

     

    Video · Getting Help

     

    Video · R Console Input and Evaluation

     

    Video · Data Types – R Objects and Attributes

     

    Video · Data Types – Vectors and Lists

     

    Video · Data Types – Matrices

     

    Video · Data Types – Factors

     

    Video · Data Types – Missing Values

     

    Video · Data Types – Data Frames

     

    Video · Data Types – Names Attribute

     

    Video · Data Types – Summary

     

    Video · Reading Tabular Data

     

    Video · Reading Large Tables

     

    Video · Textual Data Formats

     

    Video · Connections: Interfaces to the Outside World

     

    Video · Subsetting – Basics

     

    Video · Subsetting – Lists

     

    Video · Subsetting – Matrices

     

    Video · Subsetting – Partial Matching

     

    Video · Subsetting – Removing Missing Values

     

    Video · Vectorized Operations

     

    Quiz · Week 1 Quiz

     

    Video · Introduction to swirl

     

    Reading · Practical R Exercises in swirl Part 1

     

    Practice Programming Assignment · swirl Lesson 1: Basic Building Blocks

     

    Practice Programming Assignment · swirl Lesson 2: Workspace and Files

     

    Practice Programming Assignment · swirl Lesson 3: Sequences of Numbers

     

    Practice Programming Assignment · swirl Lesson 4: Vectors

     

    Practice Programming Assignment · swirl Lesson 5: Missing Values

     

    Practice Programming Assignment · swirl Lesson 6: Subsetting Vectors

     

    Practice Programming Assignment · swirl Lesson 7: Matrices and Data Frames

    WEEK 2
    Week 2: Programming with R
    Welcome to Week 2 of R Programming. This week, we take the gloves off, and the lectures cover key topics like control structures and functions. We also introduce the first programming assignment for the course, which is due at the end of the week.

     

    Reading · Week 2: Programming with R

     

    Video · Control Structures – Introduction

     

    Video · Control Structures – If-else

     

    Video · Control Structures – For loops

     

    Video · Control Structures – While loops

     

    Video · Control Structures – Repeat, Next, Break

     

    Video · Your First R Function

     

    Video · Functions (part 1)

     

    Video · Functions (part 2)

     

    Video · Scoping Rules – Symbol Binding

     

    Video · Scoping Rules – R Scoping Rules

     

    Video · Scoping Rules – Optimization Example (OPTIONAL)

     

    Video · Coding Standards

     

    Video · Dates and Times

     

    Reading · Practical R Exercises in swirl Part 2

     

    Practice Programming Assignment · swirl Lesson 1: Logic

     

    Practice Programming Assignment · swirl Lesson 2: Functions

     

    Practice Programming Assignment · swirl Lesson 3: Dates and Times

     

    Quiz · Week 2 Quiz

     

    Reading · Programming Assignment 1 INSTRUCTIONS: Air Pollution

     

    Quiz · Programming Assignment 1: Quiz

    WEEK 3
    Week 3: Loop Functions and Debugging
    We have now entered the third week of R Programming, which also marks the halfway point. The lectures this week cover loop functions and the debugging tools in R. These aspects of R make R useful for both interactive work and writing longer code, and so they are commonly used in practice.

     

    Reading · Week 3: Loop Functions and Debugging

     

    Video · Loop Functions – lapply

     

    Video · Loop Functions – apply

     

    Video · Loop Functions – mapply

     

    Video · Loop Functions – tapply

     

    Video · Loop Functions – split

     

    Video · Debugging Tools – Diagnosing the Problem

     

    Video · Debugging Tools – Basic Tools

     

    Video · Debugging Tools – Using the Tools

     

    Reading · Practical R Exercises in swirl Part 3

     

    Practice Programming Assignment · swirl Lesson 1: lapply and sapply

     

    Practice Programming Assignment · swirl Lesson 2: vapply and tapply

     

    Quiz · Week 3 Quiz

     

    Peer Review · Programming Assignment 2: Lexical Scoping

    WEEK 4
    Week 4: Simulation & Profiling
    This week covers how to simulate data in R, which serves as the basis for doing simulation studies. We also cover the profiler in R, which lets you collect detailed information on how your R functions are running and identify bottlenecks that can be addressed. The profiler is a key tool in helping you optimize your programs. Finally, we cover the str function, which I personally believe is the most useful function in R.

     

    Reading · Week 4: Simulation & Profiling

     

    Video · The str Function

     

    Video · Simulation – Generating Random Numbers

     

    Video · Simulation – Simulating a Linear Model

     

    Video · Simulation – Random Sampling

     

    Video · R Profiler (part 1)

     

    Video · R Profiler (part 2)

     

    Quiz · Week 4 Quiz

     

    Reading · Practical R Exercises in swirl Part 4

     

    Practice Programming Assignment · swirl Lesson 1: Looking at Data

     

    Practice Programming Assignment · swirl Lesson 2: Simulation

     

    Practice Programming Assignment · swirl Lesson 3: Base Graphics

     

    Reading · Programming Assignment 3 INSTRUCTIONS: Hospital Quality

     

    Quiz · Programming Assignment 3: Quiz

     

    Reading · Post-Course Survey
  3. COURSE 3

    Getting and Cleaning Data

    Subtitles
    English, Russian, French, Chinese (Simplified)

    About the Course

    Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data.


    WEEK 1
    Week 1
    In this first week of the course, we look at finding data and reading different file types.

     

    Reading · Welcome to Week 1

     

    Reading · Syllabus

     

    Reading · Pre-Course Survey

     

    Video · Obtaining Data Motivation

     

    Video · Raw and Processed Data

     

    Video · Components of Tidy Data

     

    Video · Downloading Files

     

    Video · Reading Local Files

     

    Video · Reading Excel Files

     

    Video · Reading XML

     

    Video · Reading JSON

     

    Video · The data.table Package

     

    Reading · Practical R Exercises in swirl Part 1

     

    Quiz · Week 1 Quiz

    WEEK 2
    Week 2
    Welcome to Week 2 of Getting and Cleaning Data! The primary goal is to introduce you to the most common data storage systems and the appropriate tools to extract data from the web or from databases like MySQL.

     

    Video · Reading from MySQL

     

    Video · Reading from HDF5

     

    Video · Reading from The Web

     

    Video · Reading From APIs

     

    Video · Reading From Other Sources

     

    Quiz · Week 2 Quiz

    WEEK 3
    Week 3
    Welcome to Week 3 of Getting and Cleaning Data! This week the lectures will focus on organizing, merging, and managing the data you have collected using the methods from Weeks 1 and 2.

     

    Video · Subsetting and Sorting

     

    Video · Summarizing Data

     

    Video · Creating New Variables

     

    Video · Reshaping Data

     

    Video · Managing Data Frames with dplyr – Introduction

     

    Video · Managing Data Frames with dplyr – Basic Tools

     

    Video · Merging Data

     

    Reading · Practical R Exercises in swirl Part 2

     

    Practice Programming Assignment · swirl Lesson 1: Manipulating Data with dplyr

     

    Practice Programming Assignment · swirl Lesson 2: Grouping and Chaining with dplyr

     

    Practice Programming Assignment · swirl Lesson 3: Tidying Data with tidyr

     

    Quiz · Week 3 Quiz

    WEEK 4
    Week 4
    Welcome to Week 4 of Getting and Cleaning Data! This week we finish up with lectures on text and date manipulation in R. In this final week we will also focus on peer grading of Course Projects.

     

    Video · Editing Text Variables

     

    Video · Regular Expressions I

     

    Video · Regular Expressions II

     

    Video · Working with Dates

     

    Video · Data Resources

     

    Reading · Practical R Exercises in swirl Part 4

     

    Practice Programming Assignment · swirl Lesson 1: Dates and Times with lubridate

     

    Quiz · Week 4 Quiz

     

    Peer Review · Getting and Cleaning Data Course Project

     

    Reading · Post-Course Survey

  4. COURSE 4

    Exploratory Data Analysis

    Subtitles
    English, Chinese (Simplified)

    About the Course

    This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.


    WEEK 1
    Week 1
    This week covers the basics of analytic graphics and the base plotting system in R. We’ve also included some background material to help you install R if you haven’t done so already.

     

    Reading · Welcome to Exploratory Data Analysis

     

    Reading · Syllabus

     

    Reading · Pre-Course Survey

     

    Video · Introduction

     

    Reading · Exploratory Data Analysis with R Book

     

    Reading · The Art of Data Science

     

    Video · Installing R on Windows (3.2.1)

     

    Video · Installing R on a Mac (3.2.1)

     

    Video · Installing R Studio (Mac)

     

    Video · Setting Your Working Directory (Windows)

     

    Video · Setting Your Working Directory (Mac)

     

    Video · Principles of Analytic Graphics

     

    Video · Exploratory Graphs (part 1)

     

    Video · Exploratory Graphs (part 2)

     

    Video · Plotting Systems in R

     

    Video · Base Plotting System (part 1)

     

    Video · Base Plotting System (part 2)

     

    Video · Base Plotting Demonstration

     

    Video · Graphics Devices in R (part 1)

     

    Video · Graphics Devices in R (part 2)

     

    Reading · Practical R Exercises in swirl Part 1

     

    Practice Programming Assignment · swirl Lesson 1: Principles of Analytic Graphs

     

    Practice Programming Assignment · swirl Lesson 2: Exploratory Graphs

     

    Practice Programming Assignment · swirl Lesson 3: Graphics Devices in R

     

    Practice Programming Assignment · swirl Lesson 4: Plotting Systems

     

    Practice Programming Assignment · swirl Lesson 5: Base Plotting System

     

    Quiz · Week 1 Quiz

     

    Peer Review · Course Project 1

    WEEK 2
    Week 2
    Welcome to Week 2 of Exploratory Data Analysis. This week covers some of the more advanced graphing systems available in R: the Lattice system and the ggplot2 system. While the base graphics system provides many important tools for visualizing data, it was part of the original R system and lacks many features that may be desirable in a plotting system, particularly when visualizing high-dimensional data. The Lattice and ggplot2 systems also simplify the laying out of plots, making it a much less tedious process.

     

    Video · Lattice Plotting System (part 1)

     

    Video · Lattice Plotting System (part 2)

     

    Video · ggplot2 (part 1)

     

    Video · ggplot2 (part 2)

     

    Video · ggplot2 (part 3)

     

    Video · ggplot2 (part 4)

     

    Video · ggplot2 (part 5)

     

    Reading · Practical R Exercises in swirl Part 2

     

    Practice Programming Assignment · swirl Lesson 1: Lattice Plotting System

     

    Practice Programming Assignment · swirl Lesson 2: Working with Colors

     

    Practice Programming Assignment · swirl Lesson 3: GGPlot2 Part1

     

    Practice Programming Assignment · swirl Lesson 4: GGPlot2 Part2

     

    Practice Programming Assignment · swirl Lesson 5: GGPlot2 Extras

     

    Quiz · Week 2 Quiz

    WEEK 3
    Week 3
    Welcome to Week 3 of Exploratory Data Analysis. This week covers some of the workhorse statistical methods for exploratory analysis. These methods include clustering and dimension reduction techniques that allow you to make graphical displays of very high-dimensional data (many, many variables). We also cover novel ways to specify colors in R so that you can use color as an important and useful dimension when making data graphics. All of this material is covered in chapters 9-12 of my book Exploratory Data Analysis with R.

     

    Video · Hierarchical Clustering (part 1)

     

    Video · Hierarchical Clustering (part 2)

     

    Video · Hierarchical Clustering (part 3)

     

    Video · K-Means Clustering (part 1)

     

    Video · K-Means Clustering (part 2)

     

    Video · Dimension Reduction (part 1)

     

    Video · Dimension Reduction (part 2)

     

    Video · Dimension Reduction (part 3)

     

    Video · Working with Color in R Plots (part 1)

     

    Video · Working with Color in R Plots (part 2)

     

    Video · Working with Color in R Plots (part 3)

     

    Video · Working with Color in R Plots (part 4)

     

    Reading · Practical R Exercises in swirl Part 3

     

    Practice Programming Assignment · swirl Lesson 1: Hierarchical Clustering

     

    Practice Programming Assignment · swirl Lesson 2: K Means Clustering

     

    Practice Programming Assignment · swirl Lesson 3: Dimension Reduction

     

    Practice Programming Assignment · swirl Lesson 4: Clustering Example

    WEEK 4
    Week 4
    This week, we’ll look at two case studies in exploratory data analysis. The first involves the use of cluster analysis techniques, and the second is a more involved analysis of some air pollution data. How one goes about doing EDA is often personal, but I’m providing these videos to give you a sense of how you might proceed with a specific type of dataset.

     

    Video · Clustering Case Study

     

    Video · Air Pollution Case Study

     

    Reading · Practical R Exercises in swirl Part 4

     

    Practice Programming Assignment · swirl Lesson 1: CaseStudy

     

    Peer Review · Course Project 2

     

    Reading · Post-Course Survey

  5. COURSE 5

    Reproducible Research

    Commitment
    4-9 hours/week
    Subtitles
    English

    About the Course

    This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code that actually conducted the analysis are available. This course will focus on literate statistical analysis tools, which allow one to publish data analyses in a single document that lets others easily execute the same analysis to obtain the same results.


    WEEK 1
    Week 1: Concepts, Ideas, & Structure
    This week will cover the basic ideas of reproducible research since they may be unfamiliar to some of you. We also cover structuring and organizing a data analysis to help make it more reproducible. I recommend that you watch the videos in the order that they are listed on the web page, but watching the videos out of order isn’t going to ruin the story.

     

    Video · Introduction

     

    Reading · Syllabus

     

    Reading · Pre-course survey

     

    Reading · Course Book: Report Writing for Data Science in R

     

    Video · What is Reproducible Research About?

     

    Video · Reproducible Research: Concepts and Ideas (part 1)

     

    Video · Reproducible Research: Concepts and Ideas (part 2)

     

    Video · Reproducible Research: Concepts and Ideas (part 3)

     

    Video · Scripting Your Analysis

     

    Video · Structure of a Data Analysis (part 1)

     

    Video · Structure of a Data Analysis (part 2)

     

    Video · Organizing Your Analysis

     

    Quiz · Week 1 Quiz

    WEEK 2
    Week 2: Markdown & knitr
    This week we cover some of the core tools for developing reproducible documents. We cover the literate programming tool knitr and show how to integrate it with Markdown to publish reproducible web documents. We also introduce the first peer assessment which will require you to write up a reproducible data analysis using knitr.

     

    Video · Coding Standards in R

     

    Video · Markdown

     

    Video · R Markdown

     

    Video · R Markdown Demonstration

     

    Video · knitr (part 1)

     

    Video · knitr (part 2)

     

    Video · knitr (part 3)

     

    Video · knitr (part 4)

     

    Quiz · Week 2 Quiz

     

    Video · Introduction to Course Project 1

     

    Peer Review · Course Project 1

    WEEK 3
    Week 3: Reproducible Research Checklist & Evidence-based Data Analysis
    This week covers what one could call a basic checklist for ensuring that a data analysis is reproducible. While it’s not absolutely sufficient to follow the checklist, it provides a necessary minimum standard that would be applicable to almost any area of analysis.

     

    Video · Communicating Results

     

    Video · RPubs

     

    Video · Reproducible Research Checklist (part 1)

     

    Video · Reproducible Research Checklist (part 2)

     

    Video · Reproducible Research Checklist (part 3)

     

    Video · Evidence-based Data Analysis (part 1)

     

    Video · Evidence-based Data Analysis (part 2)

     

    Video · Evidence-based Data Analysis (part 3)

     

    Video · Evidence-based Data Analysis (part 4)

     

    Video · Evidence-based Data Analysis (part 5)

    WEEK 4
    Week 4: Case Studies & Commentaries
    This week there are two case studies involving the importance of reproducibility in science for you to watch.

     

    Video · Caching Computations

     

    Video · Case Study: Air Pollution

     

    Video · Case Study: High Throughput Biology

     

    Video · Commentaries on Data Analysis

     

    Video · Introduction to Peer Assessment 2

     

    Peer Review · Course Project 2

     

    Reading · Post-Course Survey

  6. COURSE 6

    Statistical Inference

    Subtitles
    English

    About the Course

    Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference including statistical modeling, data-oriented strategies, and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design-based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference. A practitioner can often be left in a debilitating maze of techniques, philosophies, and nuance. This course presents the fundamentals of inference in a practical approach for getting things done. After taking this course, students will understand the broad directions of statistical inference and use this information for making informed choices in analyzing data.


    WEEK 1
    Week 1: Probability & Expected Values
    This week, we’ll focus on the fundamentals including probability, random variables, expectations and more.

     

    Video · Introductory video

     

    Reading · Welcome to Statistical Inference

     

    Reading · Some introductory comments

     

    Reading · Pre-Course Survey

     

    Reading · Syllabus

     

    Reading · Course Book: Statistical Inference for Data Science

     

    Reading · Data Science Specialization Community Site

     

    Reading · Homework Problems

     

    Reading · Probability

     

    Video · 02 01 Introduction to probability

     

    Video · 02 02 Probability mass functions

     

    Video · 02 03 Probability density functions

     

    Reading · Conditional probability

     

    Video · 03 01 Conditional Probability

     

    Video · 03 02 Bayes’ rule

     

    Video · 03 03 Independence

     

    Reading · Expected values

     

    Video · 04 01 Expected values

     

    Video · 04 02 Expected values, simple examples

     

    Video · 04 03 Expected values for PDFs

     

    Reading · Practical R Exercises in swirl 1

     

    Practice Programming Assignment · swirl Lesson 1: Introduction

     

    Practice Programming Assignment · swirl Lesson 2: Probability1

     

    Practice Programming Assignment · swirl Lesson 3: Probability2

     

    Practice Programming Assignment · swirl Lesson 4: ConditionalProbability

     

    Practice Programming Assignment · swirl Lesson 5: Expectations

     

    Quiz · Quiz 1

    WEEK 2
    Week 2: Variability, Distribution, & Asymptotics
    We’re going to tackle variability, distributions, limits, and confidence intervals.

     

    Reading · Variability

     

    Video · 05 01 Introduction to variability

     

    Video · 05 02 Variance simulation examples

     

    Video · 05 03 Standard error of the mean

     

    Video · 05 04 Variance data example

     

    Reading · Distributions

     

    Video · 06 01 Binomial distribution

     

    Video · 06 02 Normal distribution

     

    Video · 06 03 Poisson

     

    Reading · Asymptotics

     

    Video · 07 01 Asymptotics and LLN

     

    Video · 07 02 Asymptotics and the CLT

     

    Video · 07 03 Asymptotics and confidence intervals

     

    Reading · Practical R Exercises in swirl Part 2

     

    Practice Programming Assignment · swirl Lesson 1: Variance

     

    Practice Programming Assignment · swirl Lesson 2: CommonDistros

     

    Practice Programming Assignment · swirl Lesson 3: Asymptotics

     

    Quiz · Quiz 2

    WEEK 3
    Week 3: Intervals, Testing, & Pvalues
    We will be taking a look at intervals, testing, and p-values in this lesson.

     

    Reading · Confidence intervals

     

    Video · 08 01 T confidence intervals

     

    Video · 08 02 T confidence intervals example

     

    Video · 08 03 Independent group T intervals

     

    Video · 08 04 A note on unequal variance

     

    Reading · Hypothesis testing

     

    Video · 09 01 Hypothesis testing

     

    Video · 09 02 Example of choosing a rejection region

     

    Video · 09 03 T tests

     

    Video · 09 04 Two group testing

     

    Reading · P-values

     

    Video · 10 01 Pvalues

     

    Video · 10 02 Pvalue further examples

     

    Reading · Knitr

     

    Video · Just enough knitr to do the project

     

    Reading · Practical R Exercises in swirl Part 3

     

    Practice Programming Assignment · swirl Lesson 1: T Confidence Intervals

     

    Practice Programming Assignment · swirl Lesson 2: Hypothesis Testing

     

    Practice Programming Assignment · swirl Lesson 3: P Values

     

    Quiz · Quiz 3

    WEEK 4
    Week 4: Power, Bootstrapping, & Permutation Tests
    We will begin looking into power, bootstrapping, and permutation tests.

     

    Reading · Power

     

    Video · 11 01 Power

     

    Video · 11 02 Calculating Power

     

    Video · 11 03 Notes on power

     

    Video · 11 04 T test power

     

    Video · 12 01 Multiple Comparisons

     

    Reading · Resampling

     

    Video · 13 01 Bootstrapping

     

    Video · 13 02 Bootstrapping example

     

    Video · 13 03 Notes on the bootstrap

     

    Video · 13 04 Permutation tests

     

    Quiz · Quiz 4

     

    Peer Review · Statistical Inference Course Project

     

    Reading · Practical R Exercises in swirl Part 4

     

    Practice Programming Assignment · swirl Lesson 1: Power

     

    Practice Programming Assignment · swirl Lesson 2: Multiple Testing

     

    Practice Programming Assignment · swirl Lesson 3: Resampling

     

    Reading · Post-Course Survey

  7. COURSE 7

    Regression Models

    Subtitles
    English

    About the Course

    Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares, and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well. Analysis of residuals and variability will be investigated. The course will cover modern thinking on model selection and novel uses of regression models including scatterplot smoothing.


    WEEK 1
    Week 1: Least Squares and Linear Regression
    This week, we focus on least squares and linear regression.

     

    Reading · Welcome to Regression Models

     

    Reading · Book: Regression Models for Data Science in R

     

    Reading · Syllabus

     

    Reading · Pre-Course Survey

     

    Reading · Data Science Specialization Community Site

     

    Reading · Where to get more advanced material

     

    Reading · Regression

     

    Video · Introduction to Regression

     

    Video · Introduction: Basic Least Squares

     

    Reading · Technical details

     

    Video · Technical Details (Skip if you’d like)

     

    Video · Introductory Data Example

     

    Reading · Least squares

     

    Video · Notation and Background

     

    Video · Linear Least Squares

     

    Video · Linear Least Squares Coding Example

     

    Video · Technical Details (Skip if you’d like)

     

    Reading · Regression to the mean

     

    Video · Regression to the Mean

     

    Reading · Practical R Exercises in swirl Part 1

     

    Practice Programming Assignment · swirl Lesson 1: Introduction

     

    Practice Programming Assignment · swirl Lesson 2: Residuals

     

    Practice Programming Assignment · swirl Lesson 3: Least Squares Estimation

     

    Quiz · Quiz 1

    WEEK 2
    Week 2: Linear Regression & Multivariable Regression
    This week, we will work through the remainder of linear regression and then turn to the first part of multivariable regression.

     

    Reading · *Statistical* linear regression models

     

    Video · Statistical Linear Regression Models

     

    Video · Interpreting Coefficients

     

    Video · Linear Regression for Prediction

     

    Reading · Residuals

     

    Video · Residuals

     

    Video · Residuals, Coding Example

     

    Video · Residual Variance

     

    Reading · Inference in regression

     

    Video · Inference in Regression

     

    Video · Coding Example

     

    Video · Prediction

     

    Reading · Looking ahead to the project

     

    Video · Really, really quick intro to knitr

     

    Reading · Practical R Exercises in swirl Part 2

     

    Practice Programming Assignment · swirl Lesson 1: Residual Variation

     

    Practice Programming Assignment · swirl Lesson 2: Introduction to Multivariable Regression

     

    Practice Programming Assignment · swirl Lesson 3: MultiVar Examples

     

    Quiz · Quiz 2

    WEEK 3
    Week 3: Multivariable Regression, Residuals, & Diagnostics
    This week, we’ll build on last week’s introduction to multivariable regression with some examples and then cover residuals, diagnostics, variance inflation, and model comparison.

     

    Reading · Multivariable regression

     

    Video · Multivariable Regression part I

     

    Video · Multivariable Regression part II

     

    Video · Multivariable Regression Continued

     

    Video · Multivariable Regression Examples part I

     

    Video · Multivariable Regression Examples part II

     

    Video · Multivariable Regression Examples part III

     

    Video · Multivariable Regression Examples part IV

     

    Reading · Adjustment

     

    Video · Adjustment Examples

     

    Reading · Residuals

     

    Video · Residuals and Diagnostics part I

     

    Video · Residuals and Diagnostics part II

     

    Video · Residuals and Diagnostics part III

     

    Reading · Model selection

     

    Video · Model Selection part I

     

    Video · Model Selection part II

     

    Video · Model Selection part III

     

    Reading · Practical R Exercises in swirl Part 3

     

    Practice Programming Assignment · swirl Lesson 1: MultiVar Examples2

     

    Practice Programming Assignment · swirl Lesson 2: MultiVar Examples3

     

    Practice Programming Assignment · swirl Lesson 3: Residuals Diagnostics and Variation

     

    Quiz · Quiz 3

     

    Practice Quiz · (OPTIONAL) Data analysis practice with immediate feedback (NEW! 10/18/2017)

    WEEK 4
    Week 4: Logistic Regression and Poisson Regression
    This week, we will work on generalized linear models, including binary outcomes and Poisson regression.

     

    Reading · GLMs

     

    Video · GLMs

     

    Reading · Logistic regression

     

    Video · Logistic Regression part I

     

    Video · Logistic Regression part II

     

    Video · Logistic Regression part III

     

    Reading · Count Data

     

    Video · Poisson Regression part I

     

    Video · Poisson Regression part II

     

    Reading · Mishmash

     

    Video · Hodgepodge

     

    Reading · Practical R Exercises in swirl Part 4

     

    Practice Programming Assignment · swirl Lesson 1: Variance Inflation Factors

     

    Practice Programming Assignment · swirl Lesson 2: Overfitting and Underfitting

     

    Practice Programming Assignment · swirl Lesson 3: Binary Outcomes

     

    Practice Programming Assignment · swirl Lesson 4: Count Outcomes

     

    Quiz · Quiz 4

     

    Peer Review · Regression Models Course Project

     

    Reading · Post-Course Survey

  8. COURSE 8

    Practical Machine Learning

    Subtitles
    English

    About the Course

    Prediction and machine learning are among the most common tasks performed by data scientists and data analysts. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and test sets, overfitting, and error rates. The course will also introduce a range of model-based and algorithmic machine learning methods including regression, classification trees, Naive Bayes, and random forests. The course will cover the complete process of building prediction functions including data collection, feature creation, algorithms, and evaluation.


    WEEK 1
    Week 1: Prediction, Errors, and Cross Validation
    This week will cover prediction, relative importance of steps, errors, and cross validation.

     

    Reading · Welcome to Practical Machine Learning

     

    Reading · Syllabus

     

    Reading · Pre-Course Survey

     

    Video · Prediction motivation

     

    Video · What is prediction?

     

    Video · Relative importance of steps

     

    Video · In and out of sample errors

     

    Video · Prediction study design

     

    Video · Types of errors

     

    Video · Receiver Operating Characteristic

     

    Video · Cross validation

     

    Video · What data should you use?

     

    Quiz · Quiz 1

    WEEK 2
    Week 2: The Caret Package
    This week will introduce the caret package, along with tools for creating features and preprocessing.

     

    Video · Caret package

     

    Video · Data slicing

     

    Video · Training options

     

    Video · Plotting predictors

     

    Video · Basic preprocessing

     

    Video · Covariate creation

     

    Video · Preprocessing with principal components analysis

     

    Video · Predicting with Regression

     

    Video · Predicting with Regression Multiple Covariates

     

    Quiz · Quiz 2

    WEEK 3
    Week 3: Predicting with trees, Random Forests, & Model Based Predictions
    This week we introduce a number of machine learning algorithms you can use to complete your course project.

     

    Video · Predicting with trees

     

    Video · Bagging

     

    Video · Random Forests

     

    Video · Boosting

     

    Video · Model Based Prediction

     

    Quiz · Quiz 3

    WEEK 4
    Week 4: Regularized Regression and Combining Predictors
    This week, we will cover regularized regression and combining predictors.

     

    Video · Regularized regression

     

    Video · Combining predictors

     

    Video · Forecasting

     

    Video · Unsupervised Prediction

     

    Quiz · Quiz 4

     

    Reading · Course Project Instructions (READ FIRST)

     

    Peer Review · Prediction Assignment Writeup

     

    Quiz · Course Project Prediction Quiz

     

    Reading · Post-Course Survey

  9. COURSE 9

    Developing Data Products

    Subtitles
    English

    About the Course

    A data product is the production output from a statistical analysis. Data products automate complex analysis tasks or use technology to expand the utility of a data-informed model, algorithm, or inference. This course covers the basics of creating data products using Shiny, R packages, and interactive graphics. The course will focus on the statistical fundamentals of creating a data product that can be used to tell a story about data to a mass audience.


    WEEK 1
    Course Overview
    In this overview module, we’ll go over some information and resources to help you get started and succeed in the course.

     

    Video · Welcome to Developing Data Products

     

    Reading · Syllabus

     

    Reading · Welcome

     

    Reading · Book: Developing Data Products in R

     

    Reading · Community Site

     

    Reading · R and RStudio Links & Tutorials

    Shiny, GoogleVis, and Plotly
    Now we can turn to the first substantive lessons. In this module, you’ll learn how to develop basic applications and interactive graphics in Shiny, compose interactive HTML graphics with GoogleVis, and prepare data visualizations with Plotly.

     

    Reading · Shiny

     

    Reading · Shinyapps.io Project

     

    Video · Shiny 1.1

     

    Video · Shiny 1.2

     

    Video · Shiny 1.3

     

    Video · Shiny 1.4

     

    Video · Shiny 1.5

     

    Video · Shiny 2.1

     

    Video · Shiny 2.2

     

    Video · Shiny 2.3

     

    Video · Shiny 2.4

     

    Video · Shiny 2.5

     

    Video · Shiny 2.6

     

    Video · Shiny Gadgets 1.1

     

    Video · Shiny Gadgets 1.2

     

    Video · Shiny Gadgets 1.3

     

    Video · GoogleVis 1.1

     

    Video · GoogleVis 1.2

     

    Video · Plotly 1.1

     

    Video · Plotly 1.2

     

    Video · Plotly 1.3

     

    Video · Plotly 1.4

     

    Video · Plotly 1.5

     

    Video · Plotly 1.6

     

    Video · Plotly 1.7

     

    Video · Plotly 1.8

     

    Quiz · Quiz 1

    WEEK 2
    R Markdown and Leaflet
    During this module, we’ll learn how to create R Markdown files and embed R code in an Rmd. We’ll also explore Leaflet and use it to create interactive annotated maps.

     

    Video · R Markdown 1.1

     

    Video · R Markdown 1.2

     

    Video · R Markdown 1.3

     

    Video · R Markdown 1.4

     

    Video · R Markdown 1.5

     

    Video · R Markdown 1.6

     

    Reading · Three Ways to Share R Markdown Products

     

    Video · Leaflet 1.1

     

    Video · Leaflet 1.2

     

    Video · Leaflet 1.3

     

    Video · Leaflet 1.4

     

    Video · Leaflet 1.5

     

    Video · Leaflet 1.6

     

    Quiz · Quiz 2

     

    Peer Review · R Markdown and Leaflet

    WEEK 3
    R Packages
    In this module, we’ll dive into the world of creating R packages and practice developing an R Markdown presentation that includes a data visualization built using Plotly.

     

    Reading · R Packages

     

    Video · R Packages (Part 1)

     

    Video · R Packages (Part 2)

     

    Video · Building R Packages Demo

     

    Video · R Classes and Methods (Part 1)

     

    Video · R Classes and Methods (Part 2)

     

    Quiz · Quiz 3

     

    Peer Review · R Markdown Presentation & Plotly

    WEEK 4
    Swirl and Course Project
    Week 4 is all about the Course Project, producing a Shiny Application and reproducible pitch.

     

    Video · Swirl 1.1

     

    Video · Swirl 1.2

     

    Video · Swirl 1.3

     

    Peer Review · Course Project: Shiny Application and Reproducible Pitch

     

    Reading · Post-Course Survey

  10. COURSE 10

    Data Science Capstone

    Commitment
    4-9 hours/week
    Subtitles
    English

    About the Capstone Project

    The capstone project class will allow students to create a usable, public data product that they can use to show their skills to potential employers. Projects will be drawn from real-world problems and will be conducted with industry, government, and academic partners.


    WEEK 1
    Overview, Understanding the Problem, and Getting the Data
    This week, we introduce the project so you can get a clear grip on the problem at hand and begin working with the dataset.

     

    Video · Welcome to the Capstone Project

     

    Reading · Project Overview

     

    Video · Welcome from SwiftKey

     

    Video · You Are a Data Scientist Now

     

    Reading · Syllabus

     

    Video · Introduction to Task 0: Understanding the Problem

     

    Reading · Task 0 – Understanding the problem

     

    Reading · About the Corpora

     

    Video · Introduction to Task 1: Getting and Cleaning the Data

     

    Reading · Task 1 – Getting and cleaning the data

     

    Video · Regular Expressions: Part 1 (Optional)

     

    Video · Regular Expressions: Part 2 (Optional)

     

    Quiz · Quiz 1: Getting Started

    WEEK 2
    Exploratory Data Analysis and Modeling
    This week, we move on to the next tasks, exploratory data analysis and modeling. You’ll also submit your milestone report and review submissions from your classmates.

     

    Video · Introduction to Task 2: Exploratory Data Analysis

     

    Reading · Task 2 – Exploratory Data Analysis

     

    Video · Introduction to Task 3: Modeling

     

    Reading · Task 3 – Modeling

     

    Peer Review · Milestone Report

    WEEK 3
    Prediction Model
    This week, you’ll build and evaluate your prediction model. The goal is to make your model efficient and accurate.

     

    Video · Introduction to Task 4: Prediction Model

     

    Reading · Task 4 – Prediction Model

     

    Quiz · Quiz 2: Natural language processing I

    WEEK 4
    Creative Exploration
    This week’s goal is to improve the predictive accuracy while reducing computational runtime and model complexity.

     

    Video · Introduction to Task 5: Creative Exploration

     

    Reading · Task 5 – Creative Exploration

     

    Quiz · Quiz 3: Natural language processing II

    WEEK 5
    Data Product
    This week, you’ll work on developing the first component of your final project, your data product.

     

    Video · Introduction to Task 6: Data Product

     

    Reading · Task 6 – Data Product

    WEEK 6
    Slide Deck
    This week, you’ll work on developing the second component of your final project, a slide deck to accompany your data product.

     

    Video · Introduction to Task 7: Slide Deck

     

    Reading · Task 7 – Slide Deck

    WEEK 7
    Final Project Submission and Evaluation
    This week, you’ll submit your final project and review the work of your classmates.

     

    Peer Review · Final Project Submission

     

    Video · Congratulations!

 

Creators

Johns Hopkins University is recognized as a destination for excellent, ambitious scholars and a world leader in teaching and research. The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world.


 
