Udacity UD359: Intro to Data Science

Intro to Data Science
Instructor: Dave Holtz, Udacity
Period: from February 2014
Status: completed, without certificate

Note: I actually worked through the whole course, but by then Udacity only offered the exam for a fee. The course is not really bad, but it still has real weaknesses in many places (and even I noticed them as a beginner). My impression is that the course came about because Dave said "OK, I'll do it" and then read up on as much as he could in four weeks.


Course Syllabus

Lesson 1: Introduction to Data Science

Introduction to Data Science
What is a Data Scientist
Pi-Chuan (Data Scientist at Google): What is Data Science?
Gabor (Data Scientist at Twitter): What is Data Science?
Problems Solved by Data Science
Pandas
Dataframes
Create a New Dataframe

Lesson 2: Data Wrangling

What is Data Wrangling?
Acquiring Data
Common Data Formats
What are Relational Databases?
Aadhaar Data
Aadhaar Data and Relational Databases
Introduction to Database Schemas
APIs
Data in JSON Format
How to Access an API Efficiently
Missing Values
Easy Imputation
Impute using Linear Regression
Tip of the Imputation Iceberg

Lesson 3: Data Analysis

Statistical Rigor
Kurt (Data Scientist at Twitter) – Why is Stats Useful?
Introduction to Normal Distribution
T Test
Welch T Test
Non-Parametric Tests
Non-Normal Data
Stats vs. Machine Learning
Different Types of Machine Learning
Prediction with Regression
Cost Function
How to Minimize Cost Function
Coefficient of Determination

Lesson 4: Data Visualization

Effective Information Visualization
Napoleon’s March on Russia
Don (Principal Data Scientist at AT&T): Communicating Findings
Rishiraj (Principal Data Scientist at AT&T): Communicating Findings Well
Visual Encodings
Perception of Visual Cues
Plotting in Python
Data Scales
Visualizing Time Series Data

Lesson 5: MapReduce

Big Data and MapReduce
Basics of MapReduce
Mapper
Reducer
MapReduce with Aadhaar Data
MapReduce Ecosystem
Joshua (Data Scientist at Twitter): MapReduce Tools, Pig
MapReduce with Subway Data