Applied Statistics for Data Science: A Hands-On Approach!

Build An Intuitive Understanding Using Python code: Histograms, CLT, Testing, Distributions, Correlation and much more!

Ratings: 4.66 / 5.00




Description

Welcome to the course on Statistics For Data Scientists!


  • Learn about the key concepts in statistics, and how to apply them to your data analysis.

  • A highly practical and hands-on approach.

  • A focus on building an intuitive understanding of each topic.

  • Learn to use Python code to simulate various scenarios in a plug-and-play manner.


What is included in the course:

  • Detailed Course Notes (100-page textbook with 50+ illustrative figures)

  • Deck of 360 slides

  • 10+ hours of lecture content spread over 40+ videos

  • All of the code in Jupyter Notebooks (7 notebooks, 2000+ lines of code)

  • Bonus Chapter: Introduction to Machine Learning


Topics that the course covers:

  1. The Histogram

  2. Generating artificial Data sets

  3. The central tenet of Statistics

  4. The Central Limit Theorem

  5. Distribution functions

    1. Percentiles

    2. Data Ranges

    3. Cumulative Distribution Function

    4. Different Distribution types:

      1. Normal Distribution

      2. Uniform Distribution

      3. Exponential Distribution

      4. Poisson Distribution

      5. Bernoulli Distribution

      6. Rayleigh Distribution

  6. Statistical Testing

    1. Reasoning behind statistical testing

    2. P-value

    3. Statistical Significance

    4. Different Statistical Tests:

      1. Shapiro-Wilk test

      2. Levene's test

      3. Student's t-test / Welch's t-test

      4. ANOVA test

      5. Kolmogorov-Smirnov test

      6. Non-parametric tests

    5. Two real-life examples

      1. Detect a biased coin with 95% certainty

      2. Real-life A/B testing

  7. Correlation

    1. Linear correlation - Pearson correlation coefficient + alternatives

    2. Categorical correlation - Chi-Squared test + contingency tables

  8. EXTRA: Regression and intro to Machine Learning

    1. Linear Regression

    2. Logistic Regression + ML pipeline
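
To give a flavor of the "biased coin" example in the outline, here is a minimal sketch of an exact two-sided binomial test using only the Python standard library. It is not taken from the course notebooks: the function names are hypothetical, and reading "95% certainty" as a significance level of 0.05 is an assumption made for illustration.

```python
from math import comb


def binomial_two_sided_p(heads: int, flips: int, p: float = 0.5) -> float:
    """Exact two-sided p-value for observing `heads` in `flips` tosses.

    Sums the probability of every outcome that is no more likely than
    the observed one under the null hypothesis P(heads) = p.
    """
    pmf = [comb(flips, k) * p**k * (1 - p) ** (flips - k) for k in range(flips + 1)]
    observed = pmf[heads]
    # The small relative tolerance guards against floating-point ties.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


def looks_biased(heads: int, flips: int, alpha: float = 0.05) -> bool:
    """Reject the fair-coin hypothesis at significance level alpha (assumed 0.05)."""
    return binomial_two_sided_p(heads, flips) < alpha


if __name__ == "__main__":
    for heads in (55, 60, 61, 70):
        p_value = binomial_two_sided_p(heads, 100)
        print(f"{heads}/100 heads: p = {p_value:.4f}, biased? {looks_biased(heads, 100)}")
```

With 100 flips, 60 heads falls just short of significance at the 0.05 level, while 61 heads crosses it. This shows how sharply the decision depends on the chosen threshold.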


Who is this course for:

  • Students on a data science track, or any other technical field.

  • Professionals that want to pivot into a data science career.

  • Managers who want to be able to make data-driven decisions.

  • Practicing Data Scientists who want to add this valuable skill to their tool belt.

What You Will Learn!

  • Perform elaborate and involved Data Analysis on any dataset.
  • Build an intuitive understanding of concepts in Statistics: Sample, Population, Correlation, P-value, Significance, and others.
  • Be able to write Python code that generates elaborate and beautiful Visuals.
  • Make Simulations using Python code that showcase various Statistical Concepts.
  • Be able to perform various Statistical Tests using Python (Student's t-test, Welch's t-test, Levene's test, Shapiro-Wilk test, ...)
  • Be able to build a Machine Learning model to predict outcomes based on linear and logistic regression.
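
As an illustration of the kind of simulation listed above, the following sketch demonstrates the Central Limit Theorem with the standard library only. It is an assumed example, not code from the course: the function name, the choice of an exponential distribution, and the sample sizes are all hypothetical.

```python
import random
import statistics


def sample_means(n_samples: int, sample_size: int, seed: int = 0) -> list[float]:
    """Draw `n_samples` samples from an exponential distribution (mean 1)
    and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


# The exponential distribution is strongly skewed, yet the means of
# samples of size 50 cluster around 1 with spread close to 1 / sqrt(50).
means = sample_means(n_samples=5000, sample_size=50)
print(f"mean of sample means: {statistics.fmean(means):.3f}")
print(f"std of sample means:  {statistics.stdev(means):.3f}")
print(f"theory, 1/sqrt(50):   {1 / 50 ** 0.5:.3f}")
```

Plotting a histogram of `means` would show the familiar bell shape, even though the underlying data are far from normal.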

Who Should Attend!

  • Students on a Data Science track or other technical field.
  • Professionals that want to pivot towards a data science career.
  • Active Data Scientists that want to add statistical knowledge and intuition to their tool belt.
  • Managers in technical fields who want to sharpen their skills and make better data-driven decisions.