Data Science in Python: Classification Modeling

Learn Python for Data Science & Supervised Machine Learning, and build classification models with fun, hands-on projects

Ratings: 4.62 / 5.00




Description

This is a hands-on, project-based course designed to help you master the foundations of classification modeling in Python.


We’ll start by reviewing the data science workflow, discussing the primary goals & types of classification algorithms, and doing a deep dive into the classification modeling steps we’ll be using throughout the course.


You’ll learn to perform exploratory data analysis, leverage feature engineering techniques like scaling, dummy variables, and binning, and prepare data for modeling by splitting it into train, test, and validation datasets.
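To make the data prep steps above concrete, here is a minimal, dependency-free sketch of splitting, scaling, and dummy encoding. The function names (train_val_test_split, fit_min_max, dummy_encode) and the account balance data are hypothetical and chosen for illustration only. In practice you would typically reach for library helpers rather than hand-rolled versions like these.

```python
import random

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=42):
    """Shuffle rows, then split them into train, validation, and test lists."""
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

def fit_min_max(values):
    """Learn min-max scaling from training values; return a scaling function."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0                # avoid dividing by zero on constant columns
    return lambda v: (v - lo) / span

def dummy_encode(value, categories):
    """Turn one categorical value into a list of 0/1 dummy variables."""
    return [1 if value == c else 0 for c in categories]

# Hypothetical data: 100 made-up account balances
balances = list(range(0, 1000, 10))
train, val, test = train_val_test_split(balances)

scale = fit_min_max(train)                 # fit on the training set only
train_scaled = [scale(v) for v in train]
val_scaled = [scale(v) for v in val]       # reuse the training min and max

encoded = dummy_encode("rent", ["own", "rent", "mortgage"])
```

Note that the scaler is fit on the training set and then reused for validation data, so scaled validation values can fall slightly outside the 0 to 1 range. That is expected, and it keeps information from the held-out data from leaking into the model.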


From there, we’ll fit K-Nearest Neighbors & Logistic Regression models, build an intuition for interpreting logistic regression coefficients, and evaluate model performance using tools like confusion matrices and metrics like accuracy, precision, and recall. We’ll also cover techniques for modeling imbalanced data, including threshold tuning, sampling methods like oversampling & SMOTE, and adjusting class weights in the model cost function.
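As a rough illustration of the metrics and threshold tuning mentioned above, the sketch below computes accuracy, precision, and recall from confusion matrix counts, then compares two decision thresholds. The labels and predicted probabilities are made up, and the helper names (confusion_counts, metrics, predict_at) are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

def predict_at(probs, threshold):
    """Label a row positive when its predicted probability meets the threshold."""
    return [1 if p >= threshold else 0 for p in probs]

# Made-up imbalanced example: 7 negatives, 3 positives
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
probs = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.60, 0.40, 0.55, 0.90]

default = metrics(y_true, predict_at(probs, 0.5))   # the usual 0.5 cutoff
lowered = metrics(y_true, predict_at(probs, 0.4))   # a more lenient cutoff
```

On this toy data, lowering the threshold from 0.5 to 0.4 catches every positive case at the cost of more false positives, while accuracy stays the same. That trade-off is why accuracy alone can be misleading on imbalanced data.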


Throughout the course, you'll play the role of Data Scientist for the risk management department at Maven National Bank. Applying the skills you learn along the way, you'll use Python to explore the bank's data and build classification models that predict whether customers have high, medium, or low credit risk based on their profiles.


Last but not least, you'll learn to build and evaluate decision tree models for classification. You’ll fit, visualize, and fine-tune these models using Python, then apply your knowledge to more advanced ensemble models like random forests and gradient boosted machines.
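For a feel of what a decision tree does under the hood, here is a small sketch of how a single split might be chosen using Gini impurity, one common splitting criterion. The function names (gini, best_split) and the income data are hypothetical, and real tree implementations handle many features, stopping rules, and edge cases that this sketch ignores.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for more mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split."""
    best = (None, gini(labels))            # baseline: no split at all
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between neighboring distinct values
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Made-up customer incomes (in thousands) and credit risk labels
incomes = [20, 25, 30, 60, 70, 80]
risk = ["high", "high", "high", "low", "low", "low"]

threshold, impurity = best_split(incomes, risk)
```

A full tree repeats this search on each resulting group, and ensemble models like random forests combine many such trees.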


COURSE OUTLINE:


  • Intro to Data Science

    • Introduce the fields of data science and machine learning, review essential skills, and introduce each phase of the data science workflow


  • Classification 101

    • Review the basics of classification, including key terms, the types and goals of classification modeling, and the modeling workflow


  • Pre-Modeling Data Prep & EDA

    • Recap the data prep & EDA steps required to perform modeling, including key techniques to explore the target, features, and their relationships


  • K-Nearest Neighbors

    • Learn how the k-nearest neighbors (KNN) algorithm classifies data points and practice building KNN models in Python


  • Logistic Regression

    • Introduce logistic regression, learn the math behind the model, and practice fitting models and tuning regularization strength


  • Classification Metrics

    • Learn how and when to use several important metrics for evaluating classification models, such as precision, recall, F1 score, and ROC-AUC


  • Imbalanced Data

    • Understand the challenges of modeling imbalanced data and learn strategies for improving model performance in these scenarios


  • Decision Trees

    • Build and evaluate decision tree models, algorithms that look for the splits in your data that best separate your classes


  • Ensemble Models

    • Get familiar with the basics of ensemble models, then dive into specific models like random forests and gradient boosted machines
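
As a taste of the K-Nearest Neighbors section in the outline above, here is a minimal sketch of how KNN might classify a new data point: find the k closest training points and take a majority vote. The function name knn_predict and the toy data are hypothetical, and this sketch assumes the features are already on comparable scales.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, point, k=3):
    """Classify a point by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the new point with its label
    distances = sorted(
        (math.dist(x, point), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up, pre-scaled features for six customers and their risk labels
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
train_y = ["low", "low", "low", "high", "high", "high"]

near_low_cluster = knn_predict(train_X, train_y, (2.0, 2.0))
near_high_cluster = knn_predict(train_X, train_y, (6.0, 7.0))
```

Because KNN relies on distances, a feature with a much larger range can dominate the vote, which is one reason scaling matters before fitting this kind of model.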


__________


Ready to dive in? Join today and get immediate, LIFETIME access to the following:


  • 9.5 hours of high-quality video

  • 18 homework assignments

  • 9 quizzes

  • 2 projects

  • Data Science in Python: Classification ebook (250+ pages)

  • Downloadable project files & solutions

  • Expert support and Q&A forum

  • 30-day Udemy satisfaction guarantee


If you're an aspiring data scientist looking for an introduction to the world of classification modeling with Python, this is the course for you.


Happy learning!

-Chris Bruehl (Data Science Expert & Lead Python Instructor, Maven Analytics)

What You Will Learn!

  • Master the foundations of supervised Machine Learning & classification modeling in Python
  • Perform exploratory data analysis on model features and targets
  • Apply feature engineering techniques and split the data into training, test, and validation sets
  • Build and interpret k-nearest neighbors and logistic regression models using scikit-learn
  • Evaluate model performance using tools like confusion matrices and metrics like accuracy, precision, recall, and F1
  • Learn techniques for modeling imbalanced data, including threshold tuning, sampling methods, and adjusting class weights
  • Build, tune, and evaluate decision tree models for classification, including advanced ensemble models like random forests and gradient boosted machines

Who Should Attend!

  • Data scientists who want to learn how to build and apply supervised learning models in Python
  • Analysts or BI experts looking to learn about classification modeling or transition into a data science role
  • Anyone interested in learning Python, one of the most popular open-source programming languages in the world