Practical Data Science: Reducing High Dimensional Data in R

In this R course, we'll see how PCA can reduce a 5000+ variable data set down to 10 variables and barely lose accuracy!

Ratings: 4.39 / 5.00

Description

In this R course, we'll see how Principal Component Analysis (PCA) can reduce a data set with more than 5,000 variables down to 10 variables and barely lose accuracy! We'll look at different ways of measuring PCA's effectiveness, along with other ways of reducing wide data sets (those with lots of features/variables). We'll also look at the advantages and disadvantages of each way of reducing data.
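
To give a feel for the idea, here is a minimal sketch of PCA in base R using `prcomp`. It is not the course's own code or data: the simulated matrix `x`, its size (200 rows, 50 columns), and the choice of three hidden factors are all assumptions made for illustration.

```r
# Simulated wide data (hypothetical, not the course data set):
# 200 rows, 50 correlated features driven by 3 hidden factors plus noise.
set.seed(42)
n <- 200; p <- 50; k <- 3
latent   <- matrix(rnorm(n * k), nrow = n)
loadings <- matrix(rnorm(k * p), nrow = k)
x <- latent %*% loadings + matrix(rnorm(n * p, sd = 0.1), nrow = n)

# Center and scale each column, then run PCA.
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Share of total variance captured by each component, and the running total.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Keep the first 10 component scores as the reduced data set.
n_keep <- 10
x_reduced <- pca$x[, 1:n_keep]

dim(x_reduced)            # rows and columns of the reduced data
round(cum_var[n_keep], 3) # variance retained by the 10 kept components
```

The cumulative variance in `cum_var` is one simple way to judge how much information survives the reduction; how many components to keep is a tuning choice, not a fixed rule.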

What You Will Learn!

  • Understand various ways of reducing wide data sets
  • Understand Principal Component Analysis (PCA)
  • Control, tune and measure the effects of PCA
  • Use GBM modeling to measure the effectiveness of PCA
  • Reduce dimensionality with classic GBM and GLMNET variable selection
  • Use ensembling techniques to find the most stable variables
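
One way to measure the effect of PCA is to fit the same model on the full feature set and on the reduced one, then compare a hold-out score. The sketch below does that in base R only. It makes several assumptions: the data are simulated, a logistic regression (`glm`) stands in for the GBM model the course uses, AUC is the chosen metric, and the helper names `auc` and `fit_and_score` are hypothetical.

```r
# Simulated data (hypothetical): 600 rows, 40 features, binary outcome.
set.seed(1)
n <- 600; p <- 40; k <- 3
latent <- matrix(rnorm(n * k), nrow = n)
x <- latent %*% matrix(rnorm(k * p), nrow = k) +
  matrix(rnorm(n * p, sd = 0.5), nrow = n)
colnames(x) <- paste0("v", seq_len(p))
y <- rbinom(n, 1, plogis(2 * latent[, 1] - latent[, 2]))

# First 400 rows for training, the rest held out for scoring.
train <- seq_len(n) <= 400

# AUC from the rank-sum (Mann-Whitney) formula.
auc <- function(score, label) {
  r  <- rank(score)
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Fit a logistic regression on the training rows and score the hold-out rows.
# (Stand-in for GBM, chosen so the sketch needs no extra packages.)
fit_and_score <- function(features) {
  df   <- data.frame(y = y, features)
  fit  <- glm(y ~ ., data = df[train, ], family = binomial)
  pred <- predict(fit, newdata = df[!train, ], type = "response")
  auc(pred, y[!train])
}

# Learn the PCA rotation from the training rows only, then project all rows.
pca    <- prcomp(x[train, ], center = TRUE, scale. = TRUE)
scores <- predict(pca, newdata = x)[, 1:5]

auc_full <- fit_and_score(x)       # all 40 original features
auc_pca  <- fit_and_score(scores)  # first 5 principal components
c(full = auc_full, pca = auc_pca)
```

Fitting the PCA on the training rows only keeps the hold-out rows from influencing the rotation, so the comparison between the two scores stays fair.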

Who Should Attend!

  • Anyone with some understanding of, and interest in, the R programming language
  • Anyone interested in reducing large data sets