PySpark End-to-End Developer Course (Spark with Python)

Learn PySpark's features and functionality end to end. The course also includes a Python course and an HDFS commands course.

Ratings: 4.08 / 5.00




Description

Introduction to Spark

HDFS Commands

Python Course

Why Spark Was Developed

What Spark Is and Its Features

Spark Main Components

Introduction to SparkSession

RDD Fundamentals

What Is an RDD

RDD Properties

When to Use RDDs

RDD Problems

Create RDD

Different Ways to Create RDDs

RDD Operations

Transformations - Low Level

Transformations - Join Types

Actions - Total Aggregations

Shuffle and Combiner

Transformations - Key Aggregations

Transformations - Sorting

Transformations - Ranking

Transformations - Set

Transformations - Sampling

Transformations - Partition

Transformations - Repartition

Transformations - Repartition and Sort

Transformations - Coalesce

Transformations - Repartition vs. Coalesce

Extraction

Spark Cluster Execution Architecture - Full Architecture

Spark Cluster Execution Architecture - YARN as Spark Cluster Manager

Spark Cluster Execution Architecture - JVMs Across Clusters

Spark Cluster Execution Architecture - Commonly Used Terms in Execution Framework

Spark Cluster Execution Architecture - Narrow and Wide Transformations

Spark Cluster Execution Architecture - DAG Scheduler

Spark Cluster Execution Architecture - Task Scheduler

RDD Persistence

Spark Shared Variables

SparkSQL Architecture

Detailed SparkSession Features

DataFrame Fundamentals

Datatypes

DataFrame Rows

DataFrame Columns

DataFrame ETL

DataFrame ETL - Introduction to Transformations and Extraction

DataFrame ETL - DataFrame APIs Introduction and Extraction

DataFrame ETL - DataFrame APIs Selection

DataFrame ETL - DataFrame APIs Filter or Where

DataFrame ETL - DataFrame APIs Sorting

DataFrame ETL - DataFrame APIs Set

DataFrame ETL - DataFrame APIs Join

DataFrame ETL - DataFrame APIs Aggregations

DataFrame ETL - DataFrame APIs GroupBy

DataFrame ETL - DataFrame APIs Windows

DataFrame ETL - DataFrame Built-in Functions Introduction

Performance and Optimization










What You Will Learn!

  • PySpark development features and functionality, end to end
  • Spark Cluster Execution Architecture
  • Spark SQL Architecture
  • Spark Performance and Optimization
  • Python Course
  • HDFS Commands Course

Who Should Attend!

  • Data Engineers
  • Data Scientists
  • Data Analysts
  • Database Developers