Mastering Apache Airflow! Deploy to Kubernetes in AWS

Learn to programmatically author, schedule and monitor workflows with Apache Airflow. Deploy to Kubernetes in AWS.

Ratings: 3.53 / 5.00




Description

Apache Airflow is an open-source platform for programmatically authoring, scheduling and monitoring workflows. This course starts with the basic concepts of Apache Airflow: first the main components, the web server and the scheduler, and then the internal building blocks such as DAG, Plugin, Operator, Sensor, Hook, XCom, Variable and Connection.

Later in the course I will teach you more advanced topics such as branching, metrics, performance and log monitoring, and Airflow's REST API. I will also help you set up your development environment with a single command using Docker and Docker Compose.

Why stop here? After all this, we will create a Kubernetes cluster in AWS and deploy our application to it!

Finally, I will share some advanced tips that will help you turn your simple Airflow project into a production-ready system.

What You Will Learn!

  • Advanced tips for production
  • Create your first pipeline
  • Create ETL pipeline using Pandas
  • Build Docker image for Apache Airflow
  • Create Helm chart for Apache Airflow
  • Deploy Airflow to Kubernetes in AWS
  • Basic Airflow components - DAG, Plugin, Operator, Sensor, Hook, XCom, Variable and Connection
  • Advanced topics - branching, metrics, performance and log monitoring
  • Run development environment with one command through Docker Compose
  • Run development environment with one command through Helm and Kubernetes
  • The difference between Sequential, Local, Celery and Kubernetes Executors
  • Understand Apache Airflow's configuration properties
  • Investigate Apache Airflow's REST API
  • Explore Apache Airflow's web interface

Who Should Attend!

  • Software Engineers curious about Apache Airflow
  • Software Engineers looking to automate repetitive tasks
  • Data Engineers looking to improve their Data Platforms