Process Big Data using Apache PIG

Learn to analyze and process big data using Apache Pig

Ratings: 3.17 / 5.00




Description

Pig is a high-level platform for creating MapReduce programs that run on Hadoop. The language for this platform is called Pig Latin. In this course we will go through the Pig data flow platform and the Pig Latin language. The concepts covered in this course are:

Writing complex MapReduce transformations using a simple scripting language.

Basics of Big Data, Hadoop and the MapReduce framework.

The Pig data model and the different types of operators that work on datasets.

Built-in functions as well as User Defined Functions for performing specific tasks.

Running Pig scripts, unit testing and compression.

Many more advanced topics such as embedding Pig in Java and Pig macros.

All the books and PDFs are included, allowing you to follow along with the author throughout the modules in this course.

What You Will Learn!

  • Overview of Big Data and the Hadoop framework
  • Anatomy of the MapReduce framework
  • Basics of the Apache Pig tool, and when to use it and when not to
  • Run Pig in different modes
  • Use Pig Latin Queries
  • Different types of Pig operators for analysing the data
  • Understand the architecture of the Pig tool
  • Work with the Pig data model
  • Different kinds of built-in functions
  • Advanced Pig concepts such as Pig Streaming, Pig scripts and User Defined Functions (UDFs)
  • Compress the input files, final output files and intermediate output files
  • Pig unit testing, Pig macros and parameter substitution
  • How to embed Pig in Java

Who Should Attend!

  • Students interested in the Big Data and Hadoop field
  • Database Developers and Administrators
  • Software developers who want to build a career in the Big Data field
  • Data Analysts
  • Data Scientists and Researchers