Process Big Data Using Apache Pig

Learn to analyze and process big data using Apache Pig

Ratings: 3.17 / 5.00

Description

Pig is a high-level platform for creating MapReduce programs that run on Hadoop. The language for this platform is called Pig Latin. In this course we will go through the Pig data flow platform and the Pig Latin language. The course covers the following concepts:

Writing complex MapReduce transformations using a simple scripting language.

Basics of Big Data, Hadoop and the MapReduce framework.

The Pig data model and the different types of operators for working with datasets.

Built-in functions as well as user-defined functions (UDFs) for performing specific tasks.

Running Pig scripts, unit testing and compression.

Many more advanced topics, such as embedding Pig in Java and Pig macros.
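
To give a feel for what a "MapReduce transformation" involves, here is a minimal sketch of a word count split into map, shuffle and reduce steps. It is plain Python that runs in a single process, not Hadoop or Pig code; the function names and the sample input are made up for illustration. The comment at the top shows roughly how the same job is commonly written in Pig Latin, with a placeholder input path.

```python
# Illustrative only: a single-process imitation of the three MapReduce phases.
# A commonly shown Pig Latin version of the same job looks roughly like this
# ('input.txt' is a placeholder path):
#
#   lines   = LOAD 'input.txt' AS (line:chararray);
#   words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
#   grouped = GROUP words BY word;
#   counts  = FOREACH grouped GENERATE group, COUNT(words);

from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Chain the three phases into one word-count job."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["pig latin runs on hadoop", "pig compiles to mapreduce"]
    print(word_count(sample))
```

The four Pig Latin statements in the comment stand in for all three Python functions, which is the kind of shortcut the first item in the list above refers to.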

All the books and PDFs are included, allowing you to follow along with the author throughout the modules in this course.

What You Will Learn!

Overview of Big Data and the Hadoop framework
Anatomy of the MapReduce framework
Basics of the Apache Pig tool and when it should or should not be used
Run Pig in different modes
Use Pig Latin queries
Different types of Pig operators for analyzing data
Understand the architecture of the Pig tool
Work with the Pig data model
Different kinds of built-in functions
Advanced Pig concepts such as Pig streaming, Pig scripts and user-defined functions (UDFs)
Compress input files, final output files and intermediate output files
Pig unit testing, Pig macros and parameter substitution
How to embed Pig in Java

Who Should Attend!

Students interested in the Big Data and Hadoop field
Database Developers and Administrators
Software developers who want to build a career in the Big Data field
Data Analysts
Data Scientists and Researchers