
Course Detail

16 Weeks

Hadoop and Big Data

Course Information

Description


"Introduction to Big Data and Hadoop" is intended for individuals who want to understand the basic concepts of Big Data and Hadoop. It also provides an overview of the commercial distributions of Hadoop and of the components of the Hadoop ecosystem.

Prerequisites

Basic knowledge of Java, Python, Scala, Linux, and SQL

Course Objectives:

Upon completion of this course, students will be able to do the following:

• Optimize business decisions and create competitive advantage with Big Data analytics.
• Apply the Java concepts required for developing MapReduce programs.
• Derive business benefit from unstructured data.
• Explain the architectural concepts of Hadoop and the MapReduce paradigm.
• Use the programming tools Pig and Hive in the Hadoop ecosystem.

Course Outcomes:

After completing this course, the candidate should be able to:

• Prepare for data summarization, query, and analysis.
• Apply data modelling techniques to large data sets.
• Create applications for Big Data analytics.
• Build a complete business data analytics solution.

Syllabus

Unit 1

Introduction: Data structures in Java: Linked Lists, Stacks, Queues, Sets, Maps; Generics: Generic Classes and Type Parameters, Implementing Generic Types, Generic Methods, Wrapper Classes, Concept of Serialization.

References: Big Java, 4th Edition, by Cay Horstmann, John Wiley & Sons, Inc.
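
As an illustration of the Unit 1 topics, the following is a minimal sketch of a generic class, a generic method, autoboxing into a wrapper class, and Java serialization. The names Pair, GenericsSketch, max, and roundTrip are hypothetical and chosen only for this example; they do not come from the course material.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical generic class with two type parameters; both must be serializable.
class Pair<K extends Serializable, V extends Serializable> implements Serializable {
    private static final long serialVersionUID = 1L;
    private final K first;
    private final V second;

    Pair(K first, V second) {
        this.first = first;
        this.second = second;
    }

    K getFirst() { return first; }
    V getSecond() { return second; }
}

class GenericsSketch {
    // Generic method: returns the larger of two comparable values.
    static <T extends Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    // Serializes a value to a byte array and reads it back as a new object.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The int literal 22 is autoboxed into the wrapper class Integer.
        Pair<String, Integer> reading = new Pair<>("1950", 22);
        Pair<String, Integer> copy = roundTrip(reading);
        System.out.println(copy.getFirst() + " " + copy.getSecond());
        System.out.println(max(3, 7));
    }
}
```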

Unit 2

Working with Big Data: Google File System, Hadoop Distributed File System (HDFS) – Building Blocks of Hadoop (Namenode, Datanode, Secondary Namenode, JobTracker, TaskTracker), Introducing and Configuring a Hadoop Cluster (Local, Pseudo-distributed, and Fully Distributed modes), Configuring XML files.

References: Hadoop: The Definitive Guide by Tom White, 3rd Edition, O’Reilly; Hadoop in Action by Chuck Lam, Manning Publications

Unit 3

Writing MapReduce Programs: A Weather Dataset, Understanding Hadoop API for MapReduce Framework (Old and New), Basic programs of Hadoop MapReduce: Driver code, Mapper code, Reducer code, RecordReader, Combiner, Partitioner

References: Hadoop: The Definitive Guide by Tom White, 3rd Edition, O’Reilly
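
To give a feel for the driver, mapper, and reducer roles listed in Unit 3, here is a rough sketch that simulates the map, shuffle, and reduce phases in plain Java, without the Hadoop API. It computes the maximum temperature per year. The "year,temperature" input format and the names MaxTemperatureSketch, map, shuffle, reduce, and run are assumptions made for this example; a real Hadoop job would extend the framework's Mapper and Reducer classes instead.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MaxTemperatureSketch {
    // Map phase: turn one "year,temperature" line into a (year, temperature) pair.
    static Map.Entry<String, Integer> map(String line) {
        String[] fields = line.split(",");
        return new AbstractMap.SimpleEntry<>(fields[0].trim(), Integer.parseInt(fields[1].trim()));
    }

    // Shuffle phase: group values by key; TreeMap keeps the keys sorted.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse all temperatures for one year into their maximum.
    static int reduce(String year, List<Integer> temperatures) {
        int max = Integer.MIN_VALUE;
        for (int t : temperatures) {
            max = Math.max(max, t);
        }
        return max;
    }

    // Driver: wires the three phases together for an in-memory list of lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.add(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(mapped).entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1950,0", "1950,22", "1949,111", "1949,78", "1950,-11");
        System.out.println(run(lines)); // prints {1949=111, 1950=22}
    }
}
```

In this simulation everything runs in one process; in Hadoop the map and reduce calls are distributed across nodes and the framework performs the shuffle.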

Unit 4

Hadoop I/O: The Writable Interface, WritableComparable and comparators, Writable Classes: Writable wrappers for Java primitives, Text, BytesWritable, NullWritable, ObjectWritable and GenericWritable, Writable collections, Implementing a Custom Writable: Implementing a RawComparator for speed, Custom comparators

References: Hadoop: The Definitive Guide by Tom White, 3rd Edition, O’Reilly
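
The idea behind a custom Writable in Unit 4 can be sketched without Hadoop on the classpath. Hadoop's Writable interface declares write(DataOutput) and readFields(DataInput); the SimpleWritable interface below is a hypothetical local stand-in that mirrors those two methods so the example runs with the JDK alone. YearTemperature and WritableSketch are likewise illustrative names, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in with the same two methods as Hadoop's Writable interface.
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// A custom key type: serializable as two ints and comparable by year, then temperature.
class YearTemperature implements SimpleWritable, Comparable<YearTemperature> {
    private int year;
    private int temperature;

    // No-argument constructor so an empty instance can be filled by readFields.
    YearTemperature() { }

    YearTemperature(int year, int temperature) {
        this.year = year;
        this.temperature = temperature;
    }

    int getYear() { return year; }
    int getTemperature() { return temperature; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(year);
        out.writeInt(temperature);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields must be read in the same order they were written.
        year = in.readInt();
        temperature = in.readInt();
    }

    @Override
    public int compareTo(YearTemperature other) {
        int byYear = Integer.compare(year, other.year);
        return byYear != 0 ? byYear : Integer.compare(temperature, other.temperature);
    }
}

class WritableSketch {
    // Writes the object into a compact binary form (two 4-byte ints here).
    static byte[] serialize(SimpleWritable writable) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                writable.write(out);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuilds a YearTemperature from its binary form.
    static YearTemperature deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            YearTemperature result = new YearTemperature();
            result.readFields(in);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize(new YearTemperature(1950, 22));
        YearTemperature copy = deserialize(data);
        System.out.println(data.length + " bytes: " + copy.getYear() + " " + copy.getTemperature());
    }
}
```

A RawComparator, as named in the unit, would go one step further and compare the serialized bytes directly instead of deserializing both objects first; that step is not shown here.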

Unit 5

Pig: Hadoop Programming Made Easier: Admiring the Pig Architecture, Going with the Pig Latin Application Flow, Working through the ABCs of Pig Latin, Evaluating Local and Distributed Modes of Running Pig Scripts, Checking out the Pig Script Interfaces, Scripting with Pig Latin.

References: Hadoop for Dummies by Dirk deRoos, Paul C. Zikopoulos, Roman B. Melnyk, Bruce Brown, Rafael Coss

Unit 6

Applying Structure to Hadoop Data with Hive: Saying Hello to Hive, Seeing How the Hive is Put Together, Getting Started with Apache Hive, Examining the Hive Clients, Working with Hive Data Types, Creating and Managing Databases and Tables, Seeing How the Hive Data Manipulation Language Works, Querying and Analyzing Data

References: Hadoop for Dummies by Dirk deRoos, Paul C. Zikopoulos, Roman B. Melnyk, Bruce Brown, Rafael Coss