
4 to 8 weeks Practical Hands-On Big Data Hadoop Developer Certification training in Hyderabad | Big Data Training | Hadoop training | Big Data analytics training | Hortonworks, Cloudera, HDFS, Map Reduce, YARN, Pig, Hive, Sqoop, Flume, Ambari training


In the first 16 hours of this course, we cover the foundations of Big Data: Hadoop fundamentals, the Big Data technology stack, HDFS, Hive, Pig, and Sqoop; how to set up a Hadoop cluster; how to store Big Data in Hadoop (HDFS); and how to process and analyze it using MapReduce programming or other Hadoop ecosystem tools.

The next 16 hours of the course will cover all the course topics in-depth with Hands-on lab exercises mentioned in the comprehensive course outline below.


Course Schedule for First 8 sessions (First 16 Hours)


This is a weekday course held September 3 - September 26, 2019 (US Pacific Time).

The class sessions will be held on Tuesday and Thursday every week

6:30-8:30 PM US Pacific time, each day.

Please check your local date and time for the first session.



Course Schedule for Next 8 sessions (Next 16 Hours)


Weekdays October 1 - October 24, 2019 US Pacific Time

The class sessions will be held on Tuesday and Thursday every week

6:30-8:30 PM US Pacific time, each day.


View Detailed Weekly Training Schedule at the bottom of this event listing.


Course Objectives


Knowledge of Hadoop components such as MapReduce, Sqoop, HBase, Hive, Pig, HDFS, Flume, ZooKeeper, Oozie, etc.

Ability to work on Hadoop related Projects as an individual contributor or as part of a team.

Set up, install and configure Hadoop in different environments: development, support and test

Hadoop architecture and various operations performed on it

Familiarity with various Hadoop Solutions.



Prerequisite

Desired but not required: exposure to, or working proficiency in, BI, SQL, scripting, handling and managing data and databases, Excel, the Java programming language, and basic UNIX commands.


Course Features


4-8 weeks, 8-16 sessions, 16-32 hours of total LIVE Instruction

Training material, instructor handouts and access to useful resources on the cloud provided

Practical Hands-On Lab exercises on cloud workstations provided

Actual code and scripts provided

Real-life Scenarios



Course Outline



This is a comprehensive course outline. It is also a guideline, indicative of what topics might be covered during the class. This outline and the actual course content covered during the class by the instructors may be adjusted based on the skills, experience and background of the students when introductions are done during the beginning of the first session.

We strive to cover as many topics from this course outline as possible during this training. If enough students are interested in learning additional topics beyond the 32 hours of training, or in covering topics in a more comprehensive and in-depth manner, we can hold additional sessions for an extra charge. One-on-one tutoring is also available, which may be slightly more expensive than group training.





Big Data Basics


An introduction to Big Data

Why Big Data? Why now?

The Three Dimensions of Big Data (Three Vs)

Evolution of Big Data

Big Data versus Traditional RDBMS Databases

Big Data versus Traditional BI and Analytics

Big Data versus Traditional Storage

Key Challenges in Big Data adoption

Benefits of adoption of Big Data

Introduction to Big Data Technology Stack

Apache Hadoop Framework

Introduction to Microsoft HDInsight – Microsoft’s Big Data Service

Hands-On Lab Exercises



The Big Data Technology Stack


Basics of Hadoop Distributed File System (HDFS)

Basics of Hadoop Distributed Processing (Map Reduce Jobs)

Hands-On Lab Exercises



Deep dive into Hadoop Distributed File System (HDFS) 


HDFS

Reading files with HDFS

Writing files with HDFS

Error Handling

Design and Concepts of HDFS

Blocks, Name nodes, Data nodes

HDFS High-Availability

HDFS Federation

HDFS Command-Line Interface

Basic File System Operations

Anatomy of File Read and Write

Block Placement Policy and Modes

Configuration files - Detailed explanation

Metadata

FS image

Edit log

Secondary Name Node

Safe Mode

How to add New Data Node dynamically

How to decommission Data Nodes dynamically without stopping cluster

FSCK Utility

How to override default configuration at Programming level and system level

ZooKeeper Leader Election Algorithm

Hands-On Lab Exercises



Processing Big Data – MapReduce and YARN


How MapReduce works

Handling Common Errors

Bottlenecks with MapReduce

How YARN (MapReduceV2) works

Difference between MR1 and MR2

Error Handling

Running a simple MapReduce application (word count)

Running a custom MapReduce application (census data)

Running MapReduce via PowerShell

Running a MapReduce application using PowerShell

Monitoring application status

Hands-On Lab Exercises



Big Data Development Framework


Introduction to HIVE

Introduction to PIG

HBase

Loading the data into HIVE

Submitting Pig jobs using HDInsight

Submitting Pig jobs via PowerShell

Hands-On Lab Exercises



Big Data Integration and Management


Big Data Integration using Polybase

Big Data Management using Ambari

Fetching HDInsight data into SQL

Using Ambari for managing HDInsight cluster

Hands-On Lab Exercises



Map Reduce


Basics of Functional Programming

Map Reduce Basics

How Map Reduce Works

Anatomy of Map Reduce Job

Legacy Architecture: Job Submission, Job Initialization, Task Assignment, Task Execution, Progress and Status Updates

Job Completions and Failures

Shuffling, Sorting

Splits, Record reader, Partition, Types of partitions and Combiner

Optimization Techniques: Speculative Execution, JVM Reuse

Schedulers, Counters

Comparisons between Old, New API at code and Architecture Level

Getting data from RDBMS into HDFS using Custom data types

Distributed Cache and Hadoop Streaming (Python, Ruby, and R)

Hands-On Lab Exercises


YARN


Sequence Files and Map Files

Enabling Compression Codecs

Map side Join with distributed Cache

Types of I/O Formats: MultipleOutputs, NLineInputFormat

Handling small files using CombineFileInputFormat

Hands-On Lab Exercises


Map Reduce and Java Programming


Hands-on “Word Count” in Map Reduce in standalone and Pseudo distribution Mode

Sorting files using Hadoop Configuration API discussion

Emulating “grep” for searching inside a file in Hadoop

DBInputFormat

Job Dependency API discussion

Input Format API discussion, Split API discussion

Custom Data type creation in Hadoop

Hands-On Lab Exercises


NOSQL


ACID in RDBMS and BASE in NoSQL

CAP Theorem and Types of Consistency

Types of NoSQL Databases in detail

Columnar Databases in Detail (HBASE and CASSANDRA)

TTL, Bloom Filters and Compaction

Hands-On Lab Exercises


HBase


Concepts

Installation

Data Model of HBase and Comparison between RDBMS and NOSQL

Master and Region Servers

DDL and DML HBase Operations

Architecture of HBase

HBase Catalog Tables

HBase Block Cache and sharding

HBase SPLITS

HBase DATA Modeling (Sequential, Salted, Promoted and Random Keys)

Java APIs and REST Interface

Client-Side Buffering, including processing 1 million records using client-side buffering

HBase Counters

Enabling Replication and HBase RAW Scans

HBase Filters

Bulk Loading and Coprocessors (Endpoints and Observers with programs)

Hands-On Lab Exercises


Hive


Introduction to Hive

Hive Architecture

Hive Installation

Hive Services, Shell, Server, Web Interface (HWI)

Metastore, HiveQL

OLTP vs. OLAP

Working with Tables

Primitive data types

Complex data types

Working with Partitions

User-Defined Functions

Hive Bucketed Tables and Sampling

External partitioned tables

Map the data to the partition in the table

Write the output of one query to another table, Multiple inserts

Dynamic Partition

Differences between ORDER BY, DISTRIBUTE BY and SORT BY

Bucketing and Sorted Bucketing with Dynamic partition

RC File

INDEXES and VIEWS

MAPSIDE JOINS

Compression on hive tables and Migrating Hive tables

Dynamic substitution in Hive and different ways of running Hive

How to enable Update in HIVE

Log Analysis on Hive

Access HBASE tables using Hive

Hands-on Lab Exercises


Pig


Installation

Execution Types

Grunt Shell

Pig Latin

Data Processing

Schema on read

Primitive data types and complex data types

Tuple schema, BAG Schema, and MAP Schema

Loading and Storing

Filtering, Grouping, and Joining

Debugging commands (Illustrate and Explain)

Validations, Type casting in PIG

Working with Functions

User-Defined Functions

Types of JOINS in pig and Replicated Join in detail

SPLITS and Multiquery execution

Error Handling, FLATTEN and ORDER BY

Parameter Substitution

Nested For Each

User-Defined Functions, Dynamic Invokers, and Macros

How to access HBASE using PIG, Load and Write JSON DATA using PIG

Piggy Bank

Hands-on Lab Exercises


SQOOP


Installation

Import Data (Full table, Only Subset, Target Directory, Protecting Password, File format other than CSV, Compressing, Controlling Parallelism, All tables Import)

Incremental Import (Import only New data, Last Imported data, storing Password in Metastore, Sharing Metastore between Sqoop Clients)

Free Form Query Import

Export data to RDBMS, HIVE, and HBASE

Hands-on Lab Exercises


HCatalog


Introduction

Installation

About HCatalog with PIG, HIVE, and MR

Hands-on Lab Exercises


Flume


Introduction and Overview

Installation

Flume Agents: Sources, Channels, and Sinks

Log User information using Java program into HDFS using LOG4J and Avro Source, Tail Source

Log User information using Java program into HBASE using LOG4J and Avro Source, Tail Source

Flume Commands

Hands-on Lab Exercises


Different Hadoop Ecosystems


Hortonworks

Cloudera


Oozie


Workflow (Start, Action, End, Kill, Join and Fork), Schedulers, Coordinators and Bundles, showing how to schedule Sqoop, Hive, MR and PIG jobs

Real-world use case: find the top websites visited by users of certain ages, scheduled to run every hour

ZooKeeper

HBASE Integration with HIVE and PIG

Phoenix

Proof of concept (POC)

Hands-on Lab Exercises


Spark


Spark Overview

Linking with Spark, Initializing Spark

Using the Shell

Resilient Distributed Datasets (RDDs)

Parallelized Collections

External Datasets

RDD Operations

Basics, Passing Functions to Spark

Working with Key-Value Pairs

Transformations

Actions

RDD Persistence

Which Storage Level to Choose?

Removing Data

Shared Variables

Broadcast Variables

Accumulators

Deploying to a Cluster

Unit Testing

Migrating from pre-1.0 Versions of Spark



Detailed Weekly Schedule for First 8 sessions (1st 16 Hours)


September 3, 2019 | 6:30 PM to 8:30 PM US Pacific Time

September 5, 2019 | 6:30 PM to 8:30 PM US Pacific Time

September 10, 2019 | 6:30 PM to 8:30 PM US Pacific Time

September 12, 2019 | 6:30 PM to 8:30 PM US Pacific Time

September 17, 2019 | 6:30 PM to 8:30 PM US Pacific Time

September 19, 2019 | 6:30 PM to 8:30 PM US Pacific Time

September 24, 2019 | 6:30 PM to 8:30 PM US Pacific Time

September 26, 2019 | 6:30 PM to 8:30 PM US Pacific Time



Detailed Weekly Schedule for Next 8 sessions (Additional 16 Hours)


October 1, 2019 | 6:30 PM to 8:30 PM US Pacific Time

October 3, 2019 | 6:30 PM to 8:30 PM US Pacific Time

October 8, 2019 | 6:30 PM to 8:30 PM US Pacific Time

October 10, 2019 | 6:30 PM to 8:30 PM US Pacific Time

October 15, 2019 | 6:30 PM to 8:30 PM US Pacific Time

October 17, 2019 | 6:30 PM to 8:30 PM US Pacific Time

October 22, 2019 | 6:30 PM to 8:30 PM US Pacific Time

October 24, 2019 | 6:30 PM to 8:30 PM US Pacific Time




Refund Policy


All Sales are Final. There are no Refunds.

If a student is not happy with the training experience, we strive to listen, take the feedback and implement honest and sincere measures to meet and exceed student expectations. 

If a class is rescheduled or cancelled by the organizer, registered students will be offered a credit towards any future course.









Ticket Information | Ticket Price
4 weeks, 16 Hours, Early Bird, Instructor Led Classroom or LIVE Online Training | USD 550
8 weeks, 32 Hours, Early Bird, Instructor Led Classroom or LIVE Online Training | USD 850
4 weeks, 16 Hours, General Admission, Instructor Led Classroom or LIVE Online Training | USD 650
8 weeks, 32 Hours, General Admission, Instructor Led Classroom or LIVE Online Training | USD 935



Map Entirety Technology, Hyderabad, India
Tickets from USD 550 to 935 on eventbrite.com