Naresh i Technologies

Software Training

Follow Us Social Media :

  • Facebook
  • Instagram
  • LinkedIn
  • Twitter
  • YouTube
  • Home
  • All Courses
  • Services
    • Internship Programs
    • Placement Assistance
    • Placements
  • Software Training
    • Classroom Training
    • Online Training
    • Weekend Training
    • Corporate Training
    • Internships
    • Trainers Profile
    • Course Schedule
  • Projects
    • Live Projects
    • IEEE Projects
    • Realtime Projects
    • Internships
  • Internships
    • Internships
    • Insights
    • Success Factors
    • World Class Software Training
    • Training Institute of Choice
    • Surveys
    • InfoGraphics
    • Thought Leadership
  • Careers
    • Internships
    • Placement Registration
    • Job Openings
    • IT Job Trends
    • Interview Questions
    • NareshIT Whatsapp Notification Groups
  • About
    • Vision and Mission
    • Our Philosophy
    • Management Team
    • Infrastructure
    • Testimonials
    • Training FAQs
    • Contact Us
  • New Batches
    • Classroom – Ameerpet
    • Online Schedule
    • Online Training
    • Kukatpally
    • Internships
    • Free Online Workshops
  • Blog
Home » Online Training » Hadoop Training

Hadoop Training

NIT Manager

Contents [hide]

Hadoop Course Overview

Hadoop Development course teaches the skill set required for the learners how to setup Hadoop Cluster, how to store Big Data using Hadoop (HDFS) and how to process/analyze the Big Data using Map-Reduce Programming or by using other Hadoop ecosystems. Attend Hadoop Training demo by Real-Time Expert.

Hadoop Training Course Prerequisites

  • Basic Unix Commands
  • Core Java (OOPS Concepts, Collections , Exceptions ) for Map Reduce Programming
  • SQL Query knowledge for Hive Queries

Hadoop Course System Requirements

  • Any Linux flavor OS (Ex: Ubuntu/Cent OS/Fedora/RedHat Linux) with 4 GB RAM (minimum), 100 GB HDD
  • Java 1.6+
  • Open-SSH server & client
  • MYSQL Database
  • Eclipse IDE
  • VMWare (To use Linux OS along with Windows OS)

Hadoop Training Course Duration

  • 50 Hours, daily 1:30  Hours

Hadoop Course Content

Introduction to Hadoop

  • High Availability
  • Scaling
  • Advantages and Challenges 

Introduction to Big Data

  • What is Big data
  • Big Data opportunities,Challenges
  • Characteristics of Big data 

Introduction to Hadoop

  • Hadoop Distributed File System
  • Comparing Hadoop & SQL
  • Industries using Hadoop
  • Data Locality
  • Hadoop Architecture
  • Map Reduce & HDFS
  • Using the Hadoop single node image (Clone)

Hadoop Distributed File System (HDFS)

  • HDFS Design & Concepts
  • Blocks, Name nodes and Data nodes
  • HDFS High-Availability and HDFS Federation
  • Hadoop DFS The Command-Line Interface
  • Basic File System Operations
  • Anatomy of File Read,File Write
  • Block Placement Policy and Modes
  • More detailed explanation about Configuration files
  • Metadata, FS image, Edit log, Secondary Name Node and Safe Mode
  • How to add New Data Node dynamically,decommission a Data Node dynamically (Without stopping cluster)
  • FSCK Utility. (Block report)
  • How to override default configuration at system level and Programming level
  • HDFS Federation
  • ZOOKEEPER Leader Election Algorithm
  • Exercise and small use case on HDFS

Map Reduce

  • Map Reduce Functional Programming Basics
  • Map and Reduce Basics
  • How Map Reduce Works
  • Anatomy of a Map Reduce Job Run
  • Legacy Architecture ->Job Submission, Job Initialization, Task Assignment, Task Execution, Progress and Status Updates
  • Job Completion, Failures
  • Shuffling and Sorting
  • Splits, Record reader, Partition, Types of partitions & Combiner
  • Optimization Techniques -> Speculative Execution, JVM Reuse and No. Slots
  • Types of Schedulers and Counters
  • Comparisons between Old and New API at code and Architecture Level
  • Getting the data from RDBMS into HDFS using Custom data types
  • Distributed Cache and Hadoop Streaming (Python, Ruby and R)
  • YARN
  • Sequential Files and Map Files
  • Enabling Compression Codec’s
  • Map side Join with distributed Cache
  • Types of I/O Formats: Multiple outputs, NLINEinputformat
  • Handling small files using CombineFileInputFormat

Map Reduce Programming – Java Programming

  • Hands on “Word Count” in Map Reduce in standalone and Pseudo distribution Mode
  • Sorting files using Hadoop Configuration API discussion
  • Emulating “grep” for searching inside a file in Hadoop
  • DBInput Format
  • Job Dependency API discussion
  • Input Format API discussion,Split API discussion
  • Custom Data type creation in Hadoop

NOSQL

  • ACID in RDBMS and BASE in NoSQL
  • CAP Theorem and Types of Consistency
  • Types of NoSQL Databases in detail
  • Columnar Databases in Detail (HBASE and CASSANDRA)
  • TTL, Bloom Filters and Compensation

HBase

  • HBase Installation, Concepts
  • HBase Data Model and Comparison between RDBMS and NOSQL
  • Master  & Region Servers
  • HBase Operations (DDL and DML) through Shell and Programming and HBase Architecture
  • Catalog Tables
  • Block Cache and sharding
  • SPLITS
  • DATA Modeling (Sequential, Salted, Promoted and Random Keys)
  • JAVA API’s and Rest Interface
  • Client Side Buffering and Process 1 million records using Client side Buffering
  • HBase Counters
  • Enabling Replication and HBase RAW Scans
  • HBase Filters
  • Bulk Loading and Co processors (Endpoints and Observers with programs)
  • Real world use case consisting of HDFS,MR and HBASE

Hive

  • Hive Installation, Introduction and Architecture
  • Hive Services, Hive Shell, Hive Server and Hive Web Interface (HWI)
  • Meta store, Hive QL
  • OLTP vs. OLAP
  • Working with Tables
  • Primitive data types and complex data types
  • Working with Partitions
  • User Defined Functions
  • Hive Bucketed Tables and Sampling
  • External partitioned tables, Map the data to the partition in the table, Writing the output of one query to another table, Multiple inserts
  • Dynamic Partition
  • Differences between ORDER BY, DISTRIBUTE BY and SORT BY
  • Bucketing and Sorted Bucketing with Dynamic partition
  • RC File
  • INDEXES and VIEWS
  • MAPSIDE JOINS
  • Compression on hive tables and Migrating Hive tables
  • Dynamic substation of Hive and Different ways of running Hive
  • How to enable Update in HIVE
  • Log Analysis on Hive
  • Access HBASE tables using Hive
  • Hands on Exercises

Pig

  • Pig Installation
  • Execution Types
  • Grunt Shell
  • Pig Latin
  • Data Processing
  • Schema on read
  • Primitive data types and complex data types
  • Tuple schema, BAG Schema and MAP Schema
  • Loading and Storing
  • Filtering, Grouping and Joining
  • Debugging commands (Illustrate and Explain)
  • Validations,Type casting in PIG
  • Working with Functions
  • User Defined Functions
  • Types of JOINS in pig and Replicated Join in detail
  • SPLITS and Multiquery execution
  • Error Handling, FLATTEN and ORDER BY
  • Parameter Substitution
  • Nested For Each
  • User Defined Functions, Dynamic Invokers and Macros
  • How to access HBASE using PIG, Load and Write JSON DATA using PIG
  • Piggy Bank
  • Hands on Exercises

SQOOP

  • Sqoop Installation
  • Import Data.(Full table, Only Subset, Target Directory, protecting Password, file format other than CSV, Compressing, Control Parallelism,  All tables Import)
  • Incremental  Import(Import only New data, Last Imported data, storing Password in Metastore, Sharing Metastore between Sqoop Clients)
  • Free Form Query Import
  • Export data to RDBMS,HIVE and HBASE
  • Hands on Exercises

HCatalog

  • HCatalog Installation
  • Introduction to HCatalog
  • About Hcatalog with PIG,HIVE and MR
  • Hands on Exercises

Flume

  • Flume Installation
  • Introduction to Flume
  • Flume Agents: Sources, Channels and Sinks
  • Log User information using Java program in to HDFS using LOG4J and Avro Source, Tail Source
  • Log User information using Java program in to HBASE using LOG4J and Avro Source, Tail Source
  • Flume Commands
  • Use case of Flume: Flume the data from twitter in to HDFS and HBASE. Do some analysis using HIVE and PIG

More Ecosystems

  • HUE.(Hortonworks and Cloudera)

Oozie

  • Workflow (Action, Start, Action, End, Kill, Join and Fork), Schedulers, Coordinators and Bundles.,to show how to schedule Sqoop Job, Hive, MR and PIG
  • Real world Use case which will find the top websites used by users of certain ages and will be scheduled to run for every one hour
  • Zoo Keeper
  • HBASE Integration with HIVE and PIG
  • Phoenix
  • Proof of concept (POC)

SPARK

  • Spark Overview
  • Linking with Spark, Initializing Spark
  • Using the Shell
  • Resilient Distributed Datasets (RDDs)
  • Parallelized Collections
  • External Datasets
  • RDD Operations
  • Basics, Passing Functions to Spark
  • Working with Key-Value Pairs
  • Transformations
  • Actions
  • RDD Persistence
  • Which Storage Level to Choose?
  • Removing Data
  • Shared Variables
  • Broadcast Variables
  • Accumulators
  • Deploying to a Cluster
  • Unit Testing
  • Migrating from pre-1.0 Versions of Spark
  • Where to Go from Here

Click Here For Hadoop Online Training

Categories: Online Training Tags: Chennai, Hyderabad

Contact Us

Rating of Average of 4.98 on a total of 500 Ratings
Trainers Profile
Student Enquiry
Corporate Training
Post your Feedback
View/Post Testimonials
Contact Us

Join the Community and Learn for Free

  1. Join WhatsApp
  2. Join Telegram
  3. Watch Free Tutorials

Find Courses Here

  • Full Stack .Net Placement Assistance Program
  • Full Stack .Net Placement Assistance Program
  • Full Stack Python
  • Full Stack Python
  • Full Stack Data Science & AI
  • Full Stack Data Science & AI
  • Full Stack Java Placement Assistance Program
  • Full Stack Java Placement Assistance Program
  • Full Stack Web Development Course
  • Full Stack UI Web Development with React
  • Full Stack UI Web Development with React
  • Full Stack Java Developer Course
  • Android Training
  • iPhone Training
  • Asp.Net Training
  • Asp.Net MVC Training
  • C#.NET Training
  • C++ Training
  • Data Structure Training
  • Java Training
    • Core Java Training
    • Advanced Java Training
    • Hibernate Training
    • J2EE Training
    • Spring 5.x Training
    • Hibernate Training
    • Web Services Training
    • Struts Training
    • XML Training
    • Java Online Training
  • Oracle Training
  • SQL Server Training
  • Selenium Training
  • UNIX LINUX Training
  • PHP Training
  • HTML5 CSS3 Training
  • jQuery Training
  • UI Technologies Training
  • UI UX Training
  • AngularJS Training
    • Angular 2 Training
    • Angular 4 Training
    • Angular 6 Training
    • Angular 7 Training
  • Data Science Training
  • Artificial Intelligence Training
  • Hadoop Training
  • Apache Spark Training
  • Python Training
  • Hadoop Online Training
  • DevOps Training
  • AWS Training
  • IoT Training
  • SalesForce CRM Training
  • Digital Marketing Training
  • RPA Training
    • Blue Prism Training
    • Automation Anywhere Training
    • Automation Anywhere Online Training
  • Blockchain Training
  • Realtime Projects
    • Real Time Java Project
    • Real-Time Project on PHP
  • Software Training in Chennai
  • Doubleclick for Publishers Training
  • Testing Tools Training
  • NareshIT (KPHB) Kukatpally

About Naresh i Technologies

Training Institute Overview Naresh i Technologies Naresh i Technologies (Pronounced: NareshIT) is a leading software training institute providing Software Training, Project Guidance, IT Consulting and Technology Workshops. Using our enhanced global software training delivery methodology,

Interview Questions

  • Selenium Interview Questions
  • Java Interview Questions

HYDERABAD MAIN CAMPUS

Add: 2nd Floor, Durga Bhavani Plaza,
Ameerpet, Hyderabad
Ph : 040-2374 6666 / 2373 4842
Email : support@nareshit.com

For Online Training:

Call/Whatsapp:  +91-8179191999
Email: onlinetraining@nareshit.com

Follow Us On Social Media

  • Facebook
  • Instagram
  • LinkedIn
  • Twitter
  • YouTube

Join Our Telegram – get Updates

  • NareshIT Official Channel   
  • Python | JAVA | AWS | DevOps |  Data Science
TOP
© 2025 Naresh i Technologies | Software Training - Online Training - Live Project Training - Weekend Training