About The Training

Apache Hadoop is an open-source software framework for distributed storage and processing of very large data sets.

What Will It Offer?

  • The training will help you understand how Apache Hadoop processes large data sets across clusters of computers using simple programming models.
  • Learn why Apache Hadoop scales from a single machine to thousands of machines, each offering local computation and storage.
  • Learn how to detect and handle failures at the application layer rather than relying on hardware to deliver high availability.
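As an illustration of the "simple programming models" mentioned above, here is a minimal sketch of the MapReduce word-count pattern that Hadoop popularized. Note this is a local, conceptual sketch in Python, not the actual Hadoop API (which is typically Java); it only shows the map, shuffle, and reduce phases that a Hadoop job distributes across a cluster.

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework does
    # between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in grouped.items()}

lines = ["hello world", "hello hadoop"]
counts = reduce_phase(shuffle(map_phase(lines)))
# counts == {"hello": 2, "world": 1, "hadoop": 1}
```

In a real Hadoop job the input lines live in HDFS blocks spread across the cluster, mappers run on the nodes holding those blocks, and the shuffle moves intermediate pairs over the network to the reducers.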
Duration: 4 Days

Course Content:
  • Hadoop Introduction
  • Hadoop Components
  • Hadoop Distributed File System
  • MapReduce
  • MapReduce Programming
  • Hadoop Data I/O
  • Hadoop Cluster
  • Advanced MapReduce
  • Hadoop on AWS Cloud
  • Managing Hadoop
  • Testing & Debugging
  • Hadoop Security
  • Big Data
  • Sqoop
  • HBase
  • HBase & MapReduce
  • Hive
  • Pig
  • Avro
  • ZooKeeper
  • Cassandra
  • Mahout
  • Ambari
  • Chukwa
  • Integration of Components with Other Components
  • Case Studies
  • Best Practices