Hadoop is a rapidly evolving, scalable open source framework for processing huge datasets on distributed systems. It enables users to store and process huge volumes of data and to analyze both structured and complex data. The Hadoop project is maintained by the Apache Software Foundation and is written in Java.
CIGNEX Datamatics delivers Cloudera's Distribution Including Apache Hadoop (CDH), providing Hadoop users unprecedented stability, predictability, and reliability. Using Cloudera Manager, an end-to-end application that manages the full lifecycle of Apache Hadoop, enterprises can improve performance and quality of service and reduce operational costs.
Figure: Cloudera Manager for Apache Hadoop
Cloudera Manager is used to configure role instances and to monitor Hadoop services such as HDFS, HBase, and MapReduce. It can also monitor user activity, search logs, and show entries for the jobs that are currently running.
Several components around CDH can be used to handle high volumes of enterprise content:
- MapReduce Framework: A distributed data processing framework that provides a clear abstraction for data analysis tasks, enabling reliable large-scale computation
- HDFS: The Hadoop Distributed File System delivers scalable, fault-tolerant storage at low cost. Files are stored across a collection of servers
- HBase: Provides random, real-time read/write access to content and creates relationships between objects
- Flume, Sqoop: Reliable data collection systems that can be used to integrate with applications
- Hive: Runs analysis similar to a data warehouse system, but on larger quantities of data
- Pig: Runs analysis of large data sets to derive intelligence
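To make the MapReduce abstraction in the list above more concrete, here is a minimal single-process sketch in Python of the map, shuffle, and reduce steps, using word counting as the task. It is an illustration only and does not use the Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real Hadoop job would distribute these steps across many servers and read its input from HDFS.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is simulated with an in-memory dictionary.
    """
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

For example, word_count(["hadoop stores data", "hadoop processes data"]) returns {"hadoop": 2, "stores": 1, "data": 2, "processes": 1}. Because the map and reduce functions only see one line or one key at a time, the same logic can in principle be spread over many machines, which is the property the framework relies on for large-scale computation.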
CIGNEX Datamatics Hadoop Offering
- Hadoop consulting, implementation, and support services
- Certified trainer for Cloudera-certified Hadoop systems
- Big Data Portal: Building a user experience platform on the robust features of Big Data technologies to create a single portal window using Liferay, Drupal, etc.
CIGNEX Datamatics Advantage
- Cloudera Global Partner
- Big Data practice with expertise in processing and storing large quantities and varieties of data