Module I: Introduction to Big Data and Ecosystem
Module II: HDFS, Hadoop Architecture & YARN
Module III: Environment
Module IV: Scala Programming Language
Module V: Spark Cluster-Computing Framework
o Introduction, Architecture, Components, Execution & Related Concepts
o The Resilient Distributed Dataset
o RDD Data Types, Creation & Operations
o Spark Shell In Action & Word Count Spark Job on YARN through Functional Programming on Scala IDE
o RDD Map, Filter, Sort Transformations
o Data Partitioning & Joins
o Accumulators, Broadcast Variables
o Caching and Persistence
o Spark SQL DataFrames and Datasets
o Joins, Strongly Typed Dataset
o Datasets vs. RDDs: Choice / Conversion
o Hive queries through Spark
Module VI: MapReduce & Scalding (Scala DSL)
Module VII: Data Warehousing With Hive & Impala
Module VIII: Pig Latin
Module IX: The Hadoop Database - HBase
Module X: ETL & Orchestration
Industry-Compliant Practical Curriculum on the Latest Stacks.
Simulated Data Patterns from Multiple Domains for Real-Life Project Scenarios.
Guidance for Resume Preparation & Interview Questions.
POC Suggestions for Further Practice and Exploration.
Mock Interviews for Interested Candidates.
Discussing Current Market Standards & New / Incubating Tech.
Other Q&A and Doubt Clearance.
Discussing Cloudera & Hortonworks Certification Programs.

