Big Data for Testers

CURRICULUM


Module 1: Data Warehousing, ETL & Big Data Concepts
  • Need for Data Warehousing
  • What is Data Warehousing?
  • Advantages of Data Warehouse
  • Properties of a Data Warehouse
  • Data Warehouse Architecture
  • Concepts of OLTP and OLAP
  • What is ETL?
  • What is Big Data?
  • 5 V's of Big Data
  • Types of Data
  • What is Hadoop?
  • History of Hadoop
  • Architecture of Hadoop
  • Hadoop Ecosystem
Module 2: Linux Basics
  • Unix/Linux commands
  • Shell scripting
Module 3: Hadoop HDFS & MapReduce
  • Introduction to HDFS & MapReduce
  • HDFS Architecture
  • HDFS Commands
  • MapReduce Architecture
  • MapReduce Examples in Java & Python
  • Validating MapReduce Jobs
Module 4: Querying Data using Apache Hive
  • Hive Overview
  • Hive Characteristics and Features
  • Types of Hive Tables and Their Differences
  • How Hive Differs from an RDBMS
  • Hive Components & Clients
  • Creating and dropping Hive database
  • Hive Data Types
  • Hive Managed Tables
  • Hive External Tables
  • Altering Hive Table
  • Collections - Array, Map & Struct
  • Processing XML & JSON files in Hive
  • Hive Partitions & Buckets
  • Indexes and Views
  • Hive Queries: Order By, Group By, Distribute By and Cluster By clauses
  • Hive Aggregation Functions
  • Hive Joins
  • Hive UDFs and UDAFs
  • Working with Hue
  • Creating and Querying Hive Tables in Hue
Module 5: Ingesting Data using Apache Sqoop
  • Sqoop Overview
  • Sqoop Components and Architecture
  • Importing data from RDBMS tables to HDFS
  • Exporting data from HDFS to RDBMS Tables
  • Sqoop Commands
  • Working with Sqoop Jobs
  • Sqoop codegen
  • Mini project on Hive and Sqoop
Module 6: Processing and transforming Data using Apache Pig
  • Pig Overview
  • Pig Shell Types
  • Load and Store operators
  • Diagnostic Operators
  • Grouping and Joining
  • Combining and Splitting
  • Filtering
  • Sorting
  • Pig Latin Built-in functions
  • Pig UDFs
  • Understanding Pig Test Cases & Testing Pig Jobs
  • Mini Project on Pig and Sqoop
Module 7: NoSQL
  • What is NoSQL?
  • Challenges of RDBMS
  • Benefits of Adopting a NoSQL Database over RDBMS
  • Concepts and characteristics of NoSQL databases
  • Popular NoSQL Databases
  • CAP theorem
  • Working with NoSQL Databases
HBase
  • What is HBase?
  • HBase Vs RDBMS
  • Features of HBase
  • HBase Architecture
  • HBase Shell
  • HBase Commands
  • Creating, Listing, Disabling and Enabling a Table
  • Describe & Alter
  • Exists
  • Drop a Table
  • Create, Read, Update and Delete Data
  • Scan
  • Count & Truncate
  • HBase Admin API
MongoDB
  • Introduction to MongoDB
  • Creating Database in MongoDB
  • Dropping a Database
  • Creating Collections and Documents
  • CRUD Operations - Create, Read, Update and Delete
Module 8: Big Data/Hadoop Testing
  • Test Strategy and Steps for Testing Big Data Applications
  • Database Testing of Big Data Applications
  • Functional Testing of Big Data Applications
  • Roles and Responsibilities of a Tester in Big Data Applications
  • Big Data Testing Challenges
  • Big Data Tools / Common Terminologies
  • Big Data Automation Testing Tools