Friday, May 5, 2017

Big Data Testing - Part 1

Today, many enterprises are flooded with data. Recent technological advances have produced an overwhelming amount of data from diverse domains such as the Internet of Things (sensors), health care, e-commerce, and finance. The term "big data" was coined to describe this trend.

Big data helps businesses of many kinds make better decisions, provided the data is free of bad and incomplete records. Different kinds of analytics serve different business needs: customer segmentation, product performance, fraud detection, social media analytics, sentiment analysis, predictive analytics, financial risk analysis, and so on.

Big Data Ecosystem

A generic big data implementation has four stages:
  1. Data capture from multiple sources (data ingestion)
  2. Storage for large volumes of data
  3. Data processing (loading and transformation)
  4. Analysis layer (analytics and reports)
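
To make the four stages concrete, here is a minimal Python sketch that chains them on a few in-memory records. Everything in it is an assumption made for illustration: the function names, the record fields (customer, amount) and the use of a plain list as "storage" are hypothetical, and a real implementation would use distributed tools at each stage.

```python
from collections import defaultdict

def ingest(sources):
    """Stage 1: capture records from several sources into one stream."""
    for source in sources:
        for record in source:
            yield record

def store(records):
    """Stage 2: persist the raw records (a list stands in for real storage)."""
    return list(records)

def process(stored):
    """Stage 3: transform the data, here by dropping incomplete records."""
    return [r for r in stored if r.get("amount") is not None]

def analyze(processed):
    """Stage 4: build a simple report, here the total amount per customer."""
    totals = defaultdict(float)
    for r in processed:
        totals[r["customer"]] += r["amount"]
    return dict(totals)

# Two hypothetical sources; one record is incomplete (amount is missing).
web = [{"customer": "a", "amount": 10.0}, {"customer": "b", "amount": None}]
pos = [{"customer": "a", "amount": 5.0}]

report = analyze(process(store(ingest([web, pos]))))
print(report)
```

Each stage is a separate function, so each one can be tested on its own as well as end to end.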

Like other component ecosystems, a big data system can be decomposed into three layers, from bottom to top: the infrastructure layer, the computing layer, and the application layer. This layered view is only a conceptual hierarchy, meant to underscore the complexity of a big data system.

Big Data Testing Approach
Bad data can make analysis and decision-making very difficult. Poor implementation can also lead to poor quality, testing delays, and increased cost. The right test strategy should cover both functional and non-functional testing.

The test approach varies with the type of data processing (stream or batch) and the type of analytics solution used. If the big data implementation has fewer stages or components, the set of testing techniques shrinks accordingly.


Functional Testing
  • Data Flow Validation
  • Data Integrity
  • Data Ingestion Layer Validation
  • Data Storage Layer
  • Data Processing Layer
  • Analytics & Reports Layer
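
As one possible illustration of data flow validation and data integrity, the sketch below compares the record count and an order-independent checksum of the data before and after a hop between two layers. The helper names (fingerprint, reconcile) and the sample records are hypothetical. On real volumes this comparison would normally run inside the cluster rather than in a single Python process.

```python
import hashlib
import json

def fingerprint(records):
    """Return (record count, order-independent checksum) for dict records."""
    # Hash each record with sorted keys so field order does not matter,
    # then sort the hashes so record order does not matter either.
    digests = sorted(
        hashlib.sha256(json.dumps(r, sort_keys=True).encode("utf-8")).hexdigest()
        for r in records
    )
    combined = hashlib.sha256("".join(digests).encode("utf-8")).hexdigest()
    return len(records), combined

def reconcile(source, target):
    """Compare source and target by count and by content checksum."""
    src_count, src_sum = fingerprint(source)
    tgt_count, tgt_sum = fingerprint(target)
    return {
        "count_match": src_count == tgt_count,
        "content_match": src_sum == tgt_sum,
    }

source = [{"id": 1, "v": "x"}, {"id": 2, "v": "y"}]
reordered = [{"v": "y", "id": 2}, {"id": 1, "v": "x"}]   # same data, new order
corrupted = [{"id": 1, "v": "x"}, {"id": 2, "v": "z"}]   # one value changed

print(reconcile(source, reordered))
print(reconcile(source, corrupted))
```

A count match with a content mismatch points to altered values, while a count mismatch points to lost or duplicated records.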

Non-Functional Testing
  • Data Quality Monitoring
  • Infrastructure
  • Data Security
  • Performance Benchmarking
  • Fault Tolerance
  • Failover Mechanism
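
Data quality monitoring can start with something as simple as a completeness check. The sketch below computes, per field, the fraction of records that carry a non-empty value and flags the fields that fall under a threshold. The function names, the sample fields (id, email) and the 0.95 threshold are assumptions chosen for illustration, not part of any specific tool.

```python
def completeness(records, fields):
    """Fraction of records with a non-empty value, for each field."""
    total = len(records)
    result = {}
    for field in fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        result[field] = filled / total if total else 0.0
    return result

def failing_fields(records, fields, threshold=0.95):
    """Names of fields whose completeness is below the threshold."""
    scores = completeness(records, fields)
    return sorted(f for f, score in scores.items() if score < threshold)

# Hypothetical sample: the email field is empty once and missing once.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3},
]

print(completeness(records, ["id", "email"]))
print(failing_fields(records, ["id", "email"]))
```

Running a check like this on every load, and tracking the scores over time, turns a one-off validation into monitoring.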

I will explain each of these testing types in the next post.