Data Mesh & Data Fabric: 12 Weeks of Implementation

Tech Mahindra Limited

The key objective of setting up a Data Mesh or Data Fabric boils down to making quality data available in a timely fashion, to the right people, and in the right format.

The difficulty of managing data in a heterogeneous data environment leads organizations to adopt modern solutions. Data Mesh and Data Fabric are two concepts for addressing these complications.

A Data Fabric is a conceptual architecture and set of data services that provide frictionless data capabilities across a choice of endpoint applications and services spanning hybrid, multi-cloud, and on-premises environments. A Data Mesh, by contrast, refers to breaking down data lakes and silos into smaller, domain-specific data sets with a self-serve design, to enable data-driven decisions.

In a Data Fabric, governance is centralized, whereas a Data Mesh is decentralized by principle. The key difference between the two is the extent to which teams manage their own data as they see fit.
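The domain-specific, self-serve design described above can be illustrated with a small sketch: each data product carries its own metadata (domain, owner, schema) and declares the upstream products it consumes, so other teams can discover it without asking the owner. The `DataProduct` class and its fields are hypothetical assumptions, not an API from any platform named in this document.

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """A self-described, connected data product (illustrative sketch)."""
    name: str
    domain: str          # e.g. Sales, Finance, Engineering
    owner: str
    schema: dict         # column name -> logical type
    upstream: list = field(default_factory=list)  # products this one consumes

    def describe(self):
        # Self-description that a catalog or consumer can discover
        return {"name": self.name, "domain": self.domain,
                "owner": self.owner, "schema": self.schema,
                "upstream": [p.name for p in self.upstream]}

# A Sales-domain product, and a Finance-domain product built on top of it
sales = DataProduct("sales_orders", "Sales", "sales-team",
                    {"order_id": "string", "amount": "decimal"})
finance = DataProduct("revenue_report", "Finance", "finance-team",
                      {"month": "date", "revenue": "decimal"},
                      upstream=[sales])
```

In a mesh, each such product would be owned and governed by its domain team; in a fabric, the same metadata would feed a centrally governed catalog.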

Key Azure Components

  • Azure Data Factory (ADF) for orchestrating data integration pipelines to integrate data from across hybrid, multi-cloud, SaaS, and legacy enterprise data source systems
  • Azure Databricks for building an open standard data lakehouse using the Delta format
  • Power BI for Business Intelligence
  • Microsoft Purview and Databricks Unity Catalog Federation for Unified Data Governance
  • Azure Machine Learning

Key TechM IPs

  • UDMF – Data Migration & Quality Platform
  • InfoWise – Metadata Governance Platform
  • SPRINTER – Any Source to Any Destination Cloud Migration Accelerator
  • FASTEST – ETL Test Automation Tool
  • CDIF – Self-Service Ingestion Framework for Modern Data Analytics Platforms

Key Business Challenges:

  • A platform that delivers actionable insights to the business
  • Robust data governance
  • Ability to increase the value of hidden data
  • Less time spent preparing data
  • Improved operational efficiency

Key Solution Benefits:

  • Automation
    Global API- and batch-based framework with a Data Register for source and subscriber onboarding, adaptive governance, a metadata graph, and adaptive process automation

  • Seamless Performance
    Scalable, monitored, seamless performance achieved by aligning reporting needs to data product models

  • Reliable Data & Self Heal
    DQ as a Service driven by governance policies, with a robust error framework and use-case-based self-healing capability

  • Self Service
    Automated integration for operational and analytical applications using the Data Registry and global API framework

  • Flexible Integration Framework
    Enable a global Data Product framework to automatically share data within and across data products in both real-time (streaming) and batch (global ETL framework) modes

  • Model Data as Products
    Create data products based on specific domains, consumer groups, or business lines (e.g., Sales, Services, Finance, Engineering, Analytics) that are self-described and connected

  • Data as a Service
    Provide reliable, performant data as a service to consumers across platforms, formats, and delivery channels
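As an illustration of the Automation benefit above, source and subscriber onboarding through a data register can be sketched in a few lines. The `DataRegister` class and its fields are hypothetical assumptions for illustration only, not the actual UDMF or CDIF API.

```python
from dataclasses import dataclass, field

@dataclass
class DataRegister:
    """Minimal sketch of a register for source/subscriber onboarding."""
    sources: dict = field(default_factory=dict)
    subscribers: dict = field(default_factory=dict)

    def onboard_source(self, name, endpoint, mode):
        # mode: "api" for event-driven feeds, "batch" for scheduled loads
        self.sources[name] = {"endpoint": endpoint, "mode": mode}

    def onboard_subscriber(self, name, source, fmt):
        # Subscribers can only attach to sources already registered
        if source not in self.sources:
            raise ValueError(f"unknown source: {source}")
        self.subscribers.setdefault(source, []).append(
            {"name": name, "format": fmt})

register = DataRegister()
register.onboard_source("sales_crm", "https://crm.example.com/api", mode="api")
register.onboard_subscriber("finance_bi", "sales_crm", fmt="parquet")
```

In a real framework the register would also drive adaptive governance and the metadata graph; here it only captures the onboarding contract.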
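The "Reliable Data & Self Heal" benefit can be sketched as policy rules applied per record, where a rule may carry a use-case-specific heal action; records that fail a rule without a heal are routed to the error framework. The rule names and heal logic below are illustrative assumptions.

```python
def run_dq(records, rules):
    """Apply DQ rules to each record; heal where possible, else log an error."""
    errors = []
    for i, rec in enumerate(records):
        for rule in rules:
            if not rule["check"](rec):
                heal = rule.get("heal")
                if heal:
                    rec = heal(rec)       # use-case-based self-heal
                    records[i] = rec
                else:
                    errors.append((i, rule["name"]))  # robust error framework
    return records, errors

# Two illustrative governance-policy rules
rules = [
    {"name": "non_null_id",
     "check": lambda r: r.get("id") is not None},
    {"name": "upper_country",
     "check": lambda r: r.get("country", "").isupper(),
     "heal": lambda r: {**r, "country": r.get("country", "").upper()}},
]

records = [{"id": 1, "country": "de"}, {"id": None, "country": "US"}]
healed, errors = run_dq(records, rules)
```

The first record is self-healed ("de" becomes "DE"); the second cannot be healed and lands in the error list for downstream handling.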
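The "Flexible Integration Framework" benefit (sharing data in both streaming and batch mode) can be sketched as a publisher that pushes each record to a streaming sink immediately while also staging it for the next batch ETL run. All names here are illustrative assumptions, not a specific TechM framework API.

```python
class DualModePublisher:
    """Sketch of dual-mode (streaming + batch) data-product publishing."""

    def __init__(self, batch_size=3):
        self.stream_out = []   # stand-in for a streaming sink (e.g. an event hub)
        self.batch_stage = []  # rows staged for the next batch ETL run
        self.batch_runs = []   # completed batch loads
        self.batch_size = batch_size

    def publish(self, record):
        self.stream_out.append(record)   # real-time path: push immediately
        self.batch_stage.append(record)  # batch path: stage for the ETL run
        if len(self.batch_stage) >= self.batch_size:
            self.batch_runs.append(list(self.batch_stage))
            self.batch_stage.clear()

pub = DualModePublisher()
for r in range(5):
    pub.publish({"row": r})
```

After five publishes, all five records have reached the streaming sink, one batch of three has been loaded, and two records remain staged for the next run.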