Tiger Data Fabric: Accelerator for Implementing Data Lake & Data Fabric on Microsoft Azure
Tiger Analytics' accelerator for implementing a Data Lake & Data Fabric on Microsoft Azure in a highly scalable manner, with powerful low-code, self-service, and governance capabilities at its core.
Tiger Data Fabric is developed following industry best practices and uses proven Azure PaaS and open-source big data services. Key features include:
Agility with Self Service: Cuts the time needed to set up new data pipelines from days or weeks to a few minutes. Use interfaces such as a chatbot, automation scripts, REST APIs, and a modern web UI to manage the entire data platform and configure new data pipelines.
Medallion Lakehouse Architecture: Follows data engineering best practices and is based on the Medallion Lakehouse Architecture, organizing data into Bronze, Silver, and Gold layers.
Scalable Cloud Architecture: Highly scalable, service-oriented architecture built on Azure PaaS and open-source big data technologies.
Transformation Capabilities: Performs data transformation operations such as cleansing, validation, quarantine, auto-profiling, and target merge in a highly configuration-driven manner. An integrated dbt module supports governed transformation pipelines.
DE Virtual Assistant: An integrated virtual assistant (powered by Microsoft Bot Framework and LUIS) for setting up ingestion pipelines, tracking jobs, evaluating data quality, and more, all through a simple chat interface.
Knowledge Graph & Intelligent Search: Integrates a knowledge graph and a search index to provide intelligent data catalog capabilities for both metadata and lineage.
Data Quality: A Great Expectations-based data testing framework validates onboarded datasets for consistency and against user-defined rules.
Additional Governance Capabilities: Powerful framework for logging, notifications and alerts, and Azure cost monitoring.
DevOps: Powerful Infrastructure as Code capabilities with single-click deployment via CI/CD pipelines.
Technologies Used: Azure Data Factory, Azure Databricks, Azure SQL Server, Azure Data Lake Storage Gen2, Elasticsearch, Azure Key Vault, Azure DevOps
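
To make the Bronze, Silver, and Gold flow and the configuration-driven cleansing, validation, quarantine, and target-merge steps more concrete, here is a minimal sketch in plain Python. It is not Tiger Data Fabric's actual implementation or API: `PipelineConfig`, `bronze_to_silver`, `silver_to_gold`, and the sample column names are all hypothetical, and a real deployment would run comparable logic on Azure Databricks against Azure Data Lake Storage Gen2 rather than on in-memory dictionaries.

```python
from dataclasses import dataclass, field
from typing import Callable

Row = dict[str, object]


@dataclass
class PipelineConfig:
    """Hypothetical per-dataset configuration; all field names are illustrative."""
    key: str                      # column used as the merge (upsert) key
    required: list[str]           # columns that must be non-empty
    rules: dict[str, Callable[[object], bool]] = field(default_factory=dict)


def cleanse(row: Row) -> Row:
    """Trim whitespace from string values and turn empty strings into None."""
    cleaned: Row = {}
    for column, value in row.items():
        if isinstance(value, str):
            value = value.strip() or None
        cleaned[column] = value
    return cleaned


def validate(row: Row, cfg: PipelineConfig) -> list[str]:
    """Return a list of validation errors; an empty list means the row is valid."""
    # The merge key is always treated as required; dict.fromkeys removes duplicates.
    columns = dict.fromkeys([cfg.key, *cfg.required])
    errors = [f"missing:{c}" for c in columns if row.get(c) is None]
    for column, rule in cfg.rules.items():
        value = row.get(column)
        if value is None:
            continue
        try:
            passed = rule(value)
        except (TypeError, ValueError):
            passed = False        # a rule that cannot evaluate counts as a failure
        if not passed:
            errors.append(f"rule_failed:{column}")
    return errors


def bronze_to_silver(bronze: list[Row], silver: dict[object, Row],
                     cfg: PipelineConfig) -> list[Row]:
    """Cleanse and validate Bronze rows, upsert valid rows into Silver by key,
    and return the quarantined rows annotated with their errors."""
    quarantine: list[Row] = []
    for raw in bronze:
        row = cleanse(raw)
        errors = validate(row, cfg)
        if errors:
            quarantine.append({**row, "_errors": errors})
        else:
            silver[row[cfg.key]] = row   # target merge: insert or overwrite
    return quarantine


def silver_to_gold(silver: dict[object, Row], group_by: str,
                   measure: str) -> dict[object, float]:
    """Aggregate Silver rows into a Gold-style summary (sum of measure per group)."""
    totals: dict[object, float] = {}
    for row in silver.values():
        group = row[group_by]
        totals[group] = totals.get(group, 0.0) + float(row[measure])
    return totals


# Usage with made-up sample data.
cfg = PipelineConfig(
    key="order_id",
    required=["region", "amount"],
    rules={"amount": lambda v: float(v) >= 0},
)
bronze = [
    {"order_id": "A1", "region": " east ", "amount": "10.5"},
    {"order_id": "A2", "region": "", "amount": "4"},       # missing region
    {"order_id": "A3", "region": "west", "amount": "-1"},  # fails the amount rule
    {"order_id": "A1", "region": "east", "amount": "12"},  # later version of A1
]
silver: dict[object, Row] = {}
quarantined = bronze_to_silver(bronze, silver, cfg)
gold = silver_to_gold(silver, group_by="region", measure="amount")
```

In this sketch, the dataset-specific behaviour lives entirely in `PipelineConfig`, which is one way to read "configuration-driven": onboarding a new dataset means supplying a new configuration rather than writing new pipeline code. Invalid rows are kept in a quarantine list with their error reasons instead of being dropped, so they can be inspected and replayed later.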