Constructing a data platform is a challenging task, and with an increasing number of deployment options available, it can be difficult to evaluate whether the correct design choices have been made.
Our process for reviewing Data Platform Architecture is centered on the three primary components of any platform: People, Process, and Technology.
We base our review on our substantial experience working with our clients and partners, including Microsoft and Databricks.
Through this collaboration, we have gained an understanding of how a platform should be designed not only to address the challenges that companies currently face but also to prepare for those that will arise in the near future.
Our review comprehensively examines the following crucial areas:
- Architecture: We evaluate the current state of your architecture and perform a gap analysis based on both the Microsoft Azure Well-Architected Framework and our independent research.
- Integration: We assess how your data is ingested and whether there is extensive use of automation. We also determine whether the platform supports all types of data (structured and unstructured).
- Data Storage: We review how your data is stored and whether it is secured according to the principle of least privilege. We also evaluate whether it is optimized for performance.
- Data Curation: We investigate how your data is cleaned and shaped, and if it is curated into an appropriate data model that will be governed.
- Serving Layer: We examine how data is served and if the platform uses serverless architectures to reduce cost and complexity.
- Visualization: We evaluate how reports are currently run and how Power BI queries big data. We also determine whether Power BI is optimized for consuming high volumes of data.
- Data Science: We assess if the platform is ready for machine learning and if there is an approach to MLOps currently in place.
- DataOps/DevOps: We review how the platform is deployed and whether DevOps practices are used extensively.
- Performance: We evaluate if the platform is designed to scale.
- Cost Management: We determine if the costs are understood and if there are effective cost management strategies in place.
We cover all of this in 10 days and produce a series of actionable recommendations, which can be implemented by your team or by us.
We focus on all the major Azure data products, and we can optionally extend the scope to non-data products where required. We cover the following:
- Azure Databricks
- Azure Synapse Analytics
- Azure Data Lake Storage Gen2
- Azure Data Factory
- Azure Machine Learning
- Azure Key Vault, Azure Virtual Network (VNet), Azure SQL Database, and many more components.