Azure Databricks: 1-Wk Proof of Concept

Thorogood Associates

Azure Databricks is an exciting addition to the Azure platform, whether you’re looking to transform and clean large volumes of data, or collaborate with colleagues to build advanced analytics jobs.

Azure Databricks builds on Apache Spark’s speed and flexibility to process massive volumes of data, applying a wide range of statistical and machine learning algorithms efficiently at scale. By adding features such as integrated security, collaboration and sharing tools, and closer integration with front-end tools such as Power BI, Databricks makes an excellent all-round platform for data analysis.

In this 1-week Proof of Concept (PoC), Thorogood consultants will work with your team to demonstrate how Azure Databricks can help meet your analysis goals, from data transformation through to building data science models. During the PoC, we’ll deploy Databricks and identify one or two important use cases in your organization to show how your team can use it to build and share analyses.

### What we’ll deliver:

  • Kick-off meeting with your team to identify key use cases for the PoC, focusing on business decisions where improvement is expected to deliver value
  • Set up Azure Databricks and load appropriate data for the use cases
  • Introduce other Azure capabilities as needed, such as Azure Data Lake Storage for storing source data, or Azure Data Factory for pipeline automation
  • Work with your team to show how you can use a combination of Spark SQL, R and Python in Databricks to prepare data for analysis
  • Discuss appropriate modelling techniques for the use cases identified, considering how the outputs can be used to drive better decisions
  • Present options for deploying the data and models in a repeatable way, for example through Power BI dashboards that end users can interact with to explore the latest insights from the model