Microsoft Purview does not currently support lineage mapping for data processes in a Synapse notebook. Hexaware, a Microsoft Partner, has developed complementary code to address this gap.
A specialty insurer in the London market wanted to modernize its data management system and strengthen data security. It was held back by a lack of harmonized data for analysis, data silos across multiple systems, and scalability concerns.
Their on-premises SQL server database was moved to a cloud data lake and subsequently to a data mart. Purview could read data lineage across all layers but needed a connector for Azure Analysis Services.
Hexaware, their implementation partner, assisted in developing a Data Lake and Analytics platform on Azure Cloud using Azure Synapse Analytics.
Key Tech Implementations
Azure Purview provides data governance for the platform and helps build lineage for data as it moves from Synapse Analytics to the target system. Microsoft Purview can identify tables and processes but does not currently read the query behind a table. Our approach uses a PySpark notebook to load data from the gold layer (business-level tables) to the data mart. A limitation of Azure Synapse is that it does not push lineage data from the PySpark notebook to Azure Purview.
With PyApacheAtlas, Hexaware wrote code for custom lineage that reads an Excel template and interprets the metadata of the source, target, and process. This helps create custom lineage for activities (such as notebooks, stored procedures, and UDFs) that are not fully supported in Synapse Analytics. The custom lineage process can be automated using a script and an Excel file.
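To make the idea concrete, here is a minimal sketch of how rows from such a template might be turned into Apache Atlas style process entities that link sources to targets. It is not Hexaware's code: the row keys (`source_qn`, `target_qn`, and so on), the example qualified names, the entity type names, and the helper functions are all hypothetical, and the rows are hard-coded rather than read from Excel so the example stays self-contained.

```python
import json

# Hypothetical rows, shaped as they might look after reading the Excel template.
# All keys, type names, and qualified names below are illustrative assumptions.
ROWS = [
    {
        "source_type": "azure_datalake_gen2_path",
        "source_qn": "https://example.dfs.core.windows.net/gold/policy",
        "target_type": "azure_sql_table",
        "target_qn": "mssql://example.database.windows.net/mart/dbo/fact_policy",
        "process_name": "load_fact_policy",
        "process_qn": "synapse://example/notebooks/load_fact_policy",
    },
    {
        "source_type": "azure_datalake_gen2_path",
        "source_qn": "https://example.dfs.core.windows.net/gold/claim",
        "target_type": "azure_sql_table",
        "target_qn": "mssql://example.database.windows.net/mart/dbo/fact_policy",
        "process_name": "load_fact_policy",
        "process_qn": "synapse://example/notebooks/load_fact_policy",
    },
]


def object_ref(type_name, qualified_name):
    """Reference an existing catalog entity by type and qualified name."""
    return {
        "typeName": type_name,
        "uniqueAttributes": {"qualifiedName": qualified_name},
    }


def build_lineage_payload(rows, process_type="Process"):
    """Group rows by process and emit one process entity per notebook."""
    processes = {}
    for row in rows:
        key = row["process_qn"]
        if key not in processes:
            processes[key] = {
                "typeName": process_type,
                # Negative GUIDs mark entities that do not exist yet.
                "guid": str(-(len(processes) + 1)),
                "attributes": {
                    "name": row["process_name"],
                    "qualifiedName": key,
                    "inputs": [],
                    "outputs": [],
                },
            }
        attrs = processes[key]["attributes"]
        source = object_ref(row["source_type"], row["source_qn"])
        target = object_ref(row["target_type"], row["target_qn"])
        # Skip duplicates so a shared source or target is listed only once.
        if source not in attrs["inputs"]:
            attrs["inputs"].append(source)
        if target not in attrs["outputs"]:
            attrs["outputs"].append(target)
    return {"entities": list(processes.values())}


if __name__ == "__main__":
    print(json.dumps(build_lineage_payload(ROWS), indent=2))
```

Each process entity lists its sources under `inputs` and its targets under `outputs`, which is how Atlas based catalogs draw a lineage edge through a process. In practice the resulting entities would still need to be uploaded to the catalog, for example with PyApacheAtlas's `PurviewClient.upload_entities`, and the referenced source and target assets are assumed to exist there already.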
Reach out to our Azure experts to understand how you can integrate Azure Purview into your organization. For more details, contact firstname.lastname@example.org. http://www.hexaware.com