Data versioning, lineage & provenance add-on for Azure Machine Learning, powered by Pachyderm.
Pachyderm adds immutable versioned datasets to Azure Machine Learning that automatically populate, so you don't have to manually manage dataset versions. You can safely call different versions of your dataset without worrying about the data referenced being overwritten. Keep track of the precise data lineage and provenance (where data came from) so that you can improve the auditability and governance of the Machine Learning models you ship into production. Data engineers can make these datasets available in an Azure Machine Learning Workspace via Pachyderm Datastore. Data scientists can continue to use Azure ML Python libraries to access versioned datasets.