
Data Version Unifier

Teia Consultoría en Sistemas SA de CV

Unify information from the data schemas of different versions of entities

Starting from the data schemas of different versions of entities, such as CFDIs, purchase orders (PO), and catalogs of products, suppliers, or customers, among others, this solution standardizes the metadata and integrates the information, avoiding errors caused by differing version schemas. It also integrates new tables with old schemas, unifying them into a destination table.
Internally, the solution cycles through the different versions and places each version in its corresponding destination according to the comparison made during that cycle (data lineage).
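
The cycle described above can be pictured with a minimal Python sketch. It is only an illustration, not the product's implementation: the destination columns, the version labels "v1" and "v2", the column names, and the functions unify_row and unify_all are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical destination schema (assumption, not taken from the product).
DESTINATION_COLUMNS = ["invoice_id", "issued_at", "total"]

# Hypothetical mapping per schema version: source column -> destination column.
VERSION_MAPPINGS: Dict[str, Dict[str, str]] = {
    "v1": {"folio": "invoice_id", "fecha": "issued_at", "total": "total"},
    "v2": {"Folio": "invoice_id", "Fecha": "issued_at", "Total": "total"},
}


def unify_row(version: str, row: Dict[str, Any]) -> Dict[str, Any]:
    """Rename the columns of one source row to the destination schema."""
    mapping = VERSION_MAPPINGS[version]
    # Start with every destination column empty so all rows share one shape.
    unified: Dict[str, Any] = {column: None for column in DESTINATION_COLUMNS}
    for source_column, value in row.items():
        target = mapping.get(source_column)
        if target is not None:  # unmapped source columns are skipped
            unified[target] = value
    # Record the originating version so each row can be traced back (lineage).
    unified["source_version"] = version
    return unified


def unify_all(rows_by_version: Dict[str, List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Cycle through every version and collect rows in the destination shape."""
    unified_rows = []
    for version, rows in rows_by_version.items():
        for row in rows:
            unified_rows.append(unify_row(version, row))
    return unified_rows
```

In this sketch a column with no mapping is dropped and a missing destination column stays empty; a real process would more likely report both cases for review.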

The source and target can be any combination of connected, supported relational database providers (RDBMS), for example:
  • SQL Server 2012 or later
  • Azure SQL Database
  • SQL Server on a virtual machine
  • Azure Database for PostgreSQL
  • Azure Database for MySQL
  • Among others.

Having different schemas, and therefore being unable to compare the information for the same entity, is a common problem in projects that need to validate, operate on, compare, and integrate historical information, or to use derived columns.

This solution automatically detects some simple patterns in fields; however, for optimal operation the process requires business rules and manual validation when no simple pattern is detected between the fields. The differences between source and destination appear in a comparison matrix to facilitate their validation. Once you have the complete mapping of the different versions, you can perform the required operations:
  • Validation
  • Comparison
  • Integration.
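
As an illustration of the comparison matrix idea, the following Python sketch pairs source and destination columns with one simple pattern (names that are equal once case and separators are ignored) and flags everything else for manual validation. The matching rule, the status labels, and the function names are assumptions made for this example; the patterns the solution actually detects are not specified here.

```python
import re
from typing import List, Optional, Tuple

MatrixRow = Tuple[Optional[str], Optional[str], str]


def normalize(name: str) -> str:
    """Lowercase a column name and keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def comparison_matrix(source_columns: List[str],
                      destination_columns: List[str]) -> List[MatrixRow]:
    """Return (source, destination, status) rows for every column."""
    by_normalized = {normalize(column): column for column in destination_columns}
    matrix: List[MatrixRow] = []
    for source in source_columns:
        match = by_normalized.get(normalize(source))
        status = "auto" if match is not None else "manual review"
        matrix.append((source, match, status))
    # Destination columns that no source column matched also need review.
    matched = {destination for _, destination, _ in matrix if destination is not None}
    for destination in destination_columns:
        if destination not in matched:
            matrix.append((None, destination, "manual review"))
    return matrix
```

For example, comparison_matrix(["Invoice_ID", "rfc_emisor"], ["invoice_id", "issuer_tax_id"]) pairs the first column automatically and leaves both "rfc_emisor" and "issuer_tax_id" for manual review, which is where a business rule would be supplied.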

Once the process is finished, a notification is sent by email to the administrators with a report of the integration results.

The comparisons are saved and can be integrated into a script that can be versioned, then modified and reused later.
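
One way to keep a saved comparison under version control is to serialize it as stable, diff-friendly text. The Python sketch below assumes the mapping is a plain dictionary and uses JSON as the storage format; both are assumptions for illustration, since the format the solution uses is not described here.

```python
import json
from typing import Dict


def mapping_to_text(mapping: Dict[str, Dict[str, str]]) -> str:
    """Serialize a mapping with sorted keys so version-control diffs stay small."""
    return json.dumps(mapping, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def mapping_from_text(text: str) -> Dict[str, Dict[str, str]]:
    """Load a previously saved mapping so it can be modified and reused."""
    return json.loads(text)
```

Sorting the keys means that two saves of the same mapping produce identical text, so a change in the file always reflects a real change in the mapping.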

This solution is built on Databricks, which allows it to handle large volumes of data.