Olive Data Ingestion Framework (ODIF) - a cloud-agnostic data ingestion framework
Olive Data Ingestion Framework (ODIF) is a data ingestion tool that can connect to any source and sink to make data ingestion and transfer faster and easier. ODIF is built with a cloud-agnostic approach, requires no pre-installed cluster, and can be deployed with a minimal resource footprint. It provides a user-friendly web interface that helps users with data source registration, job configuration, job runs, and monitoring.
ODIF guiding principles
- Cloud Native Design
- Platform Agnostic
- Dynamic Compute
- API Driven
- IaC (Infrastructure as Code)
- Reusable connectors : Once created, a connector can be used as a source as well as a sink.
- RDBMS Source : Lets you select multiple databases and tables, and ingest either the complete dataset or a subset filtered with a WHERE clause.
- Split Job : When the input source is large, the job is split based on dataset size, which accelerates ingestion.
- File Format : Supports CSV, TXT, Parquet, and JSON file formats at the sink.
- Load Type : Supports incremental load for regular ingestion and full load for historical or one-time loads.
- UI & REST API : Supports a web interface as well as REST APIs.
- Job Scheduling : Jobs can be scheduled to run at a given time interval.
- Livy Support : Supports Livy on a static cluster.
- Cluster Type : ODIF uses a cloud- and platform-agnostic approach, so it can run on static as well as on-demand clusters (AWS, Azure, GCP).
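
To make the Split Job, RDBMS Source, and Load Type features more concrete, here is a minimal Python sketch of how a large table read could be divided into key ranges, with an optional filter for incremental loads. It is an illustration only and not ODIF's actual implementation: the names `Split`, `plan_splits`, and `build_query`, the numeric split column, and the watermark column are all hypothetical assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Split:
    """One chunk of a larger ingestion job (hypothetical structure)."""
    lower: int  # inclusive lower bound of the split column
    upper: int  # exclusive upper bound of the split column


def plan_splits(min_id: int, max_id: int, rows_per_split: int) -> List[Split]:
    """Divide the key range [min_id, max_id] into contiguous chunks."""
    if rows_per_split <= 0:
        raise ValueError("rows_per_split must be positive")
    splits: List[Split] = []
    lower = min_id
    while lower <= max_id:
        # The last chunk may be smaller than rows_per_split.
        upper = min(lower + rows_per_split, max_id + 1)
        splits.append(Split(lower, upper))
        lower = upper
    return splits


def build_query(
    table: str,
    split_column: str,
    split: Split,
    watermark_column: Optional[str] = None,
    last_watermark: Optional[str] = None,
) -> str:
    """Build a SELECT for one split.

    With a watermark, only rows changed since the last run are read
    (incremental load); without one, the whole range is read (full load).
    Assumption: inputs are trusted. Real code should use parameterized
    queries instead of string formatting.
    """
    conditions = [
        f"{split_column} >= {split.lower}",
        f"{split_column} < {split.upper}",
    ]
    if watermark_column is not None and last_watermark is not None:
        conditions.append(f"{watermark_column} > '{last_watermark}'")
    return f"SELECT * FROM {table} WHERE " + " AND ".join(conditions)


if __name__ == "__main__":
    for s in plan_splits(1, 10, 4):
        print(build_query("orders", "id", s, "updated_at", "2024-01-01"))
```

Each generated query could then be handed to a separate worker, so the chunks are read in parallel. How ODIF actually sizes and distributes splits is not described here, so treat the range-based approach as one possible design.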