StreamSets for Databricks

StreamSets
Collaborative development, automated deployment and integrated governance of dataflow pipelines.

StreamSets for Databricks combines two data plane engines from the StreamSets DataOps platform for building, testing, and deploying ingestion, transformation, and ML jobs with Databricks.

Once deployed, the instance provides two services: Data Collector and Transformer.

Data Collector provides an execution engine to visually build and run continuous dataflow pipelines that ingest data from a wide variety of sources. Use Data Collector to ingest data into Databricks Delta Lake.

StreamSets Transformer is an execution engine to visually create data processing pipelines for ETL and ML that execute on Apache Spark. Use Transformer to design DAGs that are pushed down into Databricks for scalable processing and elastic workloads.

Usage Instructions
It takes a few minutes for the VM to be deployed. Once the VM is running, two StreamSets services are available: Data Collector and Transformer.
Note: The instance is automatically configured to allow TCP traffic from anywhere on Data Collector default port 18630 and Transformer default port 19630. StreamSets highly recommends restricting access to these ports based on your organizational rules.

StreamSets Data Collector:
The StreamSets Data Collector web-based user interface is available on port 18630. To access Data Collector, enter the following URL in the address bar of your browser: http://[Public IP of the VM]:18630
For example, if the Public IP of the VM is 123.123.123.123, enter http://123.123.123.123:18630 in the browser. To log in to the Data Collector UI, use the following credentials: Username: admin Password: admin

StreamSets Transformer:
The StreamSets Transformer web-based user interface is available on port 19630. To access Transformer, enter the following URL in the address bar of your browser: http://[Public IP of the VM]:19630
For example, if the Public IP of the VM is 123.123.123.123, enter http://123.123.123.123:19630 in the browser. To log in to the Transformer UI, use the following credentials: Username: admin Password: admin
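
The URL pattern for both services can be sketched in a few lines of Python, which may help when checking that a freshly deployed VM is reachable. This is a minimal illustration, not part of the StreamSets product: the helper names `service_urls` and `is_port_open` are hypothetical, and only the default ports 18630 and 19630 come from the instructions above.

```python
import socket

# Default ports stated in the usage instructions above.
DATA_COLLECTOR_PORT = 18630
TRANSFORMER_PORT = 19630


def service_urls(public_ip: str) -> dict[str, str]:
    """Build the UI URL of each service from the VM's public IP (hypothetical helper)."""
    return {
        "Data Collector": f"http://{public_ip}:{DATA_COLLECTOR_PORT}",
        "Transformer": f"http://{public_ip}:{TRANSFORMER_PORT}",
    }


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `service_urls("123.123.123.123")` yields the two example URLs shown above. `is_port_open` only confirms that the TCP port accepts connections; it does not verify that the UI itself has finished starting.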
