Sparkling Water for HDInsight

100% open source, fully distributed in-memory machine learning platform with linear scalability

Sparkling Water allows users to combine the fast, scalable machine learning algorithms of H2O with the capabilities of Spark. With Sparkling Water, users can drive computation from Scala/R/Python and utilize the H2O Flow UI, providing an ideal machine learning platform for application developers. It intelligently combines the following features:
  • Best of Breed Open Source Technology – Enjoy the freedom that comes with big data science powered by open source technology.
  • Easy-to-use Web UI and Familiar Interfaces – Set up and get started quickly using either H2O’s intuitive web-based Flow GUI or familiar interfaces such as R, Python, Java, Scala, and JSON, or work directly through H2O’s APIs.
  • Data Agnostic Support for all Common Database and File Types – Easily explore and model big data from within Microsoft Excel, RStudio, Tableau and more. Connect to data from HDFS, S3, SQL and NoSQL data sources.
  • Spark + H2O – Seamlessly transition between Spark and H2O, combining data mining in Spark with machine learning in H2O.
  • Real-time Data Scoring – Rapidly deploy models to production via plain-old Java objects (POJO), model-optimized Java objects (MOJO), or the REST API.