H2O Artificial Intelligence for HDInsight
100% open source, fully distributed in-memory machine learning platform with linear scalability
Sparkling Water lets users combine H2O's fast, scalable machine learning algorithms with the capabilities of Spark. Users can drive computation from Scala, R, or Python and use the H2O Flow UI, which makes Sparkling Water a practical machine learning platform for application developers. It combines the following features:
  • Best-of-Breed Open Source Technology – Enjoy the freedom that comes with big data science powered by open source technology.
  • Easy-to-Use Web UI and Familiar Interfaces – Set up and get started quickly using either H2O’s intuitive web-based Flow GUI or familiar programming environments such as R, Python, Java, and Scala, or work with JSON through H2O’s APIs.
  • Data-Agnostic Support for All Common Database and File Types – Easily explore and model big data from within Microsoft Excel, RStudio, Tableau, and more. Connect to data from HDFS, S3, SQL, and NoSQL data sources.
  • Massively Scalable Big Data Munging and Analysis – H2O Big Joins ran 7x faster than R's data.table in a benchmark and scales linearly to joins of 10 billion x 10 billion rows.
  • Real-Time Data Scoring – Rapidly deploy models to production as plain old Java objects (POJOs), model-optimized Java objects (MOJOs), or through a REST API.