Data Science Virtual Machine for Linux (CentOS)

Virtual machine with tools for data science modeling and development

This virtual machine, built on OpenLogic CentOS-based Linux, contains popular tools for data science and development activities, including Microsoft R Open, Anaconda Python, Azure command line tools, and Jupyter notebooks for Python, R, and Julia. It also includes machine learning tools and algorithms such as mxnet, CNTK, Vowpal Wabbit, and xgboost.

What's new

The Linux data science virtual machine now includes:
  • Microsoft R Server 9.0, now with Microsoft R Open 3.3.2 and new options for operationalizing R models
  • Weka for easy graphical exploration and machine learning
  • Apache Drill for querying non-relational data using SQL
  • Spark local 2.0.2 with a PySpark Jupyter kernel
  • Single-node local Hadoop (HDFS, YARN)
  • IDEs and editors: Visual Studio Code, IntelliJ IDEA, PyCharm, and Atom
  • mxnet for deep learning
  • JuliaPro, a curated distribution of the Julia language and tools

You can view a full list of installed tools for the Linux edition here.
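
To confirm which of the tools listed above are reachable from a shell on a given VM, a minimal Python sketch along these lines may help. The command names in CANDIDATE_COMMANDS are assumptions made for illustration; this listing does not state the executable names, and the actual names or install locations on the image may differ.

```python
import shutil

# Hypothetical command names: the listing names the tools but not their
# executables, so each value here is an assumption to adjust as needed.
CANDIDATE_COMMANDS = {
    "R (Microsoft R Open / R Server)": "R",
    "Anaconda Python": "python",
    "Jupyter": "jupyter",
    "Julia (JuliaPro)": "julia",
    "Spark": "spark-submit",
    "Hadoop HDFS": "hdfs",
    "Visual Studio Code": "code",
}


def find_tools(commands):
    """Map each tool name to its resolved executable path, or None if not on PATH."""
    return {tool: shutil.which(cmd) for tool, cmd in commands.items()}


if __name__ == "__main__":
    for tool, path in find_tools(CANDIDATE_COMMANDS).items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

A tool reported as not found may still be installed outside the PATH, so the full list of installed tools remains the authoritative reference.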