With so much noise around Apache Spark, let’s look into how to get started with Spark in local mode and execute a simple Scala program. A lot of complex combinations are possible, but we will look at the minimum steps required to get started with Spark.
Most Big Data software is developed with Linux as the primary platform, and porting to Windows has been an afterthought. It will be interesting to see how Big Data on Windows evolves in the future. Spark can run on both Windows and Linux, but we will use Linux (Ubuntu 14.04 64-bit Desktop) here.
So, here are the steps:
1) Download and install Oracle VirtualBox.
2) Download and install Ubuntu.
3) Apply the latest updates to Ubuntu from a terminal, then reboot.
    sudo apt-get update; sudo apt-get dist-upgrade
4) Oracle Java doesn’t come with Linux distributions, so it has to be installed manually on top of Ubuntu as mentioned here.
5) Spark is written in Scala, so we need to install Scala.
    sudo apt-get install scala
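Before moving on, it can help to confirm that steps 4 and 5 actually left Java and Scala on the PATH. The following is only a minimal sketch: the helper name check_tool is made up for this illustration, and it assumes the binaries are called java and scala, which is the usual case but may differ on your setup.

```shell
#!/bin/sh
# check_tool is a hypothetical helper for this post, not part of any package.
# It prints whether the given command is on the PATH and returns 0 if found,
# 1 otherwise.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
        return 0
    else
        echo "$1: missing"
        return 1
    fi
}

# Assumption: the prerequisites are exposed as "java" and "scala".
missing=0
for tool in java scala; do
    check_tool "$tool" || missing=1
done

if [ "$missing" -eq 0 ]; then
    echo "All prerequisites are on the PATH."
else
    echo "Install the missing tools before continuing."
fi
```

If both tools are reported as found, running java -version and scala -version will show which versions were installed.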