Installing Spark on EC2
This is an account of setting up Spark on my small EC2 cluster of two m3.medium spot instances. Spot instances are a good way to save money compared with on-demand prices, and you keep your instances for as long as the spot price stays below your chosen maximum bid. There are many well-written guides on setting up Spark on an EC2 cluster, but I still got stuck in a few places. I describe those places here, along with the reason I got stuck in each, in the hope that this helps anyone who runs into similar problems. I will not go into the details of every step, only of the troubleshooting parts.

Step 1: Create an IAM service role for EC2. This step is not required for setting up Spark; it is needed only when accessing other AWS services.

Step 2: Create a security group with SSH access from your local work machine. This step is crucial, as without it we cannot SSH into the EC2 machine.

Step 3: Launch EC2 instances with IAM ro...