First steps with Apache Spark on K8s (part #1): preparation of environment

Everything you read here is personal opinion, possibly mixed with gaps in my knowledge :) Please feel free to contact me so I can fix any incorrect parts.

To start working with Spark on Kubernetes, the first step is to prepare a sandbox. In my case I’m using a Windows 10 machine with Hyper-V. For this sandbox I can dedicate 16 GB of RAM and 8 threads.

All scripts from this article can be found at: github.com/domasjautakis/docker-minikube

As the OS I will use CentOS 7 Minimal.

The first preparation step is to install all the packages that might be required for installation or configuration.

#1.
yum install dnf -y
yum install dnf-plugins-core -y
yum install wget -y
yum install java-11-openjdk-devel -y
yum install git -y
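A quick way to confirm that the packages landed is to check that each command is on the PATH. This is only a minimal sketch: the helper name missing_commands is my own and is not part of the scripts in the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every command from the argument list
# that cannot be found on the PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}

# Commands provided by the packages installed above.
# dnf-plugins-core is a dnf plugin package, so it is not checked here.
missing=$(missing_commands dnf wget java git)
if [ -z "$missing" ]; then
    echo "all required commands are available"
else
    echo "still missing:"
    echo "$missing"
fi
```

If anything is reported as missing, rerun the matching yum install line before moving on.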

Let’s create a dedicated user and switch to it, since we’re going to do the whole installation from that account.

#2.
adduser minikubeuser
passwd minikubeuser
#**New password:
usermod -aG wheel minikubeuser
su - minikubeuser
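Since the later steps rely on sudo, it may be worth confirming that the new user really ended up in the wheel group. The sketch below uses id -nG for that; the helper has_word is a name I made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the space-separated list in $1
# contains the exact word given in $2.
has_word() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

# id -nG prints the group names of the user, separated by spaces.
user_groups=$(id -nG minikubeuser 2>/dev/null)
if has_word "$user_groups" wheel; then
    echo "minikubeuser is in wheel"
else
    echo "minikubeuser is NOT in wheel (or does not exist)"
fi
```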

In the next step I will:

  • update the system and disable SELinux
  • add the Docker dnf repo
  • install Docker
  • start the service
  • add firewall masquerading
  • change the mode of docker.sock

The mode change is needed so that the non-root user can use the Docker socket.

#3.
sudo dnf update -y
sudo setenforce 0
sudo sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
sudo dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
sudo dnf install docker-ce -y
sudo systemctl start docker
sudo systemctl enable docker
sudo firewall-cmd --zone=public --add-masquerade --permanent
sudo firewall-cmd --reload
sudo chmod 666 /var/run/docker.sock
docker ps --no-trunc
docker image ls

If you have issues with docker ps, make sure that the last step (the mode change) has completed. More can be found at: https://stackoverflow.com/questions/48957195/how-to-fix-docker-got-permission-denied-issue
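To see whether the current user can actually reach the socket, a small check like the one below can help. It is a sketch under my own assumptions: the helper name can_read_write is hypothetical, and the docker group hint it prints is the narrower alternative from Docker's post-installation notes, not something the scripts in this article do.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the current user can both read
# and write the given path.
can_read_write() {
    [ -r "$1" ] && [ -w "$1" ]
}

sock=/var/run/docker.sock
if can_read_write "$sock"; then
    echo "docker socket is accessible"
else
    echo "no access to $sock"
    echo "either repeat the chmod step, or add the user to the docker group:"
    echo "  sudo usermod -aG docker \$USER   (then log out and back in)"
fi
```

Keep in mind that mode 666 lets every local user talk to the Docker daemon, which is fine for a throwaway sandbox but probably not for a shared machine.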

In the next step I will install conntrack, which Kubernetes needs when minikube runs with the none driver.

#4.
sudo dnf install conntrack -y

Let’s install kubectl and minikube now:

#5.
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
chmod +x minikube
sudo mkdir -p /usr/local/bin/
sudo install minikube /usr/local/bin/
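The Kubernetes install documentation publishes a .sha256 file next to each kubectl binary, so the download can be verified. The following is a minimal sketch of such a check; verify_sha256 is a hypothetical helper name, and <version> in the usage comment is a placeholder for the release you actually downloaded.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare the SHA-256 of a file ($1) with the hex
# digest stored in the first field of a checksum file ($2).
verify_sha256() {
    local expected actual
    expected=$(cut -d ' ' -f 1 "$2")
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage (the checksum must belong to the same release as the binary):
#   curl -LO "https://dl.k8s.io/release/<version>/bin/linux/amd64/kubectl.sha256"
#   verify_sha256 /usr/local/bin/kubectl kubectl.sha256 && echo "kubectl: OK"
```

After that, kubectl version --client and minikube version are a simple way to confirm that both binaries run.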

We’re almost there. Next, start minikube!

#6.
minikube start --driver=none
minikube status
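After the start it can take a little while for the node to report Ready. A small retry loop is one way to wait for that instead of rerunning the status command by hand. This is a sketch, not part of the original scripts: the retry helper is my own, and the usage line assumes kubectl already points at the minikube cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts, and succeed as soon as the command succeeds.
retry() {
    local attempts="$1"
    local delay="$2"
    local i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage:
#   retry 12 10 kubectl wait --for=condition=Ready node --all --timeout=10s
#   kubectl get nodes
```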

It’s worth checking which addons are enabled. Functionality such as HPA (Horizontal Pod Autoscaler) might require metrics-server.

#7.
minikube addons list
minikube addons enable metrics-server
minikube addons list
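If you would rather check the addon state in a script than read the table by eye, something along these lines could work. It is a rough sketch that assumes the list prints one addon per line together with the word enabled or disabled; the exact layout may differ between minikube versions, and addon_enabled is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read an addon listing on stdin and succeed if the
# line mentioning the addon named in $1 also contains the word "enabled".
# Assumes one addon per line, as in the table printed by minikube.
addon_enabled() {
    grep -wF -- "$1" | grep -qw 'enabled'
}

# Usage:
#   minikube addons list | addon_enabled metrics-server \
#       && echo "metrics-server is enabled"
```

Once the addon is running, kubectl top nodes should start returning CPU and memory figures, although it can take a minute or so before the first metrics appear.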

The final step in environment preparation is to install Helm:

#8.
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh

Done!
