
Running Jupyter on Kubernetes with Spark (and Why Ilum Makes It Easy)

If you’ve done data science long enough, you’ve met this moment: you open a Jupyter notebook, hit Shift+Enter, and the cell just… hangs. Somewhere a Spark driver is starved for memory, or a cluster is “almost” configured, or the library versions don’t quite agree. You’re here to explore data, not babysit infrastructure.

That’s where Kubernetes shines and where Ilum, a free, cloud-native data lakehouse focused on running and monitoring Apache Spark on Kubernetes, helps you go from “why is this stuck?” to “let’s try that model” with far less friction.

In this post, we’ll take a narrative walk from a clean laptop to a working Jupyter and Apache Zeppelin setup on Kubernetes, both backed by Apache Spark and connected through a Livy-compatible API. You’ll see how Ilum plugs into the picture, why it matters for day-to-day data work, and how to scale from a single demo box to a real cluster without rewriting your notebooks.

Why Kubernetes for Data Science

Kubernetes (K8s) gives you three things notebooks secretly crave:

  • Elasticity: Executors scale up when a join explodes, and back down when you’re idle.
  • Isolation: Each user or team runs in a clean, containerized environment—no shared-conda-env roulette.
  • Repeatability: “It works on my machine” becomes “it works because it’s declarative.”

Pair K8s with Spark and a notebook front-end, and you get an interactive analytics platform that grows with your data. Pair it with Ilum, and you also get the plumbing and observability—logs and metrics—that keep you moving when things go sideways.

The missing piece: speaking Livy

Both Jupyter (via Sparkmagic) and Zeppelin (via its Livy interpreter) speak the Livy REST API to manage Spark sessions. Ilum implements that API through an embedded ilum-livy-proxy, so your notebooks can create and use Spark sessions while Ilum handles the Spark-on-K8s lifecycle behind the scenes.

Think of it this way:

Notebook → Livy API → Ilum → Spark on Kubernetes
You write cells. Ilum speaks Kubernetes. Spark does the heavy lifting.

No special notebook rewrites, no custom glue.
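
If you’re curious what that traffic actually looks like, here is a minimal sketch in plain Python against the standard Livy REST endpoints. It assumes you’ve port-forwarded the proxy locally (e.g., kubectl port-forward svc/ilum-livy-proxy 8998:8998), and it’s roughly the dance Sparkmagic and Zeppelin perform for you:

import json
import time

import requests

LIVY_URL = "http://localhost:8998"  # assumed local port-forward to ilum-livy-proxy

# 1. Ask for a PySpark session (Ilum turns this into a Spark driver on Kubernetes)
session = requests.post(f"{LIVY_URL}/sessions", json={"kind": "pyspark"}).json()
session_id = session["id"]

# 2. Wait until the session is idle, i.e. the driver and executors are up
while requests.get(f"{LIVY_URL}/sessions/{session_id}").json()["state"] != "idle":
    time.sleep(5)

# 3. Submit a statement and poll for its result
stmt = requests.post(
    f"{LIVY_URL}/sessions/{session_id}/statements",
    json={"code": "spark.range(0, 100000).count()"},
).json()
while stmt["state"] != "available":
    time.sleep(2)
    stmt = requests.get(f"{LIVY_URL}/sessions/{session_id}/statements/{stmt['id']}").json()

print(json.dumps(stmt["output"], indent=2))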

A gentle, from-scratch run-through

Let’s build a tiny playground locally with Minikube. Later, you can point the exact same setup at GKE/EKS/AKS or an on-prem k8s distro.

1) Start a small Kubernetes cluster

You don’t need production horsepower to try this—just a few CPUs and memory.

minikube start --cpus 6 --memory 12288 --addons metrics-server

kubectl get nodes

You’ll see a node and the metrics add-on come up. That’s enough to host Spark drivers and executors for a demo.

2) Install Ilum with Jupyter, Zeppelin, and the Livy proxy

Ilum ships Helm charts so you don’t have to handcraft manifests.

helm repo add ilum https://charts.ilum.cloud
helm repo update

helm install ilum ilum/ilum \
  --set ilum-zeppelin.enabled=true \
  --set ilum-jupyter.enabled=true \
  --set ilum-livy-proxy.enabled=true

kubectl get pods -w

Grab a coffee while pods settle. When the dust clears, you have Ilum’s core plus bundled Jupyter and Zeppelin ready to talk to Spark through the Livy-compatible proxy.

Accessing the Ilum UI after installation

a) Port-forward (quick local access)

kubectl port-forward svc/ilum-ui 9777:9777 

Open: http://localhost:9777

b) NodePort (stable for testing)
For testing, a NodePort is enabled by default (avoids port-forward drops).
Open: http://<KUBERNETES_NODE_IP>:31777
Tip: Get the node IP with kubectl get nodes -o wide (or minikube ip on Minikube).

c) Minikube shortcut

minikube service ilum-ui

This opens the service in your browser (or prints the URL) using your Minikube IP.

d) Production
Use an Ingress (TLS, domain, auth). See: https://ilum.cloud/docs/configuration/ilum-ui/#ilum-ui-ingress-parameters


First contact: Jupyter + Sparkmagic (via Ilum UI)

Open the Ilum UI in your browser. From the left navigation bar, go to Notebook.

In Jupyter:

  1. Create a Python 3 notebook.
  2. Load Sparkmagic:
%load_ext sparkmagic.magics
  3. Open the Spark session manager:
%manage_spark

Basic Settings

  • Endpoint – Select the preconfigured Livy endpoint (usually http://ilum-livy-proxy:8998). This is how Jupyter/Sparkmagic talks to Ilum.
  • Cluster – Choose the target cluster (e.g., default). If you manage multiple K8s clusters in Ilum, pick the one you want your driver/executors to run on.
  • Session name – Any short identifier (e.g., eda-january). You’ll see this in Sparkmagic session lists.
  • Language – Python (PySpark) or Scala. Most Jupyter users go with Python.
  • Spark image – The container image for your Spark driver/executors (e.g., ilum/spark:3.5.3-delta). Images tagged with -delta already include Delta Lake.
  • Extra packages – Comma-separated extras to pull into the session (e.g., numpy,delta).
    • Tip: if you selected a -delta image, you usually don’t need to add delta again.
  • Enable auto-pause – When checked, Ilum will automatically pause the session after it’s idle for a while to save resources. You can resume it from the UI.

Resource Settings

  • Driver Settings
    • Driver memory – Start with 1g for demos; bump to 2–4g for heavier notebooks.
    • Driver cores – 1 is fine for exploratory work; increase if your driver does more coordination/collects.
  • Executor Settings (collapsed by default)
    • Configure only if you want to override defaults; many users rely on dynamic allocation (below).

More Advanced Options

  • Custom Spark configuration – JSON map for spark.* keys (e.g., event logs, S3 creds, serializer). Example:
    {
    "spark.eventLog.enabled": "true",
    "spark.sql.adaptive.enabled": "true"
    }

  • SQL extensions – Pre-filled for Delta Lake: io.delta.sql.DeltaSparkSessionExtension. Leave as-is if you plan to read/write Delta tables.
  • Driver extra Java options – JVM flags for the driver. The defaults (-Divy.cache.dir=/tmp -Divy.home=/tmp) keep Ivy dependency caches inside the container.
  • Executor extra Java options – Same idea, but for executors. Leave empty unless you need specific JVM flags.
  • Dynamic allocation – Let Spark scale executors automatically.
    • Min executors – Floor for scaling (e.g., 1).
    • Initial executors – Startup size (e.g., 2–3).
    • Max executors – Ceiling (e.g., 10 for demos, higher in prod).
  • Shuffle partitions – Number of partitions for wide ops (e.g., 200). Rule of thumb: start near 2–3× total executor cores, then tune.

Click Create Session. Ilum will start the Spark driver and executors on Kubernetes; the first pull can take a minute if images are new. When the status flips to available, you’re ready to run cells:

%%spark
spark.range(0, 100000).selectExpr("count(*) as n").show()

%%spark
from pyspark.sql import Row
rows = [Row(id=1, city="Warsaw"), Row(id=2, city="Riyadh"), Row(id=3, city="Austin")]
# Turn the local rows into a DataFrame and let Spark display it
spark.createDataFrame(rows).show()

It’s a small example, but the important part is what just happened: Jupyter spoke Livy, Ilum created a Spark session on Kubernetes, and Spark did the work. You didn’t touch a single YAML by hand.
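
Not sure which of the form settings actually took effect? You can check from a cell. Here is a small sketch using standard spark.* keys; the values you see will depend on what you picked in the form:

%%spark
# Inspect what the running session actually received
print(spark.conf.get("spark.dynamicAllocation.maxExecutors", "not set"))
print(spark.conf.get("spark.executor.cores", "not set"))
print(spark.conf.get("spark.sql.shuffle.partitions"))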

Or if you prefer: Apache Zeppelin

Some teams love Zeppelin for its multi-language paragraphs and shareable notes. That works here too.

  1. To run code, we first need to create a note:

  2. Since communication with Ilum goes through the livy-proxy, choose livy as the default interpreter.

  3. Now open the note and put some code in a paragraph:
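
If you want something concrete to paste into that paragraph, a tiny example might look like this (the %livy.pyspark prefix routes the code through the livy interpreter, and therefore through Ilum):

%livy.pyspark
from pyspark.sql import Row

# Small local rows turned into a DataFrame, just to confirm the session works
df = spark.createDataFrame([Row(id=1, city="Warsaw"), Row(id=2, city="Riyadh"), Row(id=3, city="Austin")])
df.groupBy("city").count().show()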


As with Jupyter, Zeppelin ships with the default configuration it needs for Ilum. You can easily customize the settings: open the context menu in the top-right corner and click the Interpreter button.

There is a long list of interpreters and their properties that can be customized.

Zeppelin offers three different modes for running the interpreter process: shared, scoped, and isolated. You can read more about interpreter binding modes here.

Same experience: Zeppelin sends code through Livy. Ilum spins up and manages the session on K8s. Spark runs it.


A quick detour into production thinking (without killing the vibe)

Here are the ideas you’ll care about when this graduates from demo to team-wide platform:

  • Storage: For a lakehouse, use object storage (S3/MinIO/GCS/Azure Blob). Keep Spark event logs there so the History Server can give you post-mortems that aren’t just vibes (see the sketch after this list).
  • Security: Put Jupyter/Zeppelin behind SSO (OIDC/Keycloak), scope access with Kubernetes RBAC, and keep secrets in a manager, not in a notebook cell.
  • Autoscaling: Let the cluster scale node pools; let Spark dynamic allocation manage executors. Your wallet and your patience will thank you.
  • Costs: Spot/preemptible nodes for executors, right-size memory/cores, and avoid tiny files (Parquet/Delta/Iceberg for the win).
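
As a concrete sketch of the storage point above, these are the kinds of spark.* keys you could drop into the session’s Custom Spark configuration field. It’s shown as a Python dict for readability; the bucket name, MinIO endpoint, and credential wiring are placeholders, not Ilum defaults:

event_log_conf = {
    # Keep event logs on object storage so the History Server has something to read
    "spark.eventLog.enabled": "true",
    "spark.eventLog.dir": "s3a://spark-logs/events",        # hypothetical bucket
    "spark.hadoop.fs.s3a.endpoint": "http://minio:9000",    # MinIO example; omit for AWS S3
    "spark.hadoop.fs.s3a.path.style.access": "true",
    # S3 credentials belong in a secret manager, not here (see the Security bullet)
}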

You don’t need to implement all of this today. The point is: you’re not stuck. The notebook you wrote for Minikube is the same notebook you’ll run next quarter on EKS.

Field notes: little problems you’ll actually hit

  • “Session stuck starting.” Usually resource pressure. Either give Minikube a bit more (--memory 14336) or lower Spark requests/limits for the demo.
  • ImagePullBackOff. Your node can’t reach the registry, or you need imagePullSecrets . Easy fix, don’t overthink it.
  • Slow reads on big datasets. You’re paying the tiny-file tax or skipping predicate pushdown. Compact to Parquet/Delta and filter early, as sketched below.
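
For that tiny-file case, the usual fix looks roughly like this; the paths, column name, and target partition count are made up for illustration:

%%spark
# Hypothetical layout: many small JSON files under raw/, rewritten as a
# manageable number of Parquet files after filtering early
raw = spark.read.json("s3a://my-bucket/raw/events/")
recent = raw.filter("event_date >= '2024-01-01'")
recent.repartition(32).write.mode("overwrite").parquet("s3a://my-bucket/curated/events/")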

The good news: Ilum’s logs and metrics make these less mysterious. You’ll still debug—but with tools, not folklore.

Do you actually need Kubernetes for data science?

Strictly speaking? No. Plenty of useful analysis runs on a single machine. But as soon as your team grows or your data size stops being cute, Kubernetes buys you standardized environments, sane isolation, and predictable scaling. The more people share the same platform, the more those properties matter.

And the nice part is: with Ilum, moving to Spark on K8s doesn’t require tearing up your notebooks or learning the entire Kubernetes dictionary on day one. You point Jupyter/Zeppelin at a Livy-compatible endpoint and keep going.

Where to go next

  • Keep this demo, but try a real dataset: NYC Taxi, clickstream, retail baskets, anything columnar and not tiny.
  • Add Spark event logging to object storage so the History Server can tell you what actually happened.
  • If you’re on a cloud provider, deploy the same chart to GKE/EKS/AKS, add ingress + TLS, and connect SSO.

If you want a deeper dive (auth, storage classes, GPU pools for deep learning, or an example migration from YARN to Kubernetes), say the word and I’ll spin up a follow-up with concrete manifests.

Copy-paste corner

# Start local Kubernetes
minikube start --cpus 6 --memory 12288 --addons metrics-server

# Install Ilum with Jupyter, Zeppelin, and Livy proxy
helm repo add ilum https://charts.ilum.cloud
helm repo update
helm install ilum ilum/ilum \
  --set ilum-zeppelin.enabled=true \
  --set ilum-jupyter.enabled=true \
  --set ilum-livy-proxy.enabled=true

# Open Jupyter and get the token
kubectl port-forward svc/ilum-jupyter 8888:8888
kubectl logs -l app.kubernetes.io/name=ilum-jupyter

# Open Zeppelin
kubectl port-forward svc/ilum-zeppelin 8080:8080
# Jupyter cell to test:
%load_ext sparkmagic.magics
%manage_spark  # choose the predefined Ilum endpoint, then create session

%%spark
spark.range(0, 100000).selectExpr("count(*) as n").show()

A gentle nudge to try Ilum

Ilum is free, cloud-native, and built to make Spark on Kubernetes practical for actual teams, not just demo videos. You get the Livy-compatible endpoint, interactive sessions, and logs/metrics all in one place, so your notebooks feel like notebooks again.