Shipping to a multi-node k3s cluster without a registry

When you build a container image on your machine and deploy it to Kubernetes, something has to get that image onto the node that runs the pod. In most setups that something is a container registry: you push the image up, and every node pulls it back down by name. On a small cluster you stood up yourself, you may not have a registry yet. Here is how to ship without one, and what the registry would have been doing for you.

A single node hides the problem. You build the image, load it onto the one node, and everything runs. Add a second node and it breaks the moment a pod is scheduled there, because the image only exists on the first node. There is no registry to pull from, so the pod lands in ImagePullBackOff, or in ErrImageNeverPull if its pull policy is Never. The image has to exist on every node a pod might land on, which is why a few of the steps below repeat across nodes.
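To see that failure from the cluster's side, a minimal sketch along these lines lists each affected pod with the node it was scheduled on and the reason its container is waiting. The helper name `stuck_pods` is made up for illustration, and the `app` namespace is the one used by the rollout commands in this write-up.

```shell
#!/usr/bin/env bash
# Sketch: list pods in a namespace that are waiting on an image,
# together with the node each one was scheduled to.
# `stuck_pods` is a hypothetical helper name, not part of kubectl.
stuck_pods() {
  local ns="${1:-app}"
  # custom-columns avoids parsing the default table, whose RESTARTS
  # column can contain spaces and shift the fields.
  kubectl get pods -n "${ns}" --no-headers \
    -o 'custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,REASON:.status.containerStatuses[*].state.waiting.reason' |
    awk '$3 ~ /ErrImageNeverPull|ErrImagePull|ImagePullBackOff/ { print $1, $2, $3 }'
}
```

Running `stuck_pods app` prints one line per stuck pod. If every line names the same node, that node is the one missing the image.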

One detail makes the rest of this make sense. k3s does not use Docker at runtime; it uses containerd. Docker on your build machine is only building the image, while the cluster runs it through containerd. That is why the command to load an image into the cluster is a containerd command rather than a Docker one. With that in mind, here is the procedure end to end.

Build the image with a real version tag.

docker build -t inventory-app:1.7.0 .

Note: Avoid latest. A concrete tag is what lets you roll forward and back on purpose.

Save the image to a tar file.

docker save inventory-app:1.7.0 -o inventory-app-170.tar

Copy the tar to every node.

scp inventory-app-170.tar root@node-1:/tmp/

scp inventory-app-170.tar root@node-2:/tmp/

Note: It has to reach every node a pod might land on.

Import it into containerd on each node.

ctr -n k8s.io images import /tmp/inventory-app-170.tar

Note: The k8s.io namespace matters, since the kubelet only looks for images there. Skipping the import on the second node is what causes the ErrImageNeverPull above.

Set the image pull policy to Never.

imagePullPolicy: Never

Note: The image is already on the node. With Never, the kubelet uses only that local copy and fails immediately with ErrImageNeverPull if it is missing, instead of trying to pull from a registry that is not there.

Roll it out and watch it land.

helm upgrade inventory-app ./helm/inventory-app --namespace app --set image.tag=1.7.0

kubectl rollout status deployment/inventory-app -n app

Note: With maxUnavailable: 0 and maxSurge: 1 in the deployment strategy, the old pod keeps serving until the new one passes its readiness check, so the live site does not blink during the swap.

Verify against the health endpoint.

curl https://your-host/health

Note: The deploy is done when health is green, not when the rollout reports success. If a pod is stuck in ErrImageNeverPull, run ctr -n k8s.io images ls | grep 1.7.0 on each node to confirm the image actually landed.
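The per-node check from the last step can be looped rather than typed once per node. The following is a sketch, not a tested tool: `has_image` and `check_nodes` are hypothetical helper names, and it assumes containerd stores a bare Docker name such as inventory-app:1.7.0 under the normalized reference docker.io/library/inventory-app:1.7.0, which is the usual behaviour for an imported docker save archive.

```shell
#!/usr/bin/env bash
# has_image NAME TAG: reads image references on stdin, one per line (the
# format printed by `ctr -n k8s.io images ls -q`), and succeeds if NAME:TAG
# is among them, either bare or with the docker.io/library/ prefix.
has_image() {
  local name="$1" tag="$2"
  grep -F -x -e "${name}:${tag}" -e "docker.io/library/${name}:${tag}" >/dev/null
}

# check_nodes NAME TAG NODE...: runs the listing over SSH on every node and
# reports which nodes have the image. Returns non-zero if any node lacks it.
check_nodes() {
  local name="$1" tag="$2" missing=0 node
  shift 2
  for node in "$@"; do
    if ssh "${node}" 'ctr -n k8s.io images ls -q' | has_image "${name}" "${tag}"; then
      echo "ok      ${node}"
    else
      echo "MISSING ${node}"
      missing=1
    fi
  done
  return "${missing}"
}
```

With the nodes from this write-up, the call would be `check_nodes inventory-app 1.7.0 root@node-1 root@node-2`. Matching the whole reference, not a substring, keeps a tag like 1.7.0 from matching 11.7.0.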

This works, and it is worth understanding because it is the real mechanics that a registry automates away. It is also the same steps in the same order every release, with no judgment between them, which is the shape of a thing you script.

Here is the whole flow as one command. Save it next to the Dockerfile as deploy.sh, and a release becomes ./deploy.sh 1.7.1.

#!/usr/bin/env bash
set -euo pipefail

APP=inventory-app
VERSION="${1:?usage: ./deploy.sh 1.7.1}"
NODES=(root@node-1 root@node-2)
TAR="${APP}-${VERSION//./}.tar"

docker build -t "${APP}:${VERSION}" .
docker save "${APP}:${VERSION}" -o "${TAR}"

for node in "${NODES[@]}"; do
  scp "${TAR}" "${node}:/tmp/"
  ssh "${node}" "ctr -n k8s.io images import /tmp/${TAR} && rm /tmp/${TAR}"
done

helm upgrade "${APP}" "./helm/${APP}" \
  --namespace app \
  --set image.tag="${VERSION}"

kubectl rollout status "deployment/${APP}" -n app

curl -fsS https://your-host/health

Two details carry the safety. set -euo pipefail stops the script at the first failure, so a tar that never reached a node is never followed by a rollout that asks for it. And the version is typed exactly once, as the argument; everything downstream reads it, so the image you built, the image on the nodes, and the image Helm asks for can never disagree. Doing that bookkeeping by hand is where the ErrImageNeverPull mismatches usually start.
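For anyone who has not watched those three flags in action, here is a small self-contained demonstration. It touches no cluster. Each probe runs in a separate bash process so the options do not leak into your own shell, and the variable name NO_SUCH_VAR is only a placeholder assumed to be unset.

```shell
#!/usr/bin/env bash
# Each probe runs in its own bash process so its shell options stay isolated.

# -e: the first failing command ends the script, so "reached" is never printed.
e_out="$(bash -c 'set -e; false; echo reached' || true)"

# -u: expanding an unset variable is an error instead of an empty string.
u_status=0
bash -c 'set -u; echo "tag=${NO_SUCH_VAR}"' >/dev/null 2>&1 || u_status=$?

# pipefail: a pipeline fails if any stage fails, not only the last one.
plain_status="$(bash -c 'false | true; echo $?')"
pipefail_status="$(bash -c 'set -o pipefail; false | true; echo $?')"

echo "set -e output:       '${e_out}'"
echo "set -u exit status:  ${u_status}"
echo "pipeline, default:   ${plain_status}"
echo "pipeline, pipefail:  ${pipefail_status}"
```

In the deploy script the same behaviour means a failed scp or import ends the run inside the loop, before helm upgrade is ever reached.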

The smaller touches earn their keep too. The node list is a variable, so a third node is one more entry instead of another step to remember. Each node's copy of the tar is deleted after its import to keep /tmp clean; the local copy stays in the build directory. The curl -fsS at the end sets the exit code, so the script itself fails unless health comes back green, which makes verification part of the deploy rather than a thing you do after it. If you want the repo to record the running version, bump values.yaml and drop the --set; either way the version lives in one place.
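If you take the values.yaml route, the bump can be scripted as well. This sketch makes assumptions about the chart: that values.yaml nests the tag as image.tag (which the --set image.tag flag implies) and that there is only one indented tag: line in the file. `bump_tag` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# bump_tag FILE VERSION: rewrite the indented `tag:` line in a values file.
# Assumes a layout like
#   image:
#     tag: "1.7.0"
# with a single indented `tag:` key; adjust the pattern for other layouts.
# VERSION is expected to be a plain version string (no "/" or "&").
bump_tag() {
  local file="$1" version="$2" tmp
  tmp="$(mktemp)"
  sed -E "s/^([[:space:]]+tag:[[:space:]]*).*/\\1\"${version}\"/" "${file}" > "${tmp}"
  # Write back through the original file so its permissions are kept.
  cat "${tmp}" > "${file}"
  rm -f "${tmp}"
}
```

A release would then call `bump_tag ./helm/inventory-app/values.yaml 1.7.1`, commit the change, and run helm upgrade without the --set flag.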

That script is already a deployment pipeline; it just lives on your machine. For a cluster this size it covers most of what GitLab or Octopus would do for the deploy itself: one command, the same order every time, a failure stops the line, and the release fails unless health checks out. The difference is everything around the run. The script runs when you remember to run it, from your own laptop and SSH keys. A GitLab pipeline runs on the merge to main, on a runner, and records the commit, the artifact, who shipped it, and when. It can put an approval step between staging and production, and rolling back means re-running a recorded pipeline instead of digging the last good tag out of your shell history. GitLab also includes a container registry, which removes the copy-and-import loop outright: push once, and every node pulls by name.
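Until a pipeline records releases for you, Helm's own release history is one place the last good revision is already written down. The sketch below leans on that, with caveats: `rollback_release` is a hypothetical helper name, and with imagePullPolicy: Never a rollback only works if the older image is still present on every node, which is not guaranteed because the kubelet may garbage-collect unused images under disk pressure.

```shell
#!/usr/bin/env bash
# rollback_release RELEASE NAMESPACE [REVISION]
# Shows recent revisions, rolls back, then waits for the rollout to finish.
# With no REVISION, `helm rollback` returns to the previous revision.
rollback_release() {
  local release="$1" ns="$2" revision="${3:-}"
  helm history "${release}" -n "${ns}" --max 5
  if [ -n "${revision}" ]; then
    helm rollback "${release}" "${revision}" -n "${ns}"
  else
    helm rollback "${release}" -n "${ns}"
  fi
  kubectl rollout status "deployment/${release}" -n "${ns}"
}
```

For the release in this write-up that would be `rollback_release inventory-app app`, followed by the same curl against the health endpoint that ends a normal deploy.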

The value of doing it by hand first still holds: when you add the registry and the runner later, you know exactly what they are taking off your plate.
