Improve API surface for external network consumers (kind bridge use case) #36

@dennisklein

Context

We are developing a Crossplane controller for Slurm dynamic node migration. Integration tests need a kind (Kubernetes-in-Docker) cluster that can resolve and reach sind Slurm nodes — pods must be able to SSH into sind controllers and run Slurm commands.

To validate this, we wrote a bridge script that creates two sind clusters in the same realm and connects a single kind cluster to both. The script works, but it exposes several places where sind's API forces consumers to reach behind the abstraction into raw Docker.

What the script does

  1. Creates two sind clusters (same realm, shared mesh/DNS)
  2. Creates a kind cluster
  3. Attaches kind node containers to the sind mesh + both cluster networks
  4. Patches CoreDNS with a stub zone forwarding <realm>.sind to the sind DNS server
  5. Creates a K8s secret from the sind SSH volume (private key + known_hosts)
  6. Verifies by running sinfo on both controllers via SSH from kind pods

Abstraction leaks

1. Naming conventions reimplemented in the script

The script manually reconstructs internal naming:

SIND_MESH_NET="${SIND_REALM}-mesh"
SIND_NET_A="${SIND_REALM}-${SIND_CLUSTER_A}-net"
SIND_DNS_CONTAINER="${SIND_REALM}-dns"
SIND_SSH_VOLUME="${SIND_REALM}-ssh-config"
SIND_DNS_ZONE="${SIND_REALM}.sind"

These mirror pkg/mesh/mesh.go and pkg/cluster/naming.go. If sind changes its naming scheme, every consumer breaks.

2. DNS server IP requires docker inspect

docker inspect sind-dns \
    --format '{{(index .NetworkSettings.Networks "sind-mesh").IPAddress}}'

sind get dns shows records but not the server address itself.

3. SSH credentials require Docker volume gymnastics

docker run --rm \
    -v sind-ssh-config:/ssh:ro \
    -v "${tmpdir}:/out:Z" \
    alpine sh -c 'cp /ssh/id_ed25519 /ssh/known_hosts /out/'

sind get ssh-config returns the host file path, which is inaccessible from inside containers or CI runners.
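
What the suggested sind get ssh-key surface could look like from the consumer side, sketched with a stub. The subcommand, its flags, and the output shown here are assumptions, not current sind behaviour; the point is the pipeline shape: credentials stream to stdout instead of being copied out of a Docker volume by a throwaway container.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stub standing in for the proposed CLI surface (sind has no such
# subcommand today).
sind() {
    case "$3" in
        --private)     printf '%s\n' '-----BEGIN OPENSSH PRIVATE KEY-----' ;;
        --known-hosts) printf '%s\n' 'controller.alpha.sind.sind ssh-ed25519 AAAA...' ;;
    esac
}

# Consumer side: no docker run, no temp volume mount.
key_head=$(sind get ssh-key --private | head -n1)
hosts_head=$(sind get ssh-key --known-hosts | head -n1)
echo "${key_head}"
echo "${hosts_head}"
```

With a real implementation, create_ssh_secret() could shrink to something like kubectl create secret generic sind-ssh --from-file=id_ed25519=<(sind get ssh-key --private) --from-file=known_hosts=<(sind get ssh-key --known-hosts), with no temp dir and no alpine container.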

4. Network connect/disconnect is raw Docker

sind has no awareness of external consumers on its networks. Teardown ordering is therefore critical: running sind delete cluster before disconnecting the kind nodes fails with Docker errors, because Docker refuses to remove a network that still has active endpoints.

5. SSH image is a hardcoded constant

The script hardcodes ghcr.io/gsi-hpc/sind-node:latest, copied from pkg/mesh/ssh.go. There is no way to query it at runtime.

Suggested improvements

  • sind get mesh [--output json] — expose the DNS server IP, mesh network name, SSH volume, and SSH image.
    Eliminates: the naming reimplementation, docker inspect for the DNS IP, and the hardcoded image constant.
  • sind get ssh-key --private / --known-hosts — dump credentials to stdout.
    Eliminates: the Docker volume extraction hack.
  • --output json on all get commands.
    Eliminates: awk parsing of human-readable tables.
  • sind network connect/disconnect <container> — let sind track external consumers.
    Eliminates: raw docker network connect and the fragile teardown ordering.
  • sind get coredns-config — emit a ready-to-paste CoreDNS stub zone snippet.
    Eliminates: manually constructing the Corefile block and documenting the ndots trap.
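
As a rough sketch of the first suggestion, here is one possible JSON shape for sind get mesh --output json and how the bridge script would consume it. The field names and structure are assumptions, not current sind output; the sample value is fed in directly so the parsing is self-contained.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical output of `sind get mesh --output json` (assumed shape).
mesh_json='{"network":"sind-mesh","dns_ip":"172.18.0.2","dns_zone":"sind.sind","ssh_volume":"sind-ssh-config","ssh_image":"ghcr.io/gsi-hpc/sind-node:latest"}'

# Field extraction with POSIX sed; jq would be the natural choice where
# available, e.g.  SIND_DNS_IP=$(sind get mesh --output json | jq -r .dns_ip)
json_field() {
    printf '%s\n' "$1" | sed -n "s/.*\"$2\":\"\([^\"]*\)\".*/\1/p"
}

SIND_DNS_IP=$(json_field "$mesh_json" dns_ip)
SIND_MESH_NET=$(json_field "$mesh_json" network)
echo "${SIND_DNS_IP} ${SIND_MESH_NET}"
```

All five derived names in the script's "Derived names" section, plus the docker inspect call and the SSH_IMAGE constant, would collapse into lookups against this one document.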

Broader ideas for Crossplane integration testing

  • sind status --watch or readiness signal — the controller will add/remove workers and needs to know when slurmd has registered, not just when the container is running.
  • sind delete realm — nuke all resources in a realm (mesh included) in one command for CI cleanup.
  • ndots documentation — Kubernetes defaults to ndots:5, which causes musl-based images to cycle through all search domains before trying bare FQDNs. .sind names have too few dots and time out. This is a gotcha anyone connecting kind to sind will hit.
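
The ndots trap can be illustrated without any DNS at all, by expanding the query order the resolver produces for a typical pod. The search domains below are the Kubernetes defaults for a pod in the default namespace; with ndots:5, any name containing fewer than five dots is tried against every search entry before the bare name.

```shell
#!/usr/bin/env bash
set -euo pipefail

name="controller.alpha.sind.sind"    # 3 dots, below the ndots:5 threshold
search="default.svc.cluster.local svc.cluster.local cluster.local"

# Build the lookup order the resolver will attempt.
queries=()
for domain in ${search}; do
    queries+=("${name}.${domain}")   # each must fail (or time out) first...
done
queries+=("${name}")                 # ...before the bare FQDN is queried

printf '%s\n' "${queries[@]}"
```

Every leading entry is a dead end, which is why the verification pods set ndots:1 via dnsConfig: that drops .sind names below the threshold so the bare FQDN is tried immediately.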

sind-kind-bridge.sh
#!/usr/bin/env bash
#
# sind-kind-bridge.sh — Create two sind Slurm clusters (same realm) and one
# kind Kubernetes cluster, then connect them so that kind pods can resolve
# all sind node hostnames and reach them by IP.
#
# Architecture:
#
#   sind creates a shared mesh network per realm that carries DNS and SSH.
#   Each cluster gets its own network for node traffic.  Two clusters in
#   the same realm share the mesh, so a single DNS server holds records
#   for both:
#
#       <hostname>.<cluster>.<realm>.sind
#
#   kind runs on a separate Docker network.  By attaching the kind node
#   containers to the sind mesh and both cluster networks, they gain L2
#   connectivity to every sind container.  A CoreDNS stub zone forwards
#   the "<realm>.sind" domain to the sind DNS server.
#
#   Docker's embedded DNS does NOT honour the host's systemd-resolved
#   per-link routing, so the CoreDNS stub zone is necessary.
#
# Usage:
#   ./sind-kind-bridge.sh          # create everything
#   ./sind-kind-bridge.sh teardown # disconnect kind and delete both clusters
#
set -euo pipefail

# ---------------------------------------------------------------------------
# Parameters
# ---------------------------------------------------------------------------

# Realm — all sind resources (mesh, DNS, clusters) live under this namespace.
SIND_REALM="sind"

# sind cluster names — two clusters in the same realm.
SIND_CLUSTER_A="alpha"
SIND_CLUSTER_B="beta"

# Optional: paths to sind config files.  Leave empty to use default layout
# (one controller + one worker per cluster).
SIND_CONFIG_A=""
SIND_CONFIG_B=""

# Mount mode for /data inside sind containers.
# Use "volume" for a Docker volume or a host path for a bind mount.
SIND_DATA="volume"

# kind cluster name.
KIND_CLUSTER="sind-bridge"

# kind node image (leave empty for the kind default).
KIND_IMAGE=""

# SSH client image used by verification pods.  Uses the same image as sind's
# own SSH relay container — it already has openssh-client installed.
SSH_IMAGE="ghcr.io/gsi-hpc/sind-node:latest"

# Slurm command to run on each controller during verification.
SLURM_VERIFY_CMD="sinfo"

# ---------------------------------------------------------------------------
# Derived names — follow sind's naming conventions
# ---------------------------------------------------------------------------

# Mesh network (shared by all clusters in the realm, hosts DNS + SSH).
SIND_MESH_NET="${SIND_REALM}-mesh"

# Per-cluster networks (carry the actual node IPs).
SIND_NET_A="${SIND_REALM}-${SIND_CLUSTER_A}-net"
SIND_NET_B="${SIND_REALM}-${SIND_CLUSTER_B}-net"

# DNS container that serves the <realm>.sind zone.
SIND_DNS_CONTAINER="${SIND_REALM}-dns"

# sind SSH config volume (shared across all clusters in the realm, contains
# the private key and known_hosts for passwordless SSH to every sind node).
SIND_SSH_VOLUME="${SIND_REALM}-ssh-config"

# CoreDNS zone that covers all clusters in the realm.
SIND_DNS_ZONE="${SIND_REALM}.sind"

# kubectl context for the kind cluster.
KIND_CONTEXT="kind-${KIND_CLUSTER}"

# Name of the Kubernetes secret that mirrors the sind SSH volume.
K8S_SSH_SECRET="sind-ssh"

# ---------------------------------------------------------------------------
# Functions
# ---------------------------------------------------------------------------

create_sind_cluster() {
    # Create a single sind cluster.
    #   $1 — cluster name
    #   $2 — config file path (empty string for defaults)
    local name=$1 config=$2

    local flags=(--data "${SIND_DATA}")
    [[ -n "${config}" ]] && flags+=(--config "${config}")

    echo "==> Creating sind cluster '${name}'"
    sind --realm "${SIND_REALM}" create cluster "${name}" "${flags[@]}"
    sind --realm "${SIND_REALM}" status "${name}"
}

discover_sind_dns_ip() {
    # Look up the sind DNS container's IP on the mesh network.
    # Prints the IP to stdout.
    docker inspect "${SIND_DNS_CONTAINER}" \
        --format "{{(index .NetworkSettings.Networks \"${SIND_MESH_NET}\").IPAddress}}"
}

create_kind_cluster() {
    # Create a kind cluster.
    echo "==> Creating kind cluster '${KIND_CLUSTER}'"

    local flags=(--name "${KIND_CLUSTER}")
    [[ -n "${KIND_IMAGE}" ]] && flags+=(--image "${KIND_IMAGE}")

    kind create cluster "${flags[@]}"
}

connect_kind_to_sind() {
    # Attach every kind node container to the sind Docker networks.
    # Each node needs connectivity to:
    #   - the mesh network   → to reach the sind DNS server
    #   - each cluster network → to reach the node IPs that DNS resolves to
    local nodes
    mapfile -t nodes < <(kind get nodes --name "${KIND_CLUSTER}")

    for node in "${nodes[@]}"; do
        echo "==> Connecting ${node} to ${SIND_MESH_NET}"
        docker network connect "${SIND_MESH_NET}" "${node}"

        echo "==> Connecting ${node} to ${SIND_NET_A}"
        docker network connect "${SIND_NET_A}" "${node}"

        echo "==> Connecting ${node} to ${SIND_NET_B}"
        docker network connect "${SIND_NET_B}" "${node}"
    done
}

patch_coredns() {
    # Add a stub zone to CoreDNS that forwards the sind realm domain to the
    # sind DNS server.  Without this, pods cannot resolve sind hostnames —
    # CoreDNS would fall through to Docker's embedded DNS, which ignores
    # the host's systemd-resolved routing rules.
    #   $1 — sind DNS server IP
    local dns_ip=$1

    echo "==> Patching CoreDNS: forwarding '${SIND_DNS_ZONE}' to ${dns_ip}"

    kubectl --context "${KIND_CONTEXT}" apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    ${SIND_DNS_ZONE}:53 {
        errors
        cache 30
        forward . ${dns_ip}
    }
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
           max_concurrent 1000
        }
        cache 30 {
           disable success cluster.local
           disable denial cluster.local
        }
        loop
        reload
        loadbalance
    }
EOF

    kubectl --context "${KIND_CONTEXT}" rollout restart deployment coredns -n kube-system
    kubectl --context "${KIND_CONTEXT}" rollout status  deployment coredns -n kube-system --timeout=60s
}

load_ssh_image() {
    # Load the SSH client image into kind so verification pods can use
    # imagePullPolicy=Never and skip pulling from the registry.  The sind
    # node image is usually already present locally because sind pulled it
    # when creating the clusters.
    echo "==> Loading SSH image into kind"

    if ! docker image inspect "${SSH_IMAGE}" &>/dev/null; then
        docker pull "${SSH_IMAGE}"
    fi

    kind load docker-image "${SSH_IMAGE}" --name "${KIND_CLUSTER}"
}

create_ssh_secret() {
    # Copy the sind SSH private key and known_hosts from the sind-ssh-config
    # Docker volume into a Kubernetes secret.  Verification pods mount this
    # secret to authenticate against sind nodes.
    #
    # The volume is owned by root inside the container, so we use the :Z
    # flag to handle SELinux relabelling when bind-mounting the temp dir.
    echo "==> Creating Kubernetes secret '${K8S_SSH_SECRET}' from sind SSH volume"

    local tmpdir
    tmpdir=$(mktemp -d)

    docker run --rm \
        -v "${SIND_SSH_VOLUME}:/ssh:ro" \
        -v "${tmpdir}:/out:Z" \
        alpine sh -c 'cp /ssh/id_ed25519 /ssh/known_hosts /out/ && chmod 644 /out/*'

    kubectl --context "${KIND_CONTEXT}" create secret generic "${K8S_SSH_SECRET}" \
        --from-file="${tmpdir}/id_ed25519" \
        --from-file="${tmpdir}/known_hosts"

    rm -rf "${tmpdir}"
}

verify() {
    # Verify the full pipeline: DNS resolution, IP connectivity, and SSH
    # authentication by running a Slurm command on each controller from a
    # kind pod.
    #
    # Each pod:
    #   - mounts the sind SSH secret (private key + known_hosts)
    #   - uses dnsConfig to set ndots=1 (Kubernetes defaults to ndots=5,
    #     which causes musl-based images to cycle through all search domains
    #     before trying the bare FQDN — and that times out for .sind names)
    #   - SSHes into the controller and runs the configured Slurm command
    local controller

    for controller in \
        "controller.${SIND_CLUSTER_A}.${SIND_DNS_ZONE}" \
        "controller.${SIND_CLUSTER_B}.${SIND_DNS_ZONE}"
    do
        echo "==> Verifying: ${SLURM_VERIFY_CMD} on ${controller}"

        kubectl --context "${KIND_CONTEXT}" run -i --rm "verify-${RANDOM}" \
            --image="${SSH_IMAGE}" --image-pull-policy=Never --restart=Never \
            --overrides="$(cat <<OJSON
{
  "spec": {
    "dnsConfig": {
      "options": [{"name": "ndots", "value": "1"}]
    },
    "containers": [{
      "name": "ssh",
      "image": "${SSH_IMAGE}",
      "imagePullPolicy": "Never",
      "command": ["ssh",
        "-o", "StrictHostKeyChecking=yes",
        "-o", "UserKnownHostsFile=/ssh/known_hosts",
        "-i", "/ssh/id_ed25519",
        "root@${controller}",
        "${SLURM_VERIFY_CMD}"
      ],
      "volumeMounts": [{
        "name": "ssh",
        "mountPath": "/ssh",
        "readOnly": true
      }]
    }],
    "volumes": [{
      "name": "ssh",
      "secret": {
        "secretName": "${K8S_SSH_SECRET}",
        "defaultMode": 384
      }
    }]
  }
}
OJSON
)"
    done
}

disconnect_kind_from_sind() {
    # Detach every kind node container from the sind Docker networks.
    # This MUST run before deleting sind clusters, otherwise sind cannot
    # remove its networks (Docker refuses to delete a network that still
    # has connected endpoints).
    local nodes
    mapfile -t nodes < <(kind get nodes --name "${KIND_CLUSTER}" 2>/dev/null) || true
    [[ ${#nodes[@]} -gt 0 ]] || return 0  # nothing to disconnect (kind already gone)

    for node in "${nodes[@]}"; do
        for net in "${SIND_MESH_NET}" "${SIND_NET_A}" "${SIND_NET_B}"; do
            echo "==> Disconnecting ${node} from ${net}"
            docker network disconnect "${net}" "${node}" 2>/dev/null || true
        done
    done
}

teardown() {
    # Tear down everything in the correct order:
    #   1. Disconnect kind nodes from sind networks
    #   2. Delete the kind cluster
    #   3. Delete both sind clusters (sind cleans up networks/volumes)
    echo "==> Tearing down"

    disconnect_kind_from_sind

    echo "==> Deleting kind cluster '${KIND_CLUSTER}'"
    kind delete cluster --name "${KIND_CLUSTER}" 2>/dev/null || true

    echo "==> Deleting sind clusters"
    sind --realm "${SIND_REALM}" delete cluster --all
}

summary() {
    echo ""
    echo "==> Setup complete"
    echo "    sind realm   : ${SIND_REALM}"
    echo "    sind clusters: ${SIND_CLUSTER_A}, ${SIND_CLUSTER_B}"
    echo "    kind cluster : ${KIND_CLUSTER}  (context: ${KIND_CONTEXT})"
    echo "    sind DNS     : ${SIND_DNS_IP} (zone: ${SIND_DNS_ZONE})"
    echo ""
    echo "    DNS records reachable from kind pods:"
    sind --realm "${SIND_REALM}" get dns | sed 's/^/      /'
}

# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------

case "${1:-up}" in
    up)
        create_sind_cluster "${SIND_CLUSTER_A}" "${SIND_CONFIG_A}"
        create_sind_cluster "${SIND_CLUSTER_B}" "${SIND_CONFIG_B}"

        SIND_DNS_IP=$(discover_sind_dns_ip)
        echo "==> sind DNS at ${SIND_DNS_IP} on ${SIND_MESH_NET}"

        create_kind_cluster
        connect_kind_to_sind
        patch_coredns "${SIND_DNS_IP}"
        load_ssh_image
        create_ssh_secret
        verify
        summary
        ;;
    teardown)
        teardown
        ;;
    *)
        echo "Usage: $0 [up|teardown]" >&2
        exit 1
        ;;
esac

Example output: up
==> Creating sind cluster 'alpha'
CLUSTER   STATUS (R/S/P/T)
alpha     running (2/0/0/2)

NETWORKS
NAME             DRIVER   SUBNET          GATEWAY      STATUS
sind-mesh        bridge   172.18.0.0/16   172.18.0.1   ✓
sind-alpha-net   bridge   172.19.0.0/16   172.19.0.1   ✓

MESH SERVICES
NAME   CONTAINER   STATUS
dns    sind-dns    ✓

MOUNTS
MOUNT        SOURCE              TYPE     STATUS
/etc/slurm   sind-alpha-config   volume   ✓
/etc/munge   sind-alpha-munge    volume   ✓
/data        sind-alpha-data     volume   ✓

NODES
NAME               ROLE         IP           CONTAINER   MUNGE   SSHD   SERVICES
controller.alpha   controller   172.19.0.4   running     ✓       ✓      slurmctld ✓
worker-0.alpha     worker       172.19.0.3   running     ✓       ✓      slurmd ✓
==> Creating sind cluster 'beta'
CLUSTER   STATUS (R/S/P/T)
beta      running (2/0/0/2)

NETWORKS
NAME            DRIVER   SUBNET          GATEWAY      STATUS
sind-mesh       bridge   172.18.0.0/16   172.18.0.1   ✓
sind-beta-net   bridge   172.21.0.0/16   172.21.0.1   ✓

MESH SERVICES
NAME   CONTAINER   STATUS
dns    sind-dns    ✓

MOUNTS
MOUNT        SOURCE             TYPE     STATUS
/etc/slurm   sind-beta-config   volume   ✓
/etc/munge   sind-beta-munge    volume   ✓
/data        sind-beta-data     volume   ✓

NODES
NAME              ROLE         IP           CONTAINER   MUNGE   SSHD   SERVICES
controller.beta   controller   172.21.0.3   running     ✓       ✓      slurmctld ✓
worker-0.beta     worker       172.21.0.4   running     ✓       ✓      slurmd ✓
==> sind DNS at 172.18.0.2 on sind-mesh
==> Creating kind cluster 'sind-bridge'
Creating cluster "sind-bridge" ...
 ✓ Ensuring node image (kindest/node:v1.35.0) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-sind-bridge"
==> Connecting sind-bridge-control-plane to sind-mesh
==> Connecting sind-bridge-control-plane to sind-alpha-net
==> Connecting sind-bridge-control-plane to sind-beta-net
==> Patching CoreDNS: forwarding 'sind.sind' to 172.18.0.2
configmap/coredns configured
deployment.apps/coredns restarted
deployment "coredns" successfully rolled out
==> Loading SSH image into kind
Image: "ghcr.io/gsi-hpc/sind-node:latest" [...] loading...
==> Creating Kubernetes secret 'sind-ssh' from sind SSH volume
secret/sind-ssh created
==> Verifying: sinfo on controller.alpha.sind.sind
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
all*         up   infinite      1   idle worker-0
pod "verify-28197" deleted from default namespace
==> Verifying: sinfo on controller.beta.sind.sind
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
all*         up   infinite      1   idle worker-0
pod "verify-5643" deleted from default namespace

==> Setup complete
    sind realm   : sind
    sind clusters: alpha, beta
    kind cluster : sind-bridge  (context: kind-sind-bridge)
    sind DNS     : 172.18.0.2 (zone: sind.sind)

    DNS records reachable from kind pods:
      HOSTNAME                     IP
      controller.alpha.sind.sind   172.19.0.4
      worker-0.alpha.sind.sind     172.19.0.3
      controller.beta.sind.sind    172.21.0.3
      worker-0.beta.sind.sind      172.21.0.4

Example output: teardown
==> Tearing down
==> Disconnecting sind-bridge-control-plane from sind-mesh
==> Disconnecting sind-bridge-control-plane from sind-alpha-net
==> Disconnecting sind-bridge-control-plane from sind-beta-net
==> Deleting kind cluster 'sind-bridge'
==> Deleting sind clusters

Labels: feature (New feature or request)