This is a mono repository for my home infrastructure and Kubernetes cluster. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Kubernetes, Flux, Renovate, and GitHub Actions.
My Kubernetes cluster is deployed with Talos. This is a semi-hyper-converged cluster: workloads and block storage share the same available resources on my nodes, while a separate NAS server handles NFS/SMB shares, bulk file storage, and backups.
- actions-runner-controller: Self-hosted GitHub runners.
- cert-manager: Creates SSL certificates for services in my cluster.
- cilium: Internal Kubernetes container networking interface.
- cloudflared: Enables Cloudflare secure access to certain ingresses.
- external-dns: Automatically syncs ingress DNS records to a DNS provider.
- external-secrets: Manages Kubernetes secrets using 1Password Connect (see the sketch after this list).
- openebs: Local storage provisioner.
- rook: Distributed block storage for persistent storage.
- sops: Manages secrets for Kubernetes and Terraform which are committed to Git.
- spegel: Stateless cluster local OCI registry mirror.
- volsync: Backup and recovery of persistent volume claims.
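As a hedged illustration of how the external-secrets piece fits together, an ExternalSecret backed by a 1Password Connect ClusterSecretStore might look roughly like the sketch below; the store, item, and key names are assumptions, not values from this repository.

```yaml
# Hypothetical ExternalSecret sketch; store and item names are assumed
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-secret
  namespace: default
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: onepassword-connect     # assumed name of the 1Password-backed store
  target:
    name: app-secret              # Kubernetes Secret that gets created and kept in sync
  data:
    - secretKey: API_KEY          # key inside the generated Secret
      remoteRef:
        key: my-app               # 1Password item (assumed)
        property: api-key         # field within that item (assumed)
```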
Flux watches the clusters in my kubernetes folder (see Directories below) and makes the changes to my clusters based on the state of my Git repository.
The way Flux works for me here is that it recursively searches the kubernetes/apps folder until it finds the top-most kustomization.yaml per directory and then applies all of the resources listed in it. That kustomization.yaml will generally only have a namespace resource and one or more Flux kustomizations (ks.yaml). Under the control of those Flux kustomizations there will be a HelmRelease or other resources related to the application which will be applied.
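For illustration only, a ks.yaml following this pattern might look like the minimal sketch below; the app name, namespace, and path are hypothetical rather than taken from this repository.

```yaml
# Hypothetical kubernetes/apps/default/echo-server/ks.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: echo-server
  namespace: flux-system
spec:
  targetNamespace: default
  path: ./kubernetes/apps/default/echo-server/app
  sourceRef:
    kind: GitRepository
    name: flux-system
  prune: true
  wait: true
  interval: 30m
```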
Renovate watches my entire repository for dependency updates; when one is found, a PR is automatically created. When PRs are merged, Flux applies the changes to my cluster.
📁 .github # GH Actions configs, repo reference objects, renovate config
📁 kubernetes # Kubernetes cluster defined as code
├─📁 apps # Applications deployed into the cluster grouped by namespace
├─📁 components # Re-useable Kustomize components
└─📁 flux # Flux system configuration
This is a high-level look at how Flux deploys my applications with dependencies. In most cases a HelmRelease will depend on other HelmReleases, in other cases a Kustomization will depend on other Kustomizations, and in rare situations an app can depend on both a HelmRelease and a Kustomization. The example below shows that gatus will not be deployed or upgraded until the rook-ceph-cluster Helm release is installed and in a healthy state.
graph TD
A>Kustomization: rook-ceph] -->|Creates| B[HelmRelease: rook-ceph]
A>Kustomization: rook-ceph] -->|Creates| C[HelmRelease: rook-ceph-cluster]
C>HelmRelease: rook-ceph-cluster] -->|Depends on| B>HelmRelease: rook-ceph]
D>Kustomization: gatus] -->|Creates| E(HelmRelease: gatus)
E>HelmRelease: gatus] -->|Depends on| C>HelmRelease: rook-ceph-cluster]
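Expressed in a Flux HelmRelease, that ordering is just a dependsOn entry. A trimmed sketch, with chart and interval values chosen for illustration rather than copied from this repository:

```yaml
# Sketch only: rook-ceph-cluster waits for rook-ceph before installing or upgrading
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
spec:
  interval: 1h
  chart:
    spec:
      chart: rook-ceph-cluster
      sourceRef:
        kind: HelmRepository
        name: rook-ceph
  dependsOn:
    - name: rook-ceph
      namespace: rook-ceph
```

The gatus HelmRelease would carry an equivalent dependsOn entry pointing at rook-ceph-cluster.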
The Kubernetes Gateway API, provided by Cilium, is used to manage routes into the cluster.
This cluster runs two instances of ExternalDNS: one syncs private DNS records to my UDM Pro using the ExternalDNS webhook provider for UniFi, while the other syncs public DNS records to Cloudflare. This setup is managed by creating ingresses with two specific classes: internal for private DNS and external for public DNS. Each external-dns instance then syncs DNS records to its respective platform; a sketch of a public-facing ingress follows below.
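As a hedged example of the split, a public-facing ingress might look like the sketch below; the hostnames, annotation target, and backend service are placeholders rather than this cluster's real values. The instance watching the external class would publish the record to Cloudflare, while an internal ingress would instead be picked up by the UniFi webhook instance.

```yaml
# Hypothetical public ingress; swap ingressClassName to "internal" for private DNS
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo-server
  namespace: default
  annotations:
    external-dns.alpha.kubernetes.io/target: external.example.com  # assumed CNAME target
spec:
  ingressClassName: external
  rules:
    - host: echo.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: echo-server
                port:
                  number: 8080
```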
| Device | Count | OS Disk Size | Data Disk Size | RAM | Purpose | Alias | OS |
|---|---|---|---|---|---|---|---|
| Asus NUC 14 Pro | 1 | 512GB NVMe SSD | 2TB SATA SSD | 64GB | Kubernetes Control-Plane | asus-node-01 | Talos Linux |
| Dell Optiplex 7040 | 1 | 256GB NVMe SSD | 1TB SATA SSD | 16GB | Kubernetes Worker | dell-node-01 | Talos Linux |
| Dell Optiplex 7060 | 1 | 512GB NVMe SSD | 1TB SATA SSD | 32GB | Kubernetes Control-Plane | dell-node-02 | Talos Linux |
| Helios64 NAS | 1 | N/A | 8x4TB RAID6 | 4GB | Media and shared file storage | glacier | Debian GNU/Linux |
| MacBook Pro 2012 | 1 | 250GB SSD | N/A | 8GB | Kubernetes Control-Plane | mbp-node-01 | Talos Linux |
| MacBook Pro 2016 | 1 | 500GB SSD | N/A | 16GB | Kubernetes Worker | mbp-node-02 | Talos Linux |
| Tool | Purpose |
|---|---|
| mise | Set KUBECONFIG environment variable based on present working directory |
| sops | Encrypt secrets |
| go-task | Replacement for make and makefiles |
| talos | Operating System to install on nodes |
| uv | Python package and virtualenv manager |
While most of my infrastructure and workloads are self-hosted, I do rely upon the cloud for certain key parts of my setup. This saves me from having to worry about three things: (1) dealing with chicken/egg scenarios, (2) services I critically need whether my cluster is online or not, and (3) the "hit by a bus" factor, i.e. what happens to critical apps (e.g. email, password manager, photos) that my family relies on if I am no longer around.
| Service | Use | Cost |
|---|---|---|
| 1Password | Secrets with External Secrets | ~$65/yr |
| Cloudflare | Domain and S3 | ~$30/yr |
| GitHub | Hosting this repository and continuous integration/deployments | Free |
| Pushover | Kubernetes alerts and application notifications | $5 one-time purchase |
| Tailscale | Device VPN | Free |
| Total | | ~$8/mo |
The servarr stack supports torrent- and Usenet-based automation and is tuned for performance, privacy, and seed ratio maximization:
- Indexers:
- Downloaders:
- qBittorrent (via Gluetun with the ProtonVPN provider; see the sketch after this list)
- sabnzbd (for Usenet)
- Organizers:
- Automation:
- Cross-seed – uses hardlink watch and injects back into qBittorrent to boost sharing ratios
- Autobrr – filters and pushes releases to qBittorrent and/or Radarr via custom webhook integration
- Frontends:
- Jellyfin – main media frontend
- Jellyseerr – request management for Jellyfin users
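A rough sketch of the "qBittorrent via Gluetun" pattern mentioned above, assuming a simple sidecar layout; image tags, environment variables, and the credentials secret are assumptions, not this cluster's actual release values.

```yaml
# Sketch: qBittorrent sharing its pod network namespace with a Gluetun VPN sidecar
apiVersion: v1
kind: Pod
metadata:
  name: qbittorrent
spec:
  containers:
    - name: gluetun                     # VPN sidecar; pod egress flows through the tunnel
      image: ghcr.io/qdm12/gluetun:latest
      securityContext:
        capabilities:
          add: ["NET_ADMIN"]
      env:
        - name: VPN_SERVICE_PROVIDER
          value: protonvpn
        - name: VPN_TYPE
          value: wireguard
      envFrom:
        - secretRef:
            name: protonvpn-credentials # assumed Secret holding the WireGuard key
    - name: qbittorrent
      image: ghcr.io/onedr0p/qbittorrent:latest  # assumed image
      ports:
        - containerPort: 8080           # web UI
```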
mise will export the required environment variables (e.g. KUBECONFIG) any time you cd into this repo's directory. To set this up:
- Install and activate mise
- Use mise to install the required CLI tools:
mise trust && mise install && mise run deps
Tip
Ensure you have updated talconfig.yaml and any patches with your desired configuration. In some cases you not only need to apply the configuration but also upgrade Talos for the new configuration to take effect.
# (Re)generate the Talos config
task talos:generate-config
# Apply the config to the node
task talos:apply-node IP=? MODE=?
# e.g. task talos:apply-node IP=10.10.10.10 MODE=auto
Tip
Ensure the talosVersion and kubernetesVersion in talconfig.yaml are up-to-date with the version you wish to upgrade to.
# Upgrade node to a newer Talos version
task talos:upgrade-node IP=?
# e.g. task talos:upgrade-node IP=10.10.10.10
# Upgrade cluster to a newer Kubernetes version
task talos:upgrade-k8s
# e.g. task talos:upgrade-k8s
Below is a general guide to debugging an issue with a resource or application, for example when a workload/resource is not showing up, or a pod has started but is stuck in a CrashLoopBackOff or Pending state.
- Start by checking all Flux Kustomizations, GitRepositories, and OCIRepositories and verify they are up-to-date and in a ready state.
flux get sources oci -A
flux get sources git -A
flux get ks -A
flux get all -A
- Force Flux to sync your repository to your cluster:
flux -n flux-system reconcile ks flux-system --with-source
- Verify all the Flux Helm Releases are up-to-date and in a ready state.
flux get hr -A
- Then check if the pod is present.
kubectl -n <namespace> get pods -o wide
- Then check the logs of the pod if it is there.
kubectl -n <namespace> logs <pod-name> -f
Note: If a resource exists, running kubectl -n <namespace> describe <resource> <name> might give you insight into what the problem(s) could be.
Huge shout out to @onedr0p and the k8s@Home community!