Home Lab: Kubernetes Cluster
Overview
The home lab runs a k3s cluster — a lightweight Kubernetes distribution well-suited for bare-metal and resource-constrained environments. k3s ships with sensible defaults (embedded SQLite or etcd, built-in Traefik, CoreDNS) and is managed like any other Kubernetes cluster once running.
GitOps: ArgoCD
All workloads are managed through ArgoCD using a GitOps workflow. The cluster state is defined entirely in a private Git repository; ArgoCD continuously reconciles the live cluster against that source of truth.
App of Apps
ArgoCD is bootstrapped with a single root Application that points at the argocd/ directory in the repo. That root app renders a Helm chart which generates individual ArgoCD Application resources for every service — one per app. This is the App of Apps pattern.
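As a rough illustration, a root Application of this kind might look like the sketch below. The repository URL, the Application name, the values file name, and the automated sync policy are assumptions made for the example; only the argocd/ path comes from the description above.

```yaml
# Sketch of a root "App of Apps" Application (names and URL are hypothetical).
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root                  # hypothetical name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/homelab/k8s-apps.git  # placeholder URL
    targetRevision: HEAD
    path: argocd              # directory containing the Helm chart that renders child Applications
    helm:
      valueFiles:
        - values.yaml         # assumed name of the root values file
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:                # assumption: automatic sync with pruning and self-healing
      prune: true
      selfHeal: true
```

Applying this one manifest is enough for ArgoCD to discover and create every child Application the chart renders.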
Apps are grouped into three categories:
| Category | Description |
|---|---|
| system | Core cluster infrastructure (ingress, DNS, storage, observability) |
| management | Cluster management tooling (secrets, DNS management, NetBox) |
| services | User-facing applications |
Each app is a self-contained Helm chart under apps/<category>/<name>/. Adding a new service means adding a chart directory and a single entry in the root values file — ArgoCD picks it up on the next sync.
Ingress: Traefik
Traefik is the cluster's ingress controller, deployed via the system/traefik app. It holds a single static LoadBalancer IP assigned by MetalLB.
All HTTP and HTTPS traffic enters the cluster through Traefik. Services expose themselves by creating a standard Kubernetes Ingress resource that declares a hostname; because internal hostnames resolve to Traefik's LoadBalancer IP, requests arrive at Traefik, which routes them to the correct backend based on the Host header.
HTTP traffic is automatically redirected to HTTPS at the Traefik level.
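A minimal Ingress for one service might look like the following sketch. The service name, namespace, port, and the traefik ingress class name are assumptions for illustration; the hostname follows the <service>.int.beckstrand.dev convention used for internal services.

```yaml
# Sketch of a host-based Ingress routed by Traefik (service details are hypothetical).
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  namespace: observability    # hypothetical namespace
spec:
  ingressClassName: traefik   # assumed IngressClass name
  rules:
    - host: grafana.int.beckstrand.dev
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana # hypothetical Service name
                port:
                  number: 80  # hypothetical Service port
```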
Load Balancer: MetalLB
MetalLB provides LoadBalancer IP allocation for bare-metal clusters (which have no cloud provider to do this automatically). It operates in Layer 2 mode, advertising IPs from a reserved range on the local network.
A small pool of IPs is reserved for cluster use. Most services no longer need their own LoadBalancer IP — they go through Traefik — but MetalLB is still used for services that require a dedicated IP (e.g. Blocky's DNS service on port 53).
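For reference, a Layer 2 setup in MetalLB is typically expressed as an IPAddressPool plus an L2Advertisement. The sketch below assumes a recent MetalLB release that uses CRDs; the resource names and the address range are placeholders, not the values used in this cluster.

```yaml
# Sketch of a MetalLB Layer 2 configuration (names and range are hypothetical).
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: homelab-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250   # placeholder range reserved outside the DHCP scope
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: homelab-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - homelab-pool                  # advertise only addresses from the pool above
```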
DNS: Blocky
Blocky is the cluster's DNS resolver, running as a high-availability deployment with its own dedicated LoadBalancer IP. Home network devices use this IP as their DNS server.
Blocky provides:
- Ad blocking: upstream block lists filter ads and trackers for all clients
- Internal name resolution: a customDNS mapping resolves *.int.beckstrand.dev to Traefik's LoadBalancer IP, keeping internal services off public DNS entirely
- Conditional forwarding: cluster.local queries are forwarded to CoreDNS for in-cluster service discovery; reverse DNS for the local subnet is forwarded to the home router
- Caching: responses are cached to reduce upstream query volume
Internal services are accessed at <service>.int.beckstrand.dev. These hostnames exist only in Blocky's custom DNS mapping and are never published to public DNS, so they are unreachable from outside the home network.
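The four functions above map fairly directly onto sections of Blocky's configuration file. The sketch below is an assumption-laden illustration: every IP address, the upstream resolver, the block list URL, and the cache duration are placeholders, and key names such as upstreams.groups and denylists follow recent Blocky releases (older releases use different names for those two keys).

```yaml
# Sketch of a Blocky config covering the four functions (all values are placeholders).
upstreams:
  groups:
    default:
      - https://one.one.one.one/dns-query   # example upstream, not necessarily the one in use

blocking:
  denylists:
    ads:
      - https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts  # example list
  clientGroupsBlock:
    default:
      - ads

customDNS:
  mapping:
    # A mapped domain also answers for its subdomains, which gives the wildcard behavior.
    int.beckstrand.dev: 192.168.1.240       # placeholder for Traefik's LoadBalancer IP

conditional:
  mapping:
    cluster.local: 10.43.0.10               # k3s default CoreDNS ClusterIP; verify for your cluster
    1.168.192.in-addr.arpa: 192.168.1.1     # placeholder reverse zone and router address

caching:
  minTime: 5m                               # placeholder minimum cache duration
```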
TLS: cert-manager + Let's Encrypt
cert-manager automates TLS certificate issuance and renewal. All services use certificates issued by Let's Encrypt.
Because internal services at *.int.beckstrand.dev are not publicly reachable, HTTP-01 validation is not an option. Instead, every certificate uses the DNS-01 challenge, which proves domain ownership by creating a temporary _acme-challenge TXT record in the public beckstrand.dev Cloudflare zone. cert-manager creates and removes this record automatically via the Cloudflare API.
This means internal services get valid, browser-trusted TLS certificates even though they are never exposed to the internet.
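A ClusterIssuer configured for DNS-01 through Cloudflare could look roughly like this. The issuer name matches the annotation used by services; the contact email and the names of the two Secrets are hypothetical.

```yaml
# Sketch of a Let's Encrypt ClusterIssuer using the Cloudflare DNS-01 solver.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-issuer
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com                  # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-issuer-account-key    # hypothetical Secret for the ACME account key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token      # hypothetical Secret holding the API token
              key: api-token
        selector:
          dnsZones:
            - beckstrand.dev
```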
Services opt in by annotating their Ingress resource:
cert-manager.io/cluster-issuer: letsencrypt-issuer
cert-manager then issues and renews the certificate automatically.
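In context, the annotation sits in the Ingress metadata, and cert-manager also needs a tls entry that names the Secret where the certificate should be stored. The fragment below is a sketch; the resource name, hostname, and Secret name are hypothetical, and the rules section is omitted for brevity.

```yaml
# Fragment of an Ingress that opts in to automatic certificates (names are hypothetical).
metadata:
  name: grafana
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-issuer
spec:
  tls:
    - hosts:
        - grafana.int.beckstrand.dev
      secretName: grafana-tls   # cert-manager writes the issued certificate into this Secret
```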
Secrets: External Secrets Operator + Doppler
Secrets are stored in Doppler and synced into the cluster by the External Secrets Operator (ESO).
All secrets live in a single Doppler project (k8s-apps) under keys prefixed by service name (e.g. GRAFANA_ADMIN_PASSWORD, MINIO_ROOT_USER). A ClusterSecretStore resource points ESO at that project.
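A ClusterSecretStore for the Doppler provider might be declared as in the sketch below. The store name, the token Secret and its namespace, and the Doppler config name are assumptions; only the k8s-apps project name comes from the text. The apiVersion depends on the installed ESO release (newer releases serve external-secrets.io/v1).

```yaml
# Sketch of a ClusterSecretStore pointing ESO at Doppler (names are hypothetical).
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: doppler                       # hypothetical store name
spec:
  provider:
    doppler:
      project: k8s-apps
      config: prd                     # placeholder Doppler config (environment)
      auth:
        secretRef:
          dopplerToken:
            name: doppler-token       # hypothetical Secret holding the service token
            key: dopplerToken
            namespace: external-secrets
```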
Each service that needs secrets declares an ExternalSecret resource. ESO uses a dataFrom.find selector with a regexp to pull all keys matching the service's prefix, then strips the prefix via a rewrite rule so the resulting Kubernetes Secret contains bare key names (e.g. ADMIN_PASSWORD) — matching what the downstream Helm chart expects, without any changes to the chart itself.
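The prefix-matching and rewrite steps can be sketched as follows for a Grafana-style service. The resource names, namespace, store name, and refresh interval are assumptions; the apiVersion depends on the installed ESO release.

```yaml
# Sketch of an ExternalSecret that pulls GRAFANA_* keys and strips the prefix.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: grafana
  namespace: observability        # hypothetical namespace
spec:
  refreshInterval: 1h             # placeholder sync interval
  secretStoreRef:
    kind: ClusterSecretStore
    name: doppler                 # hypothetical store name
  target:
    name: grafana                 # name of the Kubernetes Secret to create
  dataFrom:
    - find:
        name:
          regexp: "^GRAFANA_"     # select every key with the service prefix
      rewrite:
        - regexp:
            source: "^GRAFANA_(.*)$"
            target: "$1"          # GRAFANA_ADMIN_PASSWORD becomes ADMIN_PASSWORD
```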
Storage
Two storage layers are in use:
Longhorn is the default StorageClass for general persistent volumes. It provides replicated block storage across cluster nodes and integrates with Kubernetes PVCs directly.
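A claim against Longhorn is an ordinary PersistentVolumeClaim. In this sketch the claim name and size are placeholders, and longhorn is assumed to be the StorageClass name (the Longhorn chart's default); since Longhorn is the default class, storageClassName could also be omitted.

```yaml
# Sketch of a PVC backed by Longhorn (name and size are hypothetical).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn   # assumed class name; optional when it is the default
  resources:
    requests:
      storage: 5Gi
```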
MinIO provides S3-compatible object storage for services that need it. Observability backends (Mimir for metrics, Loki for logs) use MinIO buckets as their long-term storage layer instead of local disk, which simplifies backup and avoids tying large datasets to specific nodes.
Observability
The observability stack is built around the Prometheus ecosystem:
| Component | Role |
|---|---|
| kube-prometheus-stack | Prometheus operator, alerting rules, node exporters |
| Mimir | Long-term metrics storage (backed by MinIO) |
| Loki | Log aggregation (backed by MinIO) |
| Vector | Log collection agent — ships logs from all pods to Loki |
| Grafana | Dashboards and visualization, sourcing from both Mimir and Loki |
| Blackbox Exporter | Synthetic probes for endpoint availability monitoring |
Grafana dashboards are version-controlled as JSON files in the k8s-apps repo and loaded automatically via a sidecar that watches for labelled ConfigMaps.
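A dashboard ConfigMap of the kind the sidecar picks up might look like the sketch below. The grafana_dashboard label with value "1" is the usual default in the Grafana and kube-prometheus-stack Helm charts, but the label actually configured here is an assumption, as are the names and the trivial dashboard body.

```yaml
# Sketch of a labelled ConfigMap carrying one dashboard (names and label are assumptions).
apiVersion: v1
kind: ConfigMap
metadata:
  name: dashboard-example
  namespace: observability      # hypothetical namespace
  labels:
    grafana_dashboard: "1"      # label the sidecar is assumed to watch for
data:
  example.json: |
    {
      "title": "Example",
      "uid": "example",
      "panels": []
    }
```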