Where We Are Today
What started as a minimal single-node Kubernetes cluster has evolved into a production-ready platform with microservices, comprehensive observability, and automated workflows. This update captures the current state of the project and the journey from bare infrastructure to a working application platform.
Infrastructure Foundation
The core infrastructure remains focused on simplicity and cost-effectiveness:
Terraform-Based Infrastructure as Code
- Hetzner Cloud Server: CPX21 (2 vCPU, 4GB RAM) running Ubuntu 22.04
- Kubernetes 1.30: Single-node cluster with kubeadm + containerd
- Networking: Flannel CNI for pod networking
- Automation: Cloud-init templates for reproducible deployments
The infrastructure layer is fully defined in Terraform modules:
terraform/
├── hetzner/minimal/          # Server provisioning
│   ├── main.tf               # Compute, network, firewall
│   └── templates/
│       └── kube-init.yaml    # Kubernetes bootstrap
└── kubernetes/
    ├── monitoring/           # Observability stack
    └── todo-backend/         # Microservice deployment
Recent Achievement: Memory Optimization
A recent milestone was cutting the monitoring stack's memory consumption, which is essential for running everything on a 4GB single-node cluster.
31% Memory Reduction
Through careful tuning, we reduced the monitoring stack's memory footprint from ~1.3GB to ~0.9GB:
Component Optimizations:
- Prometheus: 256Mi/512Mi limits, 7-day retention, 20Gi storage
- Grafana: 128Mi/256Mi limits, 5Gi storage
- Loki: 128Mi/256Mi limits, 10Gi storage, 7-day retention
- AlertManager: 64Mi/128Mi limits, 2Gi storage
- Node Exporter: 32Mi/64Mi limits
- Kube State Metrics: 64Mi/128Mi limits
Key Changes:
- Reduced data retention (15d → 7d for Prometheus)
- Increased scrape intervals to 60s
- Added rate limits for Loki log ingestion
- Set explicit resource limits for all components
This optimization leaves sufficient headroom for application workloads while maintaining comprehensive observability.
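The arithmetic behind these figures can be checked with a short sketch. It assumes each pair in the component list reads as request/limit (the list itself does not say so), and the function names are illustrative only.

```rust
/// Percentage reduction from `before` to `after`, rounded to a whole percent.
fn percent_reduction(before: f64, after: f64) -> u32 {
    ((before - after) / before * 100.0).round() as u32
}

/// Sum a list of (name, request MiB, limit MiB) entries into (requests, limits).
fn totals(components: &[(&str, u32, u32)]) -> (u32, u32) {
    components
        .iter()
        .fold((0, 0), |(req, lim), &(_, r, l)| (req + r, lim + l))
}

fn main() {
    // Assumption: "256Mi/512Mi" in the list above means request/limit.
    let components = [
        ("prometheus", 256, 512),
        ("grafana", 128, 256),
        ("loki", 128, 256),
        ("alertmanager", 64, 128),
        ("node-exporter", 32, 64),
        ("kube-state-metrics", 64, 128),
    ];
    let (requests, limits) = totals(&components);
    println!("total requests: {requests}Mi, total limits: {limits}Mi");
    println!("reduction: {}%", percent_reduction(1.3, 0.9));
}
```

Under that reading the listed components request 672Mi in total and are capped at 1344Mi, so the worst case to budget against the 4GB node is the sum of the limits. Promtail is not in the list, so the totals are partial.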
First Microservice: Todo Backend
The platform now hosts its first production-ready microservice - a RESTful API built in Rust.
Architecture Highlights
Technology Stack:
- Language: Rust 1.75+ for performance and safety
- Framework: Axum for async HTTP handling
- Storage: In-memory with PostgreSQL-ready repository pattern
- Container: Alpine-based image (~15-20MB)
Clean Architecture Pattern:
HTTP Request
↓
Handlers (API layer)
↓
Services (Business logic)
↓
Repository Trait (Abstraction)
↓
In-Memory Implementation (Arc<RwLock<HashMap>>)
API Capabilities
The service exposes a RESTful API with full CRUD operations:
- GET /health: Kubernetes readiness/liveness probes
- GET /metrics: Prometheus metrics endpoint
- POST /api/v1/todos: Create todo items
- GET /api/v1/todos: List todos with filtering
- GET /api/v1/todos/{id}: Retrieve specific todo
- PUT /api/v1/todos/{id}: Update todo
- DELETE /api/v1/todos/{id}: Delete todo
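To make the route table concrete, here is a minimal std-only sketch of how a method and path map onto these operations. The real service delegates routing to Axum; the `Route` enum and `resolve` function are hypothetical names used only for illustration, and the query string is simply ignored here.

```rust
/// One variant per endpoint in the list above.
#[derive(Debug, PartialEq)]
enum Route<'a> {
    Health,
    Metrics,
    CreateTodo,
    ListTodos,
    GetTodo(&'a str),
    UpdateTodo(&'a str),
    DeleteTodo(&'a str),
}

/// Map an HTTP method and path to a route, or `None` if nothing matches.
fn resolve<'a>(method: &str, path: &'a str) -> Option<Route<'a>> {
    const BASE: &str = "/api/v1/todos";
    // Drop any query string so filtered list requests still match.
    let path = path.split('?').next().unwrap_or(path);
    match (method, path) {
        ("GET", "/health") => Some(Route::Health),
        ("GET", "/metrics") => Some(Route::Metrics),
        ("POST", BASE) => Some(Route::CreateTodo),
        ("GET", BASE) => Some(Route::ListTodos),
        _ => {
            // Anything else must look like /api/v1/todos/{id} with a non-empty id.
            let id = path.strip_prefix(BASE)?.strip_prefix('/')?;
            if id.is_empty() || id.contains('/') {
                return None;
            }
            match method {
                "GET" => Some(Route::GetTodo(id)),
                "PUT" => Some(Route::UpdateTodo(id)),
                "DELETE" => Some(Route::DeleteTodo(id)),
                _ => None,
            }
        }
    }
}

fn main() {
    println!("{:?}", resolve("GET", "/api/v1/todos/42"));
    println!("{:?}", resolve("PATCH", "/api/v1/todos/42"));
}
```

Unsupported method and path combinations fall through to `None`, which a real handler would turn into a 404 or 405 response.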
Data Model:
{
  "id": "uuid",
  "user_id": "string",
  "title": "string",
  "description": "string | null",
  "due_date": "ISO 8601 datetime | null",
  "severity": "low | medium | high | critical",
  "created_at": "ISO 8601 datetime",
  "updated_at": "ISO 8601 datetime"
}
Production-Ready Features
Observability:
- Prometheus metrics (todo_operations_total, active_todos_total)
- Structured JSON logging with correlation IDs
- Health check endpoints for Kubernetes probes
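As a rough illustration of what sits behind the two metric names, the sketch below keeps them in atomics and renders the Prometheus text exposition format by hand. The service most likely uses a metrics crate instead; the `Metrics` type, its methods, and the choice of counter versus gauge are assumptions made for this example.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

/// Process-wide values behind the two metric names mentioned above.
#[derive(Default)]
struct Metrics {
    operations: AtomicU64, // todo_operations_total: counter, only increases
    active: AtomicI64,     // active_todos_total: gauge, moves both ways
}

impl Metrics {
    fn record_create(&self) {
        self.operations.fetch_add(1, Ordering::Relaxed);
        self.active.fetch_add(1, Ordering::Relaxed);
    }

    fn record_delete(&self) {
        self.operations.fetch_add(1, Ordering::Relaxed);
        self.active.fetch_sub(1, Ordering::Relaxed);
    }

    /// Render the text a GET /metrics handler would return.
    fn render(&self) -> String {
        format!(
            "# TYPE todo_operations_total counter\n\
             todo_operations_total {}\n\
             # TYPE active_todos_total gauge\n\
             active_todos_total {}\n",
            self.operations.load(Ordering::Relaxed),
            self.active.load(Ordering::Relaxed),
        )
    }
}

fn main() {
    let metrics = Metrics::default();
    metrics.record_create();
    metrics.record_create();
    metrics.record_delete();
    print!("{}", metrics.render());
}
```

Atomics let every request handler update the values without taking a lock, which matters when the same struct is shared across async tasks.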
Security Hardening:
- Non-root container (UID 1000)
- Read-only root filesystem
- All Linux capabilities dropped
- No privilege escalation allowed
- ClusterIP service (internal only)
Performance:
- Thread-safe in-memory storage with RwLock
- Optimized release build with LTO
- Minimal image size (15-20MB)
- Memory limits: 64Mi request, 128Mi max
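The layering from the Clean Architecture Pattern diagram and the RwLock-backed storage can be sketched with the standard library alone. This is a simplified illustration, not the service's actual code: the trait and type names are hypothetical, the `Todo` struct carries only a subset of the data model, and an integer id stands in for the UUID.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum Severity {
    Low,
    Medium,
    High,
    Critical,
}

#[derive(Clone, Debug, PartialEq)]
struct Todo {
    id: u64, // the real model uses a UUID; an integer keeps this sketch std-only
    user_id: String,
    title: String,
    severity: Severity,
}

/// Storage abstraction that the service layer depends on.
trait TodoRepository: Send + Sync {
    fn insert(&self, todo: Todo);
    fn get(&self, id: u64) -> Option<Todo>;
    fn delete(&self, id: u64) -> bool;
}

/// In-memory implementation; clones share one map through the Arc.
#[derive(Clone, Default)]
struct InMemoryRepo {
    todos: Arc<RwLock<HashMap<u64, Todo>>>,
}

impl TodoRepository for InMemoryRepo {
    fn insert(&self, todo: Todo) {
        self.todos.write().unwrap().insert(todo.id, todo);
    }
    fn get(&self, id: u64) -> Option<Todo> {
        // Many readers may hold the read lock at the same time.
        self.todos.read().unwrap().get(&id).cloned()
    }
    fn delete(&self, id: u64) -> bool {
        self.todos.write().unwrap().remove(&id).is_some()
    }
}

/// Service-layer function: it sees only the trait, never the concrete store.
fn create_todo(
    repo: &dyn TodoRepository,
    id: u64,
    user_id: &str,
    title: &str,
    severity: Severity,
) -> Todo {
    let todo = Todo { id, user_id: user_id.to_string(), title: title.to_string(), severity };
    repo.insert(todo.clone());
    todo
}

fn main() {
    let repo = InMemoryRepo::default();
    let handle = repo.clone(); // shares the same underlying map
    let todo = create_todo(&repo, 1, "alice", "Write post", Severity::High);
    println!("stored: {:?}", handle.get(1) == Some(todo));
    println!("deleted: {}", handle.delete(1));
}
```

Because `create_todo` takes `&dyn TodoRepository`, a PostgreSQL-backed type implementing the same trait could replace `InMemoryRepo` without touching the business logic.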
Deployment Automation
Helm Chart
A production-grade Helm chart packages the todo backend with:
- Configurable resource limits
- Horizontal Pod Autoscaler (HPA) support
- Ingress controller integration
- ServiceMonitor for Prometheus
- ConfigMap-based configuration
The chart lives at helm/charts/todo-backend/ and ships with its own documentation.
Terraform Module
Infrastructure-as-code deployment via terraform/kubernetes/todo-backend/:
- Automated namespace creation
- ConfigMap management
- Helm release orchestration
- Output values for service access
CI/CD Pipeline
GitHub Actions workflow (.github/workflows/todo-backend-ci.yml) automates:
- Testing: Cargo fmt, clippy, unit tests
- Build: Multi-stage Docker build with layer caching
- Publish: Push to GitHub Container Registry
- Validation: Semantic versioning checks
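What the semantic versioning check does exactly is not shown here, so the following is only a guess at its shape: a strict MAJOR.MINOR.PATCH parser that tolerates a leading "v". The function names are hypothetical, and pre-release or build suffixes are deliberately rejected to keep the sketch short.

```rust
/// Parse one numeric component, rejecting empty parts, signs, and leading zeros.
fn parse_component(part: &str) -> Option<u64> {
    let all_digits = !part.is_empty() && part.bytes().all(|b| b.is_ascii_digit());
    let leading_zero = part.len() > 1 && part.starts_with('0');
    if !all_digits || leading_zero {
        return None;
    }
    // Values too large for u64 fail here and yield None.
    part.parse().ok()
}

/// Parse "1.2.3" or "v1.2.3" into (major, minor, patch).
fn parse_version(tag: &str) -> Option<(u64, u64, u64)> {
    let parts: Vec<&str> = tag.strip_prefix('v').unwrap_or(tag).split('.').collect();
    match parts.as_slice() {
        [major, minor, patch] => Some((
            parse_component(major)?,
            parse_component(minor)?,
            parse_component(patch)?,
        )),
        // Too few or too many components.
        _ => None,
    }
}

fn main() {
    for tag in ["v1.4.0", "1.4", "1.04.0"] {
        println!("{tag}: {:?}", parse_version(tag));
    }
}
```

A pipeline step could call a check like this on the release tag and fail the build when it returns `None`.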
Monitoring Stack
The platform includes a comprehensive observability layer:
Components
- Prometheus: Time-series metrics database with 7-day retention
- Grafana: Visualization dashboards with pre-configured views
- Loki: Log aggregation and querying
- Promtail: Log shipping from Kubernetes pods
- AlertManager: Alert routing and notification
- Node Exporter: Host metrics collection
- Kube State Metrics: Kubernetes object state metrics
Custom Dashboards
Pre-configured Grafana dashboards for:
- Kubernetes cluster overview
- Node resource monitoring
- Pod metrics and logs
- Custom application metrics
All deployed via Terraform with optimized resource allocations.
Development Workflow
The project implements best practices for collaborative development:
Code Quality
- Pre-commit hooks: Automated validation before commits
- Conventional Commits: Semantic versioning via commit messages
- Claude Code Integration: AI-assisted code review via GitHub Actions
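The link between Conventional Commits and semantic versioning can be sketched as a small function: breaking changes bump the major version, `feat` the minor, `fix` the patch. The names `Bump`, `bump_for`, and `release_bump` are hypothetical, and the project's actual release tooling may apply different rules.

```rust
/// Size of the version bump a commit calls for, ordered from smallest to largest.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Bump {
    NoRelease,
    Patch,
    Minor,
    Major,
}

/// Classify one commit message of the form "type(scope)!: description".
fn bump_for(message: &str) -> Bump {
    let header = message.lines().next().unwrap_or("");
    let Some((prefix, _)) = header.split_once(": ") else {
        return Bump::NoRelease; // header does not follow the convention
    };
    // A "!" before the colon or a BREAKING CHANGE footer marks a breaking change.
    let breaking = prefix.ends_with('!') || message.contains("BREAKING CHANGE:");
    // Strip the "!" marker and any "(scope)" to get the bare type.
    let kind = prefix.trim_end_matches('!').split('(').next().unwrap_or("");
    match kind {
        _ if breaking => Bump::Major,
        "feat" => Bump::Minor,
        "fix" => Bump::Patch,
        _ => Bump::NoRelease,
    }
}

/// The release bump is the largest bump among all commits since the last tag.
fn release_bump(messages: &[&str]) -> Bump {
    messages.iter().map(|m| bump_for(m)).max().unwrap_or(Bump::NoRelease)
}

fn main() {
    let commits = ["fix: handle empty title", "feat(api): add filtering", "docs: update readme"];
    println!("{:?}", release_bump(&commits));
}
```

Deriving `Ord` on the enum lets `max()` pick the largest bump directly, so the variant order in the declaration is what encodes the precedence.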
Testing Strategy
- Unit tests: Comprehensive test coverage for services
- Integration tests: API endpoint validation
- Format checks: Automated code formatting (cargo fmt)
- Linting: Clippy with warnings-as-errors
Git Workflow
feature/branch → PR → Claude review → Tests → Merge → Deploy
Current Project Structure
hetzner-cloud-minimal-kubernetes-cluster/
├── .github/workflows/       # CI/CD automation
│   ├── claude-code-review.yml
│   ├── claude.yml
│   └── todo-backend-ci.yml
├── .scripts/                # Utility scripts
├── helm/charts/
│   └── todo-backend/        # Helm chart for microservice
├── newsletter-blog/         # This blog (Gatsby)
│   └── content/blog/        # Article content
├── services/backend/
│   └── todo/                # Rust todo service
│       ├── src/             # Application code
│       ├── tests/           # Integration tests
│       └── Dockerfile       # Container definition
└── terraform/
    ├── hetzner/minimal/     # Infrastructure provisioning
    └── kubernetes/
        ├── monitoring/      # Observability stack
        └── todo-backend/    # Service deployment
Lessons Learned
1. Memory Constraints Drive Design
Running a full stack on 4GB forced aggressive optimization:
- Reduced retention periods
- Tuned scrape intervals
- Set strict resource limits
- Monitored actual usage vs. limits
2. Repository Pattern Enables Flexibility
Starting with in-memory storage allowed rapid iteration while maintaining a clean migration path to PostgreSQL later.
3. Infrastructure as Code is Non-Negotiable
Every component - from servers to Helm releases - is version-controlled and reproducible.
4. Observability from Day One
Building metrics and logging into the first service prevented technical debt.
5. Security is Incremental
Each layer adds security (non-root containers, dropped capabilities, read-only filesystems, network policies).
What's Next
Short Term
- Frontend Development: React/Next.js UI for the todo application
- API Gateway: Kong or Traefik for routing and authentication
- PostgreSQL Integration: Migrate from in-memory to persistent storage
- Ingress Setup: Expose services with TLS termination
Medium Term
- Authentication: JWT-based auth with refresh tokens
- Service Mesh: Istio or Linkerd for advanced traffic management
- Backup Strategy: Automated PostgreSQL backups to Hetzner Object Storage
- Monitoring Alerts: PagerDuty integration for critical alerts
Long Term
- Multi-Service Architecture: Add more microservices
- AI/ML Integration: Model serving infrastructure
- GitOps: ArgoCD for continuous deployment
- Multi-Region: Expand beyond single-node cluster
Cost Analysis
Current monthly operating costs (EUR):
- Hetzner CPX21 Server: ~€5.83
- Traffic: Included (1TB free tier)
- Container Registry: Free (GitHub Container Registry)
- CI/CD: Free (GitHub Actions free tier)
Total: ~€6/month for a complete production-ready platform.
Key Metrics
As of December 20, 2025:
- Lines of Code: ~4,900 lines added in last 3 commits
- Terraform Modules: 3 (infrastructure, monitoring, todo-backend)
- Docker Images: 1 microservice (~15-20MB)
- API Endpoints: 7 (health, metrics, and five CRUD operations)
- Memory Footprint: ~900MB for monitoring stack
- Uptime: Configurable with HPA and health checks
Conclusion
What began as a simple Kubernetes learning project has matured into a production-capable platform. The combination of cost-effective infrastructure, modern development practices, and comprehensive observability creates a solid foundation for building distributed applications.
The project demonstrates that you don't need expensive cloud providers or complex setups to run production workloads - a €6/month Hetzner server with thoughtful architecture delivers a capable platform.
Every line of code is open source and available on GitHub.
Get Involved
Interested in following along or contributing?
- Explore the code: Check out the repository structure
- Try it yourself: Deploy your own cluster using the Terraform modules
- Share feedback: Open issues or PRs with improvements
- Follow updates: Watch the repo for new features
Next article: Deep dive into the Rust microservice architecture - clean code patterns, testing strategies, and performance optimization techniques.