My First Infrastructure Skeleton: From Manual Pain to IaC Sanity
Senior full-stack dev with an AI twist. I build weirdly useful things on my own infrastructure — often before coffee.
Setting up infrastructure from scratch is often romanticized — until you're face to face with a blinking cursor and nothing installed. This post walks through how I built my initial setup manually, tested everything hands-on, and prepared the system for a future IaC-based rebuild.
1. Server & DNS Setup
Hetzner VPS (CX22)
- Ubuntu base image
- Clean slate environment — no preinstalled surprises
Cloudflare DNS (Proxy Mode)
- DNS with built-in DDoS mitigation
- IP masking and active edge protection
2. Core Stack: Half-Manual, Half-IaC
Instead of full automation from day one, I opted for a hybrid approach: docker-compose files organized in a central repo, with services running entirely in Docker for simplicity and consistency.
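As a sketch of that layout (repo layout, image, and network name are assumptions, not my actual setup), each app ships its own docker-compose file and joins a shared proxy network that Traefik watches:

```yaml
# docker-compose.yml for one PoC app (hypothetical names)
services:
  app:
    image: ghcr.io/example/poc-app:latest   # placeholder image
    restart: unless-stopped
    networks:
      - proxy                               # shared network Traefik routes on

networks:
  proxy:
    external: true                          # created once: docker network create proxy
```

Keeping the network external means every app's compose file stays independent while still being reachable by the reverse proxy.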
Deployment Approach
Initial deployments used GitHub Actions, but due to permission complexity and audit concerns, I switched to manual root@vps deployments. Not elegant, but stable and predictable.
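The manual flow is roughly the following (host alias and repo path are placeholders, not my actual values):

```shell
# hypothetical manual deployment run from a workstation
ssh root@vps <<'EOF'
set -e
cd /opt/infra              # placeholder path to the central compose repo
git pull --ff-only         # fetch the latest compose files
docker compose pull        # refresh images
docker compose up -d --remove-orphans
EOF
```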
Main Components
Traefik Reverse Proxy
- Automatic HTTPS via Let’s Encrypt
- Dynamic subdomain routing
- Docker integration
- Dashboard: http://localhost:9000
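A minimal Traefik v2 compose service along these lines covers the bullets above (email, resolver name, and mount paths are placeholders):

```yaml
services:
  traefik:
    image: traefik:v2.11
    command:
      - --providers.docker=true                  # Docker integration
      - --providers.docker.exposedbydefault=false
      - --entrypoints.websecure.address=:443
      - --certificatesresolvers.le.acme.tlschallenge=true
      - --certificatesresolvers.le.acme.email=admin@example.com   # placeholder
      - --certificatesresolvers.le.acme.storage=/letsencrypt/acme.json
    ports:
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./letsencrypt:/letsencrypt               # persist issued certificates
```

Per-service routing then happens through container labels such as `traefik.http.routers.app.rule=Host(...)`, so new subdomains need no proxy restart.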
Cloudflare Tunnel
- One tunnel for public-facing services
- One tunnel for internal/admin interfaces
- All sensitive admin UIs are accessible only through tunnel routing
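Each tunnel is driven by a cloudflared config file; a sketch for the internal/admin tunnel might look like this (tunnel ID and hostnames are placeholders):

```yaml
# /etc/cloudflared/config.yml (hypothetical values)
tunnel: <tunnel-id>
credentials-file: /etc/cloudflared/<tunnel-id>.json
ingress:
  - hostname: grafana.example.com
    service: http://grafana:3000      # admin UI reachable only via the tunnel
  - hostname: traefik.example.com
    service: http://traefik:9000
  - service: http_status:404          # catch-all: reject everything else
```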
Shared Services
- PostgreSQL and Temporal
- Shared across all PoC apps, with future isolation options for scaling
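A compose sketch for the shared pair, assuming Temporal's auto-setup image (credentials, versions, and env values are placeholders; verify them against the image docs for your version):

```yaml
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: temporal          # placeholder credentials
      POSTGRES_PASSWORD: changeme
    volumes:
      - pgdata:/var/lib/postgresql/data

  temporal:
    image: temporalio/auto-setup:latest
    environment:
      DB: postgres12                   # driver name expected by auto-setup; confirm for your version
      POSTGRES_SEEDS: postgres         # hostname of the DB service above
      POSTGRES_USER: temporal
      POSTGRES_PWD: changeme
    depends_on:
      - postgres

volumes:
  pgdata:
```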
Monitoring Stack (Separate blog post coming soon)
- Prometheus, Grafana, Loki, Promtail
- Host metrics via node-exporter, container metrics via cAdvisor
- Logs collected via Docker socket (not log drivers)
- Preconfigured Grafana dashboards at http://localhost:9120
- Ready for AI-based log analysis and debugging bottlenecks
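The "Docker socket, not log drivers" point maps to Promtail's Docker service discovery; a minimal scrape config, assuming the default socket path, looks roughly like this:

```yaml
# Promtail scrape config sketch (label choices are assumptions)
scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'                 # strip the leading slash from container names
        target_label: container
```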
3. Debug-Driven DevOps
I tested and debugged every service manually in the terminal — not just to make it work, but to understand why it works (or fails). This included:
- Verifying Traefik certificate renewals
- Troubleshooting Cloudflare tunnel and DNS behavior
- Dealing with Loki label configuration and Promtail edge cases
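Most of those checks reduce to a handful of one-liners (container names and hostnames below are placeholders):

```shell
# hypothetical spot checks used during manual debugging
docker logs traefik 2>&1 | grep -i acme          # did the cert issuance/renewal fire?
echo | openssl s_client -connect app.example.com:443 2>/dev/null \
  | openssl x509 -noout -dates                   # certificate validity window
cloudflared tunnel list                          # is the tunnel registered and up?
dig +short app.example.com                       # does DNS resolve to Cloudflare's edge?
```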
4. Snapshot → Wipe → Rebuild
After validating the entire stack, I created a snapshot and wiped the server. Clean state, no cruft.
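With the hcloud CLI, the snapshot-then-wipe step is roughly two commands (server name is a placeholder; double-check the flags against your CLI version):

```shell
hcloud server create-image --type snapshot --description "validated stack" my-vps
hcloud server rebuild --image ubuntu-24.04 my-vps   # wipe back to a clean base image
```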
What’s next?
Rebuilding the same infrastructure with Pulumi, with Kubernetes likely following later. For now, the goal is controlled complexity — and a better understanding of where automation makes sense.
TL;DR
Built an initial infrastructure setup with:
- Traefik for HTTPS and routing
- Docker-based monitoring stack (Prometheus, Grafana, Loki)
- Temporal for orchestrated background workflows
- PostgreSQL for shared persistence
- Cloudflare Tunnel for secure admin access
- Manual root deployments (IaC in progress)
Next step: Pulumi-based automation, and eventually Kubernetes — but only when the payoff outweighs the extra complexity.