Production Homelab Infrastructure
Project Overview
Built a production-grade homelab on the Proxmox VE hypervisor with consolidated Docker services, TrueNAS-backed storage, and comprehensive monitoring. The system provides a robust foundation for personal and family use while serving as a learning platform for enterprise technologies.
Homelab Infrastructure Specifications
This project runs on my dedicated homelab setup:
- CPU: Intel i9-9900K (8 cores, 16 threads) with Intel QuickSync
- RAM: 32GB DDR4 (non-ECC, which requires enhanced monitoring practices)
- Storage: 2x Seagate IronWolf 12TB drives in mirror configuration
- OS: Proxmox VE with Debian LXC containers and TrueNAS VM
Architecture Design
Infrastructure Components
The infrastructure includes:
- Proxmox VE Hypervisor: Primary virtualization platform with LXC containers and VMs
- TrueNAS VM: Dedicated storage management with SMB/NFS shares and RAM caching
- Docker LXC Container: Consolidated service deployment on Debian with Portainer
- Network Security: WireGuard VPN with Pi-hole DNS filtering and SSL termination
Service Integration Pipeline
Implemented a layered service architecture for security and functionality:
- Network Layer: WireGuard VPN providing secure remote access with Pi-hole DNS filtering
- Reverse Proxy: NGINX Proxy Manager with Let's Encrypt SSL certificates and custom domains
- Storage Layer: TrueNAS VM with SMB shares integrated into Nextcloud container
- Application Layer: Dockerized services including Nextcloud, Jellyfin, and monitoring stack
Technical Implementation
Nextcloud Cloud Storage Stack
Monitoring and Media Services
Deployed a comprehensive monitoring stack with Prometheus and Grafana alongside a Jellyfin media server for family use. All services use the centralized TrueNAS storage through SMB mounts, providing unified data management across the entire infrastructure.
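Because every service depends on the SMB mounts, a simple check that the shares are actually mounted is a useful monitoring signal. The sketch below is illustrative only and is not part of the deployed stack: it assumes a Linux host where SMB shares appear in /proc/mounts with filesystem type cifs or smb3, and the server name and function name are hypothetical.

```python
def smb_mounts(mounts_text: str) -> dict[str, str]:
    """Return {mount_point: remote_share} for SMB entries in /proc/mounts-style text."""
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] in ("cifs", "smb3"):
            found[fields[1]] = fields[0]
    return found


# Hypothetical sample; on a real host, read the text from /proc/mounts instead.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
//truenas.example/center /mnt/center cifs rw,relatime,vers=3.0 0 0
"""

if __name__ == "__main__":
    mounts = smb_mounts(SAMPLE)
    if "/mnt/center" not in mounts:
        print("SMB share is not mounted at /mnt/center")
    else:
        print("mounted from", mounts["/mnt/center"])
```

A check like this could run on a schedule and feed an alert, so a dropped share is noticed before Nextcloud or Jellyfin start failing.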
Challenges & Solutions
Challenge 1: Storage Permissions and GPU Passthrough
Problem: Root-owned permissions on the Proxmox host blocked access to /mnt/center from the unprivileged LXC container, which prevented proper SMB integration.
Solution: After researching the Proxmox forums and Stack Overflow, I created a matching /mnt/center structure on both the datacenter node and the container, with proper mount point configurations. This enabled full read/write/execute access to the 12TB storage array.
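The permission problem follows from how unprivileged containers shift user IDs. The sketch below is illustrative and not the exact configuration used here: it assumes Proxmox's default mapping (container IDs 0 to 65535 shifted to host IDs 100000 to 165535) and a bind mount of /mnt/center at the same path on both sides. All helper names are hypothetical.

```python
ID_OFFSET = 100000   # assumed default: first host ID used for the container
ID_RANGE = 65536     # assumed default: number of IDs mapped into the container
OVERFLOW_ID = 65534  # "nobody": how unmapped host IDs appear inside the container


def container_to_host(container_id: int) -> int:
    """Host UID/GID that a container UID/GID really runs as."""
    if not 0 <= container_id < ID_RANGE:
        raise ValueError(f"ID {container_id} is outside the mapped range")
    return ID_OFFSET + container_id


def host_to_container(host_id: int) -> int:
    """How a host UID/GID is seen from inside the container."""
    if ID_OFFSET <= host_id < ID_OFFSET + ID_RANGE:
        return host_id - ID_OFFSET
    return OVERFLOW_ID  # e.g. host root (0) is not mapped at all


def bind_mount_line(index: int, path: str) -> str:
    """Mount point entry for the container config, same path on both sides."""
    return f"mp{index}: {path},mp={path}"


if __name__ == "__main__":
    print(container_to_host(0))   # container root on the host
    print(host_to_container(0))   # host root as seen in the container
    print(bind_mount_line(0, "/mnt/center"))
```

Under these assumptions, files owned by host root appear inside the container as nobody, so the host directory has to be owned by, or writable for, the shifted ID (100000 for the container's root) before the container can write to it.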
Challenge 2: DNS Security Incident
Problem: A misconfigured Pi-hole, exposed through a port left open from an old game server, caused a massive DNS leak that resulted in two days of brute-force attacks and network slowdown.
Solution: Implemented proper DHCP configuration, closed vulnerable ports, and established rate limiting (FTLCONF_RATE_LIMIT=20000/60). Also tightened operational security practices and began regular port scanning.
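The rate limit value reads as "queries per client / interval in seconds", so 20000/60 allows each client up to 20000 queries in any 60-second window. The following is a simplified fixed-window sketch of that idea, not Pi-hole's actual implementation; the class and function names are hypothetical.

```python
def parse_rate_limit(value: str) -> tuple[int, int]:
    """Split a 'count/interval' string such as '20000/60' into two integers."""
    count, interval = value.split("/")
    return int(count), int(interval)


class FixedWindowLimiter:
    """Count queries per client in fixed windows of interval_s seconds."""

    def __init__(self, max_queries: int, interval_s: int) -> None:
        self.max_queries = max_queries
        self.interval_s = interval_s
        self._state: dict[str, tuple[int, int]] = {}  # client -> (window, count)

    def allow(self, client: str, now_s: float) -> bool:
        window = int(now_s // self.interval_s)
        seen_window, count = self._state.get(client, (window, 0))
        if seen_window != window:
            count = 0  # a new window started, so the counter resets
        count += 1
        self._state[client] = (window, count)
        return count <= self.max_queries


if __name__ == "__main__":
    max_queries, interval_s = parse_rate_limit("20000/60")
    limiter = FixedWindowLimiter(max_queries, interval_s)
    print(limiter.allow("192.168.1.50", now_s=0.0))
    print(f"average ceiling: {max_queries / interval_s:.0f} queries/s per client")
```

At 20000/60 the ceiling averages roughly 333 queries per second per client, which leaves ample room for normal household traffic while still capping an abusive client.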
Results & Impact
Performance Improvements
- Service Uptime: Achieved 99%+ uptime across all services, with automated machine restarts at 12am as a safeguard for data held in non-ECC RAM
- Storage Efficiency: 12TB of mirrored storage with an automated Proxmox backup system, plus copies of important files kept on my main computer
- Network Security: Zero successful breaches since implementing proper DNS and VPN configuration and adopting more stringent log review habits
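To put the uptime figure in context, the arithmetic below shows how much downtime a 99% target allows and how a nightly restart fits within it. This is a back-of-the-envelope sketch: the 5-minute restart duration is an assumed value for illustration, not a measurement from this system.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(uptime_pct: float, period_days: float) -> float:
    """Minutes of downtime allowed over a period at a given uptime percentage."""
    return period_days * MINUTES_PER_DAY * (1 - uptime_pct / 100)


def uptime_with_daily_restart(restart_minutes: float) -> float:
    """Uptime percentage if the only downtime is one restart per day."""
    return 100 * (1 - restart_minutes / MINUTES_PER_DAY)


if __name__ == "__main__":
    # 99% over 30 days leaves about 432 minutes (7.2 hours) of downtime.
    print(f"{downtime_budget_minutes(99, 30):.1f} min per 30 days")
    # 99% per day leaves about 14.4 minutes, the ceiling for a nightly restart.
    print(f"{downtime_budget_minutes(99, 1):.1f} min per day")
    # Assumed 5-minute restart (hypothetical figure).
    print(f"{uptime_with_daily_restart(5):.2f}% uptime")
```

As long as the nightly restart completes in under roughly 14 minutes and nothing else fails, the 99% target holds.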
Family and Personal Impact
- Data Independence: Successfully migrated the family from Google Drive/OneDrive to self-hosted Nextcloud, alongside my own web-based document platform
- Media Access: Jellyfin providing centralized access to books, courses, and media content
- Remote Administration: Secure access to homelab services from anywhere via WireGuard VPN
Lessons Learned
- ECC RAM Importance: Non-ECC RAM requires enhanced monitoring, regular restarts, and comprehensive backup strategies for data integrity
- Security First: Proper network configuration is essential; misconfigured services can expose the entire infrastructure to attacks
- Storage Path Planning: Container volume mapping requires careful planning to avoid permission conflicts in virtualized environments
- Service Consolidation: The LXC container approach provides better resource utilization than individual VMs for lightweight services
Future Enhancements
- ECC Storage Server: Dedicated storage system with ECC RAM to reduce the risk of memory-related data corruption and to separate storage from services, with an EPS
- Hardware Upgrade: Consider migrating compute workloads to 7950X3D system (16C/32T, 64GB RAM) for better resource utilization
- Service Mesh: Implement Traefik or Istio for more advanced service discovery and load balancing
- Monitoring Expansion: Add custom alerting and automated remediation for common infrastructure issues, beyond n8n, which has API costs