Production Homelab Infrastructure

July 19, 2023 Infrastructure DevOps

Project Overview

Built a production-grade homelab on the Proxmox VE hypervisor with consolidated Docker services, TrueNAS-backed cloud storage, and a full monitoring stack. The system gives my family and me a reliable platform for everyday use and doubles as a learning environment for enterprise technologies.

Homelab Infrastructure Specifications

This project runs on my dedicated homelab setup:

Architecture Design

Infrastructure Components

# Core Services Docker Compose Stack
version: '3.8'
services:
  nginx-proxy-manager:
    image: jc21/nginx-proxy-manager:latest
    container_name: npm
    restart: always
    ports:
      - '80:80'
      - '81:81'
      - '443:443'
    volumes:
      - /root/nginx/data:/data
      - /root/nginx/letsencrypt:/etc/letsencrypt

  pihole:
    image: pihole/pihole:latest
    container_name: pihole
    restart: unless-stopped
    networks:
      wireguard-pihole:
        ipv4_address: 172.20.0.3
    ports:
      - 53:53/udp
      - 53:53/tcp
      - 4443:443/tcp
      - 8080:80/tcp
    environment:
      - TZ=America/Vancouver
      - FTLCONF_RATE_LIMIT=20000/60

networks:
  wireguard-pihole:
    external: true  # assumption: created outside this file with a subnet covering 172.20.0.3

The infrastructure includes:

Service Integration Pipeline

Implemented a layered service architecture for security and functionality:

  1. Network Layer: WireGuard VPN providing secure remote access with Pi-hole DNS filtering
  2. Reverse Proxy: NGINX Proxy Manager with Let's Encrypt SSL certificates and custom domains
  3. Storage Layer: TrueNAS VM with SMB shares integrated into Nextcloud container
  4. Application Layer: Dockerized services including Nextcloud, Jellyfin, and monitoring stack
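
As a rough sketch of how the network layer could be wired, the fragment below puts a WireGuard container on the same wireguard-pihole network as Pi-hole and hands VPN peers the Pi-hole address (172.20.0.3) as their DNS server. The linuxserver/wireguard image, the 172.20.0.2 address, the volume path, and the peer count are assumptions for illustration, not the configuration actually in use.

```yaml
services:
  wireguard:
    image: lscr.io/linuxserver/wireguard:latest  # assumed image
    container_name: wireguard
    restart: unless-stopped
    cap_add:
      - NET_ADMIN
    environment:
      - SERVERPORT=51820
      - PEERS=2               # hypothetical number of client configs
      - PEERDNS=172.20.0.3    # Pi-hole's address on the shared network
    volumes:
      - /root/wireguard/config:/config  # hypothetical path
    ports:
      - '51820:51820/udp'
    sysctls:
      - net.ipv4.conf.all.src_valid_mark=1
    networks:
      wireguard-pihole:
        ipv4_address: 172.20.0.2  # assumed free address next to Pi-hole

networks:
  wireguard-pihole:
    external: true  # assumed to exist with a subnet containing 172.20.0.x
```

With this layout, only the WireGuard UDP port needs to be reachable from outside, and every VPN client resolves names through Pi-hole.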

Technical Implementation

Nextcloud Cloud Storage Stack

services:
  db:
    image: mariadb:10.6
    container_name: mariadb
    restart: unless-stopped
    command: --transaction-isolation=READ-COMMITTED --log-bin=binlog --binlog-format=ROW
    volumes:
      - /root/mariadb/db:/var/lib/mysql
    environment:
      - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD}
      - MYSQL_DATABASE=nextcloud
      - MYSQL_USER=nextcloud
      - MYSQL_PASSWORD=${MYSQL_PASSWORD}  # assumed variable; MYSQL_USER needs a password to be created

  app:
    image: nextcloud:30.0.0
    container_name: nextcloud
    restart: unless-stopped
    ports:
      - 8181:80  # Custom port to avoid Pi-hole conflict
    volumes:
      - /root/nextcloud/html:/var/www/html
      - /mnt/center:/data
    environment:
      - MYSQL_HOST=db
      - REDIS_HOST=redis
      - PHP_MEMORY_LIMIT=21G
      - PHP_UPLOAD_LIMIT=20G  # the variable the nextcloud image reads for upload size
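
The app service sets REDIS_HOST=redis, but the excerpt does not show a matching service. A minimal sketch of one, assuming the official redis image and no password, could sit alongside db and app in the same file:

```yaml
services:
  redis:
    image: redis:alpine      # assumed image tag
    container_name: redis
    restart: unless-stopped
```

Because all three services share the file's default network, the app container can reach it by the hostname redis without publishing any ports.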

Monitoring and Media Services

Deployed a monitoring stack with Prometheus and Grafana alongside a Jellyfin media server for family use. All services use the centralized TrueNAS storage through SMB mounts, which gives unified data management across the entire infrastructure.
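
A compose file for these services might look roughly like the sketch below. The image names are the publicly available ones, but the host paths, the named volume, and the choice to mount /mnt/center read-only into Jellyfin are assumptions for illustration rather than the exact configuration used here.

```yaml
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    ports:
      - '9090:9090'
    volumes:
      # hypothetical host path to the scrape configuration
      - /root/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - '3000:3000'
    volumes:
      - grafana-data:/var/lib/grafana  # named volume for dashboards and users

  jellyfin:
    image: jellyfin/jellyfin:latest
    container_name: jellyfin
    restart: unless-stopped
    ports:
      - '8096:8096'
    volumes:
      - /root/jellyfin/config:/config  # hypothetical path
      - /mnt/center:/media:ro          # SMB-backed TrueNAS mount, read-only

volumes:
  grafana-data:
```

Mounting the media share read-only is a deliberate choice in this sketch: Jellyfin only needs to read the library, so a compromised or misbehaving container cannot modify the shared data.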

Challenges & Solutions

Challenge 1: Storage Permissions and GPU Passthrough

Problem: Root-owned permissions on the Proxmox host blocked access to /mnt/center from the unprivileged LXC container, which prevented proper SMB integration

Solution: After researching the Proxmox forums and Stack Overflow, I mirrored the /mnt/center structure on the datacenter node and inside the container and configured the mount points to match. This gave the container full read/write/execute access to the 12TB storage array.

Challenge 2: DNS Security Incident

Problem: A misconfigured Pi-hole, combined with a port left open for an old game server, caused a massive DNS leak that led to 2 days of brute-force attacks and a network slowdown

Solution: Implemented proper DHCP configuration, closed the vulnerable ports, and set rate limiting (FTLCONF_RATE_LIMIT=20000/60). I also tightened operational security practices and now run regular port scans.
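
One way to keep the DNS ports from being reachable beyond the LAN is to publish them on a specific host address instead of all interfaces. The fragment below is a sketch of that idea, not the exact fix applied here; 192.168.1.10 is a hypothetical LAN address for the Docker host, and the rate limit is the value quoted in the solution.

```yaml
services:
  pihole:
    image: pihole/pihole:latest
    ports:
      # host_ip:host_port:container_port keeps port 53 off other interfaces
      - '192.168.1.10:53:53/udp'
      - '192.168.1.10:53:53/tcp'
    environment:
      # allow each client at most 20000 queries per 60 seconds
      - FTLCONF_RATE_LIMIT=20000/60
```

Binding to one address limits exposure at the Docker level, so a stray port-forward rule on the router is less likely to turn the resolver into an open one.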

Results & Impact

Performance Improvements

Family and Personal Impact

Lessons Learned

  1. ECC RAM Importance: Without ECC RAM, data integrity depends on enhanced monitoring, regular restarts, and a comprehensive backup strategy
  2. Security First: Proper network configuration is essential; a single misconfigured service can expose the entire infrastructure to attack
  3. Storage Path Planning: Container volume mappings need careful planning to avoid permission conflicts in virtualized environments
  4. Service Consolidation: Running lightweight services in LXC containers uses resources more efficiently than giving each its own VM

Future Enhancements