Portability & Migrations
HALO’s Docker-based architecture on Nexus provides exceptional portability, enabling rapid migration to new hardware, recovery from failures, and seamless environment cloning. This document outlines the strategies, procedures, and best practices for moving HALO infrastructure between hosts.
Docker Portability Benefits
Container Independence
Docker containers provide consistent runtime environments regardless of the underlying host:
OS Independence: Containers run identically on Ubuntu 20.04, 22.04, 24.04, or other Linux distributions
Hardware Abstraction: CPU architecture differences (x86_64, ARM64) are handled by multi-arch images where available
Configuration Portability: Compose files and environment variables define the entire stack, no hidden dependencies
Volume-Based State
All persistent state resides in Docker volumes or bind mounts:
PostgreSQL Data: Database files in named volume or bind mount
Frigate Recordings: Camera footage in host directory, easily moved or archived
Home Assistant Config: YAML files in monorepo, state in volumes
n8n Workflows: Stored in PostgreSQL, backed by volume persistence
This separation of compute (containers) from state (volumes) enables clean migrations.
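The pattern shows up directly in a compose file. The fragment below is a sketch only; the image tags and the `ha_config` volume name are assumptions, though `postgres_data` and the Frigate bind mount match the examples elsewhere in this document:

```yaml
# Sketch: compute is the disposable container image,
# state lives in a named volume or a host bind mount.
services:
  postgres-db:
    image: postgres:16        # compute: re-pulled on any host
    volumes:
      - postgres_data:/var/lib/postgresql/data   # state: named volume
  frigate:
    image: ghcr.io/blakeblackshear/frigate:stable
    volumes:
      - /mnt/storage/frigate:/media/frigate      # state: bind mount

volumes:
  postgres_data:
```

Migration then reduces to moving the volume and the bind-mount directory; the images themselves are simply re-pulled on the new host.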
Migration Scenarios
Scenario 1: Hardware Upgrade
Use Case: Replacing aging mini-PC with new hardware for improved performance
Downtime: 2-4 hours planned maintenance
Procedure:
1. Backup Phase:
   - Export all Docker volumes to tar archives
   - Dump PostgreSQL database to SQL file
   - Copy monorepo configuration to portable storage
   - Document USB device mappings (Coral TPU, Zigbee dongle)
2. New Host Setup:
   - Install Ubuntu Server LTS on new hardware
   - Install Docker and Docker Compose
   - Clone HALO monorepo from Git
   - Create Docker networks
3. Restore Phase:
   - Import volume archives to new host
   - Restore PostgreSQL from SQL dump
   - Deploy containers using compose files
   - Attach USB devices and verify device paths
4. Validation:
   - Verify all containers are healthy
   - Test Home Assistant device communication
   - Confirm Frigate camera streams
   - Validate n8n workflow execution
   - Check Traefik routing and TLS certificates
Rollback Plan: Keep the old hardware powered off but intact, ready to boot again if issues arise on the new host
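The backup phase above can be sketched as a single script. This is a dry-run sketch, not HALO's actual tooling: the volume names (`postgres_data`, `ha_config`, `n8n_data`), the `postgres-db` container name, and the backup path are all assumptions to adapt.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the Scenario 1 backup phase.
# Volume names, container name, and paths are assumptions.
set -eu

BACKUP_ROOT="${BACKUP_ROOT:-/tmp/halo-migration}"
DEST="$BACKUP_ROOT/$(date +%Y%m%d_%H%M%S)"
VOLUMES="postgres_data ha_config n8n_data"   # assumed volume names

plan() {
  echo "mkdir -p $DEST"
  for vol in $VOLUMES; do
    # Export each named volume to a tar archive
    echo "docker run --rm -v ${vol}:/source -v ${DEST}:/backup ubuntu tar czf /backup/${vol}.tar.gz -C /source ."
  done
  # Dump the database and record USB device mappings
  echo "docker exec postgres-db pg_dumpall -U postgres > ${DEST}/halo_full_backup.sql"
  echo "ls -l /dev/serial/by-id > ${DEST}/usb-devices.txt"
}

plan   # prints the steps; pipe through `bash` to execute for real
```

Printing the commands first makes the backup reviewable before anything touches the volumes; on the live host, `plan | bash` executes each step.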
Scenario 2: Disaster Recovery
Use Case: Hardware failure requiring emergency restoration
Downtime: 4-8 hours (depends on backup freshness and new hardware availability)
Procedure:
1. Acquire Replacement Hardware:
   - Purchase or repurpose compatible hardware
   - Minimum specs: 6-core CPU, 16GB RAM, 500GB storage
2. Rapid Deployment:
   - Use pre-built deployment scripts from the monorepo
   - Restore most recent backup (automated or manual)
   - Deploy essential services first (Traefik, PostgreSQL, Home Assistant)
   - Deploy remaining services in priority order
3. Service Restoration Priority:
   - Critical: Traefik, PostgreSQL, Home Assistant, Mosquitto
   - High: Frigate, n8n, Zigbee2MQTT
   - Medium: Grafana, Node-RED
   - Low: Watchtower, Omnia API
4. Post-Recovery:
   - Verify automation resumption
   - Check for data loss (recent events, recordings)
   - Update DNS or network configuration if host IP changed
   - Document lessons learned and update backup procedures
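The restoration priority can be encoded as a tiered deploy loop. A dry-run sketch, assuming a `nexus/compose/<service>.yml` file layout that may not match the real tree:

```shell
#!/usr/bin/env bash
# Dry-run sketch of priority-ordered restoration.
# The compose file layout is an assumption.
set -eu

TIER_CRITICAL="traefik postgres home-assistant mosquitto"
TIER_HIGH="frigate n8n zigbee2mqtt"
TIER_MEDIUM="grafana node-red"
TIER_LOW="watchtower omnia-api"

deploy() {
  for svc in $1; do
    echo "docker compose -f nexus/compose/${svc}.yml up -d"
  done
}

# Deploy each tier in order; pipe through `bash` to execute for real
for tier in "$TIER_CRITICAL" "$TIER_HIGH" "$TIER_MEDIUM" "$TIER_LOW"; do
  deploy "$tier"
done
```

Keeping the tiers as plain lists makes it trivial to re-order services or pause between tiers to verify health before continuing.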
Scenario 3: Development Environment Clone
Use Case: Creating test environment for experimentation without affecting production
Downtime: None (production unaffected)
Procedure:
1. Clone Monorepo:
   - Create a separate branch for development changes
   - Modify compose files to use a different project name
   - Change exposed ports to avoid conflicts
2. Copy Data (Optional):
   - Clone production volumes for realistic testing
   - Sanitize sensitive data (API keys, passwords)
   - Use a subset of data for faster deployment
3. Isolated Network:
   - Use different Docker network names
   - Configure Traefik on a different port (e.g., 8080)
   - Update environment variables for the dev configuration
4. Testing Workflow:
   - Deploy changes to the development environment
   - Validate functionality with realistic data
   - Merge successful changes back to the main branch
   - Deploy to production using the standard process
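One way to express the project-name and port isolation is a compose override file. The filename and the port remapping below are illustrative assumptions, not HALO's actual dev configuration:

```yaml
# docker-compose.dev.yml -- hypothetical dev override
name: halo-dev              # distinct project name, no container collisions
services:
  traefik:
    ports:
      - "8080:80"           # remapped so production keeps port 80
```

Applied alongside the production file, e.g. `docker compose -f nexus/compose/traefik.yml -f docker-compose.dev.yml up -d`, an override changes only the keys it names and inherits everything else.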
Volume Management
Volume Backup Strategies
Named Volumes:
```bash
# Backup named volume to tar archive
docker run --rm -v postgres_data:/source -v $(pwd):/backup \
  ubuntu tar czf /backup/postgres_data_backup.tar.gz -C /source .

# Restore named volume from tar archive
docker run --rm -v postgres_data:/target -v $(pwd):/backup \
  ubuntu tar xzf /backup/postgres_data_backup.tar.gz -C /target
```
Bind Mounts:
```bash
# Backup bind mount using rsync
rsync -av --progress /mnt/storage/frigate/ /backup/frigate_$(date +%Y%m%d)/

# Restore bind mount
rsync -av --progress /backup/frigate_20251023/ /mnt/storage/frigate/
```
Volume Migration
Cross-Host Volume Transfer:
```bash
# On source host: create tar stream and pipe over SSH
docker run --rm -v postgres_data:/source ubuntu tar czf - -C /source . | \
  ssh user@new-host "docker run --rm -i -v postgres_data:/target ubuntu tar xzf - -C /target"
```
Volume Replication:
```bash
# Create volume snapshot at a specific point in time
docker run --rm -v source_vol:/source -v backup_vol:/backup \
  ubuntu cp -a /source/. /backup/
```
Database Migration
PostgreSQL Dump & Restore
Full Database Dump:
```bash
# Dump all databases plus global objects (roles, tablespaces)
docker exec postgres-db pg_dumpall -U postgres > halo_full_backup.sql

# Restore to new instance
cat halo_full_backup.sql | docker exec -i postgres-db psql -U postgres
```
Schema-Specific Dump:
```bash
# Dump a specific schema (e.g., n8n); append the database name if the
# schema lives outside the default database
docker exec postgres-db pg_dump -U postgres -n n8n > n8n_schema_backup.sql

# Restore the schema
cat n8n_schema_backup.sql | docker exec -i postgres-db psql -U postgres
```
Automated Backup Script:
```bash
#!/bin/bash
# Daily PostgreSQL backup with rotation
set -eu

BACKUP_DIR="/backup/postgres"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"

docker exec postgres-db pg_dumpall -U postgres | \
  gzip > "$BACKUP_DIR/halo_backup_$TIMESTAMP.sql.gz"

# Keep only the last 30 days
find "$BACKUP_DIR" -name "halo_backup_*.sql.gz" -mtime +30 -delete
```
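Before relying on an archive, it is worth confirming it is at least a readable gzip file. A sketch assuming the naming scheme of the rotation script above; the demo directory and sample archive exist only so the check has something to verify:

```shell
#!/usr/bin/env bash
# Cheap integrity check on the newest backup archive.
set -eu

BACKUP_DIR="${BACKUP_DIR:-/tmp/halo-backup-demo}"
mkdir -p "$BACKUP_DIR"

# Demo only: create a sample archive so the check has a file to test
echo "SELECT 1;" | gzip > "$BACKUP_DIR/halo_backup_20250101_000000.sql.gz"

# Pick the most recent archive and verify its gzip structure
latest="$(ls -t "$BACKUP_DIR"/halo_backup_*.sql.gz | head -n1)"
gzip -t "$latest" && echo "backup OK: $latest"
```

`gzip -t` only proves the compression layer is intact; a full restore test (next section's best practices) is still the only real proof the dump is usable.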
Configuration Portability
Environment Variables
Store environment-specific configuration in .env files (not committed to Git):
```bash
# nexus/env/prod.env
POSTGRES_PASSWORD=secure_password_here
N8N_ENCRYPTION_KEY=random_key_here
TRAEFIK_ACME_EMAIL=admin@example.com
```
For migration, generate new secrets on the target host or copy from secure storage.
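Generating fresh secrets can be a one-liner per key. The variable names follow the sample above; the key lengths are a choice, not a HALO requirement:

```shell
#!/usr/bin/env bash
# Generate new secrets for the target host's env file.
set -eu

POSTGRES_PASSWORD="$(openssl rand -hex 24)"   # 48 hex chars
N8N_ENCRYPTION_KEY="$(openssl rand -hex 32)"  # 64 hex chars

printf 'POSTGRES_PASSWORD=%s\nN8N_ENCRYPTION_KEY=%s\n' \
  "$POSTGRES_PASSWORD" "$N8N_ENCRYPTION_KEY" > /tmp/prod.env.new
```

One caution: if you restore an existing n8n database, copy the original `N8N_ENCRYPTION_KEY` rather than regenerating it, since n8n encrypts stored credentials with that key and a new key makes them unreadable.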
Compose File Portability
Use variables for host-specific paths:
```yaml
# nexus/compose/frigate.yml
volumes:
  - ${FRIGATE_MEDIA_PATH:-/mnt/storage/frigate}:/media/frigate
```
Set `FRIGATE_MEDIA_PATH` in the environment file for each host.
Git-Tracked Configuration
All service configuration is version-controlled:
- Home Assistant: `configs/home-assistant/`
- Frigate: `configs/frigate/`
- Zigbee2MQTT: `configs/zigbee2mqtt/`
- Node-RED: `configs/node-red/`
Simply clone the repository on the new host and configurations are ready.
Hardware Replacement Workflow
Pre-Migration Checklist
- Verify backups are current (< 24 hours old)
- Document current network configuration (static IPs, DNS)
- List all USB devices and their container assignments
- Export Traefik certificates (if using Let’s Encrypt)
- Note current Docker version and configuration
- Backup `.env` files containing secrets
- Create restore procedure checklist
Migration Day Workflow
T-0 (Start):
- Announce maintenance window to household
- Create final backup snapshot
- Gracefully stop all containers
T+30min:
- Power down old Nexus host
- Install and configure new host OS
- Install Docker and dependencies
T+1hr:
- Clone monorepo and restore configurations
- Create Docker networks
- Restore PostgreSQL data
T+2hr:
- Deploy core services (Traefik, PostgreSQL, Mosquitto)
- Deploy Home Assistant and verify device communication
- Deploy Frigate and reconnect cameras
T+3hr:
- Deploy automation services (n8n, Node-RED)
- Deploy monitoring (Grafana, Watchtower)
- Restore Omnia API
T+3.5hr:
- Comprehensive validation testing
- Verify automation execution
- Check dashboard functionality
T+4hr:
- End maintenance window
- Monitor for 24 hours for issues
- Update documentation with migration notes
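The T-0 graceful stop is worth ordering so dependents shut down before the services they rely on. A dry-run sketch; the ordering below is an assumption about this stack's dependencies, with core services (broker, database, proxy) last:

```shell
#!/usr/bin/env bash
# Dry-run sketch: stop dependents first, core services last.
# The order is an assumed dependency ordering, not HALO's canonical one.
set -eu

STOP_ORDER="n8n node-red frigate zigbee2mqtt home-assistant mosquitto postgres-db traefik"

for c in $STOP_ORDER; do
  # --time gives each container up to 60s to exit cleanly before SIGKILL
  echo "docker stop --time 60 ${c}"
done
```

Printed rather than executed so the order can be reviewed first; on the live host, pipe the output through `bash`.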
Post-Migration Validation
Service Health:
- All containers report healthy status
- No crash loops or restart cycles
- Logs show normal operation
Data Integrity:
- Home Assistant device history intact
- Frigate recordings accessible
- n8n workflows execute successfully
- PostgreSQL data consistent
Network Connectivity:
- Traefik routing functional
- TLS certificates valid
- External access working (if configured)
- Inter-service communication operational
Functionality:
- Automations trigger correctly
- Devices respond to commands
- Camera detection events appear
- Dashboard loads and operates normally
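The service-health portion of this checklist can be swept in one loop. A dry-run sketch; the container names are assumptions, and the check only covers containers that define a healthcheck:

```shell
#!/usr/bin/env bash
# Dry-run sketch of a post-migration health sweep.
# Container names are assumptions.
set -eu

CONTAINERS="traefik postgres-db home-assistant mosquitto frigate n8n"

health_cmds() {
  for c in $CONTAINERS; do
    # Prints "healthy" on the live host when the container's healthcheck passes
    echo "docker inspect --format '{{.State.Health.Status}}' ${c}"
  done
}

health_cmds   # pipe through `bash` on the live host to execute
```

Data-integrity and functionality checks (history, recordings, automations) still need manual verification; this only confirms the containers themselves came up healthy.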
Automation & Tooling
Backup Automation
Create systemd timer for automated backups:
```ini
# /etc/systemd/system/halo-backup.service
[Unit]
Description=HALO Backup Service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/halo-backup.sh
```

```ini
# /etc/systemd/system/halo-backup.timer
[Unit]
Description=HALO Daily Backup

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```
Migration Scripts
Develop reusable migration scripts in `nexus/scripts/`:
- `backup.ps1` - Full system backup
- `restore.ps1` - Full system restore
- `migrate.ps1` - Orchestrated migration workflow
- `validate.ps1` - Post-migration validation
Best Practices
- Regular Backup Testing: Periodically restore backups to verify integrity and practice procedures
- Documentation: Maintain runbooks for each migration scenario with step-by-step instructions
- Incremental Migration: Test migration with non-critical services first before moving production
- Rollback Plan: Always have a rollback strategy before starting migration
- Change Control: Track migrations in version control, note what changed and why
- Communication: Notify household members of maintenance windows and expected functionality impact
Related Documentation
- Nexus Hardware - Host specifications
- Container Reference - Service inventory
- Data & Backups - Backup strategies