If you don’t have backups, you don’t have data. Period.
Your production database just got corrupted. The last backup was three days ago. You’re about to explain to customers why their data is gone forever while you scramble through corrupted log files, hoping to recover something.
Your database will fail. Hardware dies, humans make mistakes, software has bugs, and ransomware exists. The question isn’t if you’ll need backups - it’s whether you’ll have them when disaster strikes at midnight on a Sunday.
The Reality of Database Disasters
Common failure scenarios:
- Disk corruption (hardware failure)
- Accidental DROP TABLE or DELETE without WHERE
- Application bug corrupting data
- Ransomware targeting databases
- Failed migration destroying production data
Backup Strategy Breakdown
Level 1: Basic Automated Dumps
Good for: Small databases (< 10GB) where losing up to a day of data is acceptable
#!/bin/bash
DB_NAME="myapp"DB_USER="myapp"BACKUP_DIR="/var/backups/postgres"DATE=$(date +%Y%m%d_%H%M%S)RETENTION_DAYS=30
mkdir -p $BACKUP_DIR
# PostgreSQL backup
docker compose exec -T postgres pg_dump -U $DB_USER -Fc $DB_NAME > "$BACKUP_DIR/${DB_NAME}_$DATE.dump"
# Cleanup old backups
find $BACKUP_DIR -name "${DB_NAME}_*.dump" -mtime +$RETENTION_DAYS -delete
echo "Backup completed: ${DB_NAME}_$DATE.dump"Schedule it:
Schedule it:

# Run daily at 2 AM
echo "0 2 * * * /usr/local/bin/backup_postgres.sh >> /var/log/backup.log 2>&1" | sudo crontab -

Level 2: WAL Archiving (Point-in-Time Recovery)
Good for: Critical databases that need recovery to a specific timestamp
Note: This guide uses PostgreSQL 12+ configuration. For PostgreSQL 11 and earlier, recovery configuration differs (uses recovery.conf instead of recovery.signal).
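If you’re not sure which version you’re running, check it first (using the same docker compose setup as the rest of this guide):

# Show the server version and the current WAL level
docker compose exec postgres psql -U postgres -c "SHOW server_version;"
docker compose exec postgres psql -U postgres -c "SHOW wal_level;"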
PostgreSQL WAL setup in postgresql.conf:
# WAL settings for PITR
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /var/backups/postgres/wal/%f && cp %p /var/backups/postgres/wal/%f'
max_wal_size = 2GB
checkpoint_completion_target = 0.9
wal_keep_size = 1GB

Create WAL archive directory:
sudo mkdir -p /var/backups/postgres/wal
sudo chown postgres:postgres /var/backups/postgres/wal
sudo chmod 700 /var/backups/postgres/wal
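Changing wal_level and archive_mode requires a restart (archive_command alone only needs a reload), and the archive path must be visible to the PostgreSQL process - with Docker that means mounting it into the container. A rough way to apply the settings and confirm WAL files are actually being archived:

# Restart PostgreSQL to pick up wal_level/archive_mode
docker compose restart postgres

# Force a WAL segment switch, then check the archiver statistics
docker compose exec postgres psql -U postgres -c "SELECT pg_switch_wal();"
docker compose exec postgres psql -U postgres -c "SELECT archived_count, failed_count, last_archived_wal FROM pg_stat_archiver;"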
Base backup script:

#!/bin/bash
BACKUP_DIR="/var/backups/postgres/base"DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p $BACKUP_DIR
# Create base backup, streamed to stdout as a gzip-compressed tar
# -D - writes to stdout, -Ft selects tar format, -z enables gzip compression
# -X fetch bundles the required WAL into the backup (WAL streaming is not possible when writing to stdout)
# -P shows progress
docker compose exec -T postgres pg_basebackup -U postgres -D - -Ft -z -X fetch -P > "$BACKUP_DIR/base_backup_$DATE.tar.gz"
echo "Base backup completed: base_backup_$DATE.tar.gz"PITR Recovery process:
PITR Recovery process:

# Stop PostgreSQL
docker compose stop postgres
# Backup current data directory (optional but recommended)
mv /var/lib/postgresql/data /var/lib/postgresql/data.old
# Create new data directory
mkdir -p /var/lib/postgresql/data
cd /var/lib/postgresql/data
# Extract the base backup (a gzip-compressed tar of the data directory)
tar -xzf /var/backups/postgres/base/base_backup_20251106_020000.tar.gz
# Remove any WAL files from the backup (they're obsolete)
rm -rf pg_wal/*
# Configure recovery settings in postgresql.conf
cat >> postgresql.conf << EOF
restore_command = 'cp /var/backups/postgres/wal/%f %p'
recovery_target_time = '2025-11-06 02:00:00'
# Promote automatically once the target is reached (the default is to pause)
recovery_target_action = 'promote'
EOF
# Create recovery.signal file to trigger recovery mode
touch recovery.signal
# Fix ownership (if running as root)
chown -R postgres:postgres /var/lib/postgresql/data
# Start PostgreSQL - it will recover to the specified time
docker compose start postgres
# After recovery completes, PostgreSQL will automatically:
# - Remove recovery.signal file
# - Promote to normal operation mode
# - Create a new timeline
# Check recovery status
docker compose exec postgres psql -U postgres -c "SELECT pg_is_in_recovery();"

Level 3: Continuous Backup with pgBackRest
Good for: Production databases, automated PITR, multiple retention policies
Install pgBackRest:
sudo apt install pgbackrest -y

Configure /etc/pgbackrest.conf:
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=3
repo1-retention-diff=3
process-max=4
[myapp]
pg1-path=/var/lib/postgresql/data
pg1-port=5432
pg1-user=postgres

PostgreSQL configuration:
# Add to postgresql.conf
archive_command = 'pgbackrest --stanza=myapp archive-push %p'
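Unlike archive_mode, a new archive_command takes effect on reload - no restart needed:

# Reload the configuration so the new archive_command is picked up
docker compose exec postgres psql -U postgres -c "SELECT pg_reload_conf();"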
Setup and run:

# Create stanza
sudo -u postgres pgbackrest --stanza=myapp stanza-create
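Before trusting the setup, let pgBackRest verify that archiving and the repository actually work:

# Verify archive_command, repository access, and stanza configuration
sudo -u postgres pgbackrest --stanza=myapp check

# List existing backups (useful once the first backup has run)
sudo -u postgres pgbackrest --stanza=myapp info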
# Full backup (run weekly)
sudo -u postgres pgbackrest --stanza=myapp backup --type=full
# Differential backup (run daily)
sudo -u postgres pgbackrest --stanza=myapp backup --type=diff
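"Weekly" and "daily" usually mean cron. A sketch that schedules both under the postgres user (the times are arbitrary):

# Full backup Sunday at 1 AM, differential backups Monday-Saturday at 1 AM
(sudo crontab -u postgres -l 2>/dev/null; cat << 'EOF'
0 1 * * 0 pgbackrest --stanza=myapp backup --type=full
0 1 * * 1-6 pgbackrest --stanza=myapp backup --type=diff
EOF
) | sudo crontab -u postgres -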
# Point-in-time recovery
sudo -u postgres pgbackrest --stanza=myapp --type=time "--target=2025-10-30 14:30:00" restore

Cloud Storage Integration
S3-Compatible Storage
Setup rclone for cloud backups:
# Install rclone
sudo apt install rclone -y
# Configure (interactive setup)
rclone config
# Test connection
rclone ls mycloud:mybucket

Enhanced backup script with cloud sync:
#!/bin/bash
LOCAL_BACKUP_DIR="/var/backups/postgres"
CLOUD_REMOTE="mycloud:postgres-backups"
DATE=$(date +%Y%m%d_%H%M%S)
# Create local backup
docker compose exec -T postgres pg_dump -U myapp -Fc myapp > "$LOCAL_BACKUP_DIR/myapp_$DATE.dump"
# Sync to cloud storage
rclone copy $LOCAL_BACKUP_DIR $CLOUD_REMOTE --include "*.dump" --transfers=4
# Cleanup local files older than 7 days
find $LOCAL_BACKUP_DIR -name "*.dump" -mtime +7 -delete
# Cleanup cloud files older than 90 days
rclone delete $CLOUD_REMOTE --min-age 90d
echo "Backup synced to cloud: myapp_$DATE.dump"Hetzner Storage Box Integration
Hetzner Storage Box Integration

Mount Storage Box:
# Install dependencies
sudo apt install cifs-utils -y
# Create credentials file
sudo mkdir -p /etc/cifs
echo "username=your-username" | sudo tee /etc/cifs/hetzner-credentials
echo "password=your-password" | sudo tee -a /etc/cifs/hetzner-credentials
sudo chmod 600 /etc/cifs/hetzner-credentials
# Mount storage box
sudo mkdir /mnt/backup
echo "//your-username.your-server.de/backup /mnt/backup cifs credentials=/etc/cifs/hetzner-credentials,uid=1000,gid=1000,iocharset=utf8 0 0" | sudo tee -a /etc/fstab
sudo mount -a

Direct backup to Storage Box:
#!/bin/bash
# Direct backup to Hetzner Storage Box
BACKUP_DIR="/mnt/backup/postgres"DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p $BACKUP_DIR
# Backup directly to storage box
docker compose exec -T postgres pg_dump -U myapp -Fc myapp > "$BACKUP_DIR/myapp_$DATE.dump"
# Keep 90 days on storage box
find $BACKUP_DIR -name "*.dump" -mtime +90 -delete
echo "Backup completed to Storage Box: myapp_$DATE.dump"Death is hard enough. Accessing accounts shouldn't be.
When someone dies, you don't get even one extra second to access the documents and information they meant to share it with you. Trying to fix this problem with Eternal Vault.
Monitoring and Alerting
Backup Health Check
#!/bin/bash
BACKUP_DIR="/var/backups/postgres"
MAX_AGE_HOURS=26  # Alert if backup older than 26 hours
ALERT_EMAIL="admin@example.com"  # Placeholder - set to your alert address
# Find latest backup
LATEST_BACKUP=$(find $BACKUP_DIR -name "*.dump" -type f -printf '%T@ %p\n' | sort -n | tail -1 | cut -d' ' -f2-)
BACKUP_AGE_HOURS=$((($(date +%s) - $(stat -c %Y "$LATEST_BACKUP")) / 3600))
if [ $BACKUP_AGE_HOURS -gt $MAX_AGE_HOURS ]; then
  echo "ALERT: Latest backup is $BACKUP_AGE_HOURS hours old" | mail -s "Backup Alert" $ALERT_EMAIL
  exit 1
fi
# Test backup integrity (runs on host, not in container)
if pg_restore --list "$LATEST_BACKUP" > /dev/null 2>&1; then
  echo "Backup integrity check passed"
else
  echo "ALERT: Backup integrity check failed" | mail -s "Backup Corruption Alert" $ALERT_EMAIL
  exit 1
fi
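The script assumes a working mail setup. If you’d rather push alerts to a chat webhook, a minimal variant (the URL is a placeholder for your Slack/ntfy/other endpoint):

# Hypothetical webhook endpoint - replace with your own
WEBHOOK_URL="https://hooks.example.com/backup-alerts"
curl -fsS -X POST -H "Content-Type: application/json" \
  -d "{\"text\": \"ALERT: latest backup is $BACKUP_AGE_HOURS hours old\"}" \
  "$WEBHOOK_URL"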
Schedule health checks:

# Run every 4 hours
echo "0 */4 * * * /usr/local/bin/check_backup_health.sh" | crontab -

Recovery Testing
Automated Recovery Testing
#!/bin/bash
TEST_DB="myapp_recovery_test"
BACKUP_FILE="/var/backups/postgres/myapp_20251106_020000.dump"
BACKUP_FILENAME=$(basename "$BACKUP_FILE")
# Create test database
docker compose exec -T postgres createdb -U postgres $TEST_DB
# Copy backup file into container
docker cp "$BACKUP_FILE" $(docker compose ps -q postgres):/tmp/$BACKUP_FILENAME
# Restore backup to test database
docker compose exec -T postgres pg_restore -U postgres -d $TEST_DB "/tmp/$BACKUP_FILENAME"
# Run basic integrity checks
RECORD_COUNT=$(docker compose exec -T postgres psql -U postgres -d $TEST_DB -t -c "SELECT COUNT(*) FROM users;")
if [ "$RECORD_COUNT" -gt 0 ]; then
  echo "Recovery test passed: $RECORD_COUNT records restored"
else
  echo "Recovery test failed: No records found"
fi
# Cleanup test database and temp file
docker compose exec -T postgres dropdb -U postgres $TEST_DB
docker compose exec -T postgres rm /tmp/$BACKUP_FILENAME
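To make the monthly test a habit rather than an intention, schedule it - assuming the script above is saved as /usr/local/bin/test_recovery.sh (an example path):

# Run the recovery test on the 1st of every month at 3 AM
(crontab -l 2>/dev/null; echo "0 3 1 * * /usr/local/bin/test_recovery.sh >> /var/log/recovery_test.log 2>&1") | crontab -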
What Could Go Wrong

Common Backup Failures
Backup script permissions:
# Fix permissions
sudo chown postgres:postgres /var/backups/postgres
sudo chmod 755 /var/backups/postgres

Insufficient disk space:
# Monitor backup directory size
df -h /var/backups
du -sh /var/backups/postgres/*
# Compress old backups
find /var/backups -name "*.dump" -mtime +7 -exec gzip {} \;

Network timeouts to cloud storage:
# Add retry logic to rclone
rclone copy $LOCAL_BACKUP_DIR $CLOUD_REMOTE --retries 3 --low-level-retries 10

Recovery Troubleshooting
Corrupted backup file:
# Test backup integrity before recovery
pg_restore --list backup.dump | head -10

Missing WAL files:
# Check WAL archive directory
ls -la /var/backups/postgres/wal/
# Verify archive_command is working
docker compose exec postgres psql -U postgres -c "SELECT pg_switch_wal();"

Quick Checks
Setup time: 2-4 hours for a comprehensive backup strategy
Storage costs (monthly):
- Local: 50GB = included in VPS storage (CAX11 has 40GB, CAX21 has 80GB)
- Hetzner Storage Box: 100GB = $3.20/month
- AWS S3: 100GB = $2.30/month + transfer costs
Critical decision: Point-in-time recovery adds complexity but saves hours during actual disasters. Worth it for any database you can’t afford to lose.
Most production issues come from:
- No backup testing - backups exist but are corrupted
- Missing WAL files - gaps in point-in-time recovery chain
- Slow recovery - no documented recovery procedures
Test your recovery process monthly. When disaster strikes, you want muscle memory, not learning.