If you don't have backups, you don't have data. Period.
Your production database just got corrupted. The last backup was 3 days ago. You're about to explain to customers why their data is gone forever while you scramble through corrupted log files hoping to recover something.
Reality check: your database will fail. Hardware dies, humans make mistakes, software has bugs, and ransomware exists. The question isn't if you'll need backups - it's whether you'll have them when disaster strikes at 3 AM on a Sunday.
The Reality of Database Disasters
Common failure scenarios:
- Disk corruption (hardware failure)
- Accidental `DROP TABLE` or `DELETE` without a `WHERE` clause
- Application bug corrupting data
- Ransomware targeting databases
- Failed migration destroying production data
Recovery time objectives:
- Hobby project: 1-2 days acceptable
- Side business: 4-6 hours maximum
- Production app: 15-30 minutes target
Backup Strategy Breakdown
Level 1: Basic Automated Dumps
Good for: Small databases (< 10GB) where losing up to a day of data is acceptable
```bash
#!/bin/bash
DB_NAME="myapp"
DB_USER="myapp"
BACKUP_DIR="/var/backups/postgres"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30

mkdir -p "$BACKUP_DIR"

# PostgreSQL backup
docker compose exec -T postgres pg_dump -U "$DB_USER" -Fc "$DB_NAME" > "$BACKUP_DIR/${DB_NAME}_$DATE.dump"

# Cleanup old backups
find "$BACKUP_DIR" -name "${DB_NAME}_*.dump" -mtime +"$RETENTION_DAYS" -delete

echo "Backup completed: ${DB_NAME}_$DATE.dump"
```
Schedule it:

```bash
# Run daily at 2 AM (note: piping to `crontab -` replaces root's existing
# crontab; use `sudo crontab -e` instead if you already have entries)
echo "0 2 * * * /usr/local/bin/backup_postgres.sh >> /var/log/backup.log 2>&1" | sudo crontab -
```

Level 2: WAL Archiving (Point-in-Time Recovery)
Good for: Critical databases that need recovery to a specific timestamp
PostgreSQL WAL setup in postgresql.conf:
```ini
# WAL settings for PITR
wal_level = replica
archive_mode = on
# With Docker, this path must be a volume shared between container and host
archive_command = 'cp %p /var/backups/postgres/wal/%f'
max_wal_size = 2GB
checkpoint_completion_target = 0.9
wal_keep_size = 1GB
```
Create WAL archive directory:

```bash
sudo mkdir -p /var/backups/postgres/wal
sudo chown postgres:postgres /var/backups/postgres/wal
sudo chmod 700 /var/backups/postgres/wal
```
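After restarting PostgreSQL with these settings, it's worth confirming that WAL segments are actually landing in the archive before relying on it; a quick check against the pg_stat_archiver view, assuming the compose service is named postgres:

```bash
# Force a WAL segment switch, then inspect the archiver's counters
docker compose exec -T postgres psql -U postgres -c "SELECT pg_switch_wal();"
docker compose exec -T postgres psql -U postgres \
  -c "SELECT archived_count, last_archived_wal, failed_count FROM pg_stat_archiver;"
```

A non-zero failed_count usually means the archive path isn't writable from inside the container.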
Base backup script:

```bash
#!/bin/bash
BACKUP_DIR="/var/backups/postgres/base"
DATE=$(date +%Y%m%d_%H%M%S)

mkdir -p "$BACKUP_DIR"

# Create base backup (-Ft -z already produces gzipped tar files,
# so no second compression pass is needed)
docker compose exec -T postgres pg_basebackup -U postgres -D /tmp/backup_$DATE -Ft -z -P

# Copy the backup out of the container, then clean up
docker cp "$(docker compose ps -q postgres):/tmp/backup_$DATE" "$BACKUP_DIR/base_backup_$DATE"
docker compose exec -T postgres rm -rf /tmp/backup_$DATE

echo "Base backup completed: base_backup_$DATE"
```
PITR Recovery process:

```bash
# Stop PostgreSQL
docker compose stop postgres

# Restore the base backup into an empty data directory
# (assumes the data volume is bind-mounted at ./pgdata - adjust to your setup)
sudo rm -rf ./pgdata/*
sudo tar -xzf /var/backups/postgres/base/base_backup_20241030_020000/base.tar.gz -C ./pgdata
sudo tar -xzf /var/backups/postgres/base/base_backup_20241030_020000/pg_wal.tar.gz -C ./pgdata/pg_wal

# Configure recovery (PostgreSQL 12+: recovery.conf no longer exists - use
# postgresql.conf settings plus a recovery.signal file instead)
sudo tee -a ./pgdata/postgresql.conf << EOF
restore_command = 'cp /var/backups/postgres/wal/%f %p'
recovery_target_time = '2024-10-30 14:30:00'
recovery_target_action = 'promote'
EOF
sudo touch ./pgdata/recovery.signal

# Start PostgreSQL - it will replay WAL up to the specified time
docker compose start postgres
```
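Recovery can take a while on large databases; you can watch progress and confirm completion like this (a small sketch, same assumptions as above):

```bash
# Follow the recovery log output
docker compose logs -f postgres

# Returns "f" once the server has finished recovery and promoted
docker compose exec -T postgres psql -U postgres -tAc "SELECT pg_is_in_recovery();"
```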
Level 3: Continuous Backup with pgBackRest

Good for: Production databases, automated PITR, multiple retention policies
Install pgBackRest:
```bash
sudo apt install pgbackrest -y
```

Configure /etc/pgbackrest.conf:
```ini
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=3
repo1-retention-diff=3
process-max=4

[myapp]
pg1-path=/var/lib/postgresql/data
pg1-port=5432
pg1-user=postgres
```

PostgreSQL configuration:
```ini
# Add to postgresql.conf
archive_command = 'pgbackrest --stanza=myapp archive-push %p'
```

Setup and run:
```bash
# Create stanza
sudo -u postgres pgbackrest --stanza=myapp stanza-create

# Full backup (run weekly)
sudo -u postgres pgbackrest --stanza=myapp backup --type=full

# Differential backup (run daily)
sudo -u postgres pgbackrest --stanza=myapp backup --type=diff
```
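To keep those running unattended, a sketch of cron entries for the postgres user (times are placeholders; add them via `sudo -u postgres crontab -e`):

```bash
# Full backup Sundays at 1 AM, differential backups the other days
0 1 * * 0 pgbackrest --stanza=myapp backup --type=full
0 1 * * 1-6 pgbackrest --stanza=myapp backup --type=diff
```

`pgbackrest --stanza=myapp info` shows what the repository currently holds and when each backup ran.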
```bash
# Point-in-time recovery
sudo -u postgres pgbackrest --stanza=myapp restore --type=time --target="2024-10-30 14:30:00"
```

Cloud Storage Integration
S3-Compatible Storage
Setup rclone for cloud backups:
```bash
# Install rclone
sudo apt install rclone -y

# Configure (interactive setup)
rclone config

# Test connection
rclone ls mycloud:mybucket
```

Enhanced backup script with cloud sync:
```bash
#!/bin/bash
LOCAL_BACKUP_DIR="/var/backups/postgres"
CLOUD_REMOTE="mycloud:postgres-backups"
DATE=$(date +%Y%m%d_%H%M%S)

# Create local backup
docker compose exec -T postgres pg_dump -U myapp -Fc myapp > "$LOCAL_BACKUP_DIR/myapp_$DATE.dump"

# Sync to cloud storage
rclone copy "$LOCAL_BACKUP_DIR" "$CLOUD_REMOTE" --include "*.dump" --transfers=4

# Cleanup local files older than 7 days
find "$LOCAL_BACKUP_DIR" -name "*.dump" -mtime +7 -delete

# Cleanup cloud files older than 90 days
rclone delete "$CLOUD_REMOTE" --min-age 90d

echo "Backup synced to cloud: myapp_$DATE.dump"
```
Hetzner Storage Box Integration

Mount Storage Box:
```bash
# Install dependencies
sudo apt install cifs-utils -y

# Create credentials file
sudo mkdir -p /etc/cifs
echo "username=your-username" | sudo tee /etc/cifs/hetzner-credentials
echo "password=your-password" | sudo tee -a /etc/cifs/hetzner-credentials
sudo chmod 600 /etc/cifs/hetzner-credentials

# Mount storage box
sudo mkdir -p /mnt/backup
echo "//your-username.your-server.de/backup /mnt/backup cifs credentials=/etc/cifs/hetzner-credentials,uid=1000,gid=1000,iocharset=utf8 0 0" | sudo tee -a /etc/fstab
sudo mount -a
```
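Before pointing backups at the mount, a quick sanity check that the share is actually mounted and writable saves debugging silent failures later:

```bash
# Confirm the CIFS share is mounted and writable
df -h /mnt/backup
touch /mnt/backup/.write_test && rm /mnt/backup/.write_test && echo "mount is writable"
```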
Direct backup to Storage Box:

```bash
#!/bin/bash
# Direct backup to Hetzner Storage Box
BACKUP_DIR="/mnt/backup/postgres"
DATE=$(date +%Y%m%d_%H%M%S)

mkdir -p "$BACKUP_DIR"

# Backup directly to storage box
docker compose exec -T postgres pg_dump -U myapp -Fc myapp > "$BACKUP_DIR/myapp_$DATE.dump"

# Keep 90 days on storage box
find "$BACKUP_DIR" -name "*.dump" -mtime +90 -delete

echo "Backup completed to Storage Box: myapp_$DATE.dump"
```
Monitoring and Alerting
Backup Health Check
```bash
#!/bin/bash
BACKUP_DIR="/var/backups/postgres"
MAX_AGE_HOURS=26  # Alert if backup older than 26 hours
ALERT_EMAIL="admin@example.com"  # set your alert address

# Find latest backup
LATEST_BACKUP=$(find "$BACKUP_DIR" -name "*.dump" -type f -printf '%T@ %p\n' | sort -n | tail -1 | cut -d' ' -f2-)
BACKUP_AGE_HOURS=$(( ($(date +%s) - $(stat -c %Y "$LATEST_BACKUP")) / 3600 ))

if [ "$BACKUP_AGE_HOURS" -gt "$MAX_AGE_HOURS" ]; then
    echo "ALERT: Latest backup is $BACKUP_AGE_HOURS hours old" | mail -s "Backup Alert" "$ALERT_EMAIL"
    exit 1
fi

# Test backup integrity - pipe the dump in, since the host path
# isn't visible inside the container
if docker compose exec -T postgres pg_restore --list > /dev/null 2>&1 < "$LATEST_BACKUP"; then
    echo "Backup integrity check passed"
else
    echo "ALERT: Backup integrity check failed" | mail -s "Backup Corruption Alert" "$ALERT_EMAIL"
    exit 1
fi
```
Schedule health checks:

```bash
# Run every 4 hours (again, `crontab -` replaces the current user's crontab)
echo "0 */4 * * * /usr/local/bin/check_backup_health.sh" | crontab -
```

Recovery Testing
Automated Recovery Testing
```bash
#!/bin/bash
TEST_DB="myapp_recovery_test"
BACKUP_FILE="/var/backups/postgres/myapp_20241030_020000.dump"

# Create test database
docker compose exec -T postgres createdb -U postgres $TEST_DB

# Restore backup to test database (piped in, since the host path
# isn't visible inside the container)
docker compose exec -T postgres pg_restore -U postgres -d $TEST_DB < "$BACKUP_FILE"

# Run basic integrity checks (-tA gives a bare, unpadded count)
RECORD_COUNT=$(docker compose exec -T postgres psql -U postgres -d $TEST_DB -tA -c "SELECT COUNT(*) FROM users;")
if [ "$RECORD_COUNT" -gt 0 ]; then
    echo "Recovery test passed: $RECORD_COUNT records restored"
else
    echo "Recovery test failed: No records found"
fi

# Cleanup test database
docker compose exec -T postgres dropdb -U postgres $TEST_DB
```

What Could Go Wrong
Common Backup Failures
Backup script permissions:
```bash
# Fix permissions
sudo chown postgres:postgres /var/backups/postgres
sudo chmod 755 /var/backups/postgres
```

Insufficient disk space:
```bash
# Monitor backup directory size
df -h /var/backups
du -sh /var/backups/postgres/*

# Compress old backups
find /var/backups -name "*.dump" -mtime +7 -exec gzip {} \;
```

Network timeouts to cloud storage:
```bash
# Add retry logic to rclone
rclone copy "$LOCAL_BACKUP_DIR" "$CLOUD_REMOTE" --retries 3 --low-level-retries 10
```

Recovery Troubleshooting
Corrupted backup file:
```bash
# Test backup integrity before recovery
pg_restore --list backup.dump | head -10
```

Missing WAL files:
```bash
# Check WAL archive directory
ls -la /var/backups/postgres/wal/

# Verify archive_command is working
docker compose exec postgres psql -U postgres -c "SELECT pg_switch_wal();"
```

Reality Check
Setup time: 2-4 hours for a comprehensive backup strategy
Storage costs (monthly):
- Local: included in existing VPS storage (CAX11 has 40GB, CAX21 has 80GB)
- Hetzner Storage Box: 100GB = $3.20/month
- AWS S3: 100GB = $2.30/month + transfer costs
Critical decision: Point-in-time recovery adds complexity but saves hours during actual disasters. Worth it for any database you can't afford to lose.
Most production issues come from:
- No backup testing - backups exist but are corrupted
- Missing WAL files - gaps in point-in-time recovery chain
- Slow recovery - no documented recovery procedures
Test your recovery process monthly. When disaster strikes, you want muscle memory, not learning.
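A minimal way to enforce that habit, assuming the recovery-test script above is saved at /usr/local/bin/test_recovery.sh (a hypothetical path):

```bash
# Add to root's crontab (sudo crontab -e):
# run the recovery test at 3 AM on the first of each month
0 3 1 * * /usr/local/bin/test_recovery.sh >> /var/log/recovery_test.log 2>&1
```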