3 simple steps to create a local backup of Postgres database using Bash script

Published on

Introduction

So just like me, you must also be having some databases running on the cloud for which you want to create backup snapshots periodically.

Most of the cloud providers would already have some service (paid or free) for having automatic backups, but today we will learn how to create local backups of your Postgres database using a Bash script.

I am using this script myself for taking backups from Postgres Database hosted on Supabase for some self-hosted services that I use. So let's get started.

Only here for the script? Skip to the final script section

Bash variables

Let's first check what are the few variables you need to run the script and register them.

  1. BACKUP_DIR - The directory where your backups would be stored
  2. DB_HOST - Database hostname, this would be the hostname that you would get from your cloud instance, also you can set a local database hostname as well if you want to create a backup for the local database.
  3. DB_PASSWORD - Database password
  4. DB_PORT - Ideally for the Postgres database, it would be 5432.
  5. APP_NAME - Application name, we will use this to append in the backup file name.
  6. tables - List of tables you want to backup.

We will have a DATE variable as well which again would be used in the filename.

So your initial bash script should start like this

#!/bin/bash

DATE=$(date +%d-%m-%Y)
BACKUP_DIR="/backups"

# Ideally you would want to keep these variables outside of the script
# and add them to your .bashenv/.zshenv file
DB_HOST=<remote-hostname or local-hostname>
DB_PASSWORD=securepassword
DB_NAME=postgres
DB_PORT=5432
APP_NAME=myawesomeapp

# List of tables you want to backup
tables=("table1" "table2" "table3")

Backup database

We will use the pg_dump utility for creating a backup of the database.

We can dump the entire database in a single .sql file however I like to create a separate file per table.

Personal Preference

The reason I prefer to do this way is that for my application I like to do some post-processing on the insert statements because of which the database restoration becomes 50x faster on large datasets. More details on this can be part of another blog.

So here we would now loop over all the tables we have mentioned before and start dumping our database into a local file.

The pg_dump command would look like this

pg_dump -d postgresql://postgres:$DB_PASSWORD@$DB_HOST:$DB_PORT/$DB_NAME --column-inserts --data-only --table=tablename

To break down the command, let's go through the set of flags we are using.

  1. -d - the database we want to connect.
  2. --column-inserts - This would dump the data as INSERT commands with explicit column names like INSERT INTO table (column, ...) VALUES .... This will make restoration very slow. it is mainly useful for making dumps that can be loaded into non-PostgreSQL databases. Any error during restoring will cause only rows that are part of the problematic INSERT to be lost, rather than the entire table contents.
  3. --data-only - Dump only the data, not the schema (data definitions). Table data, large objects, and sequence values are dumped.
  4. --table - Dump only tables with names matching patterns.

These are the flags I use, feel free to add/remove any based on your preferences.

Caution

As quoted here "This will make restoration very slow" for --column-inserts, this indeed would be the case because the dump would have multiple INSERT statements which would be very slow to restore. There are multiple approaches using which you can speed up the restoration. However, you can skip this flag if you don't want your database dump to be in the format of column inserts.

So this would allow us to a backup single table, but what we want is to backup multiple tables and store them in a separate file, for this we would loop over the tables variable and redirect the output from stdout to a file.

The updated bash script would look like this

#!/bin/bash

DATE=$(date +%d-%m-%Y)
BACKUP_DIR="/backups"

# Ideally you would want to keep these variables outside of the script
# and add them to your .bashenv/.zshenv file
DB_HOST=<remote-hostname or local-hostname>
DB_PASSWORD=securepassword
DB_NAME=postgres
DB_PORT=5432
APP_NAME=myawesomeapp

# List of tables you want to backup
tables=("table1" "table2" "table3")

# Go through each table and pull the records
for table in ${tables[@]};
do
	echo "Downloading records for the table: $table"
	# Run pg_dump command against each table and pull the records
	pg_dump -d postgresql://postgres:$DB_PASSWORD@$DB_HOST:$DB_PORT/$DB_NAME --column-inserts --data-only --table="$table" > $BACKUP_DIR/$APP_NAME-$table-$(date +%d-%m-%Y).sql

	echo "Downloading finished for the table: $table"
done

Purge old backups

This is an optional step, but what we can do additionally is to remove old backups which are older than let's say 15 or 30 days and purge them whenever we run our script.

# To delete files older than 30 days
find $BACKUP_DIR/* -mtime +30 -exec rm {} \;

Caution

Always verify the commands which remove any data before running them, same goes for here as well.

Final Bash Script

After merging all the steps we saw, our final bash script should look like this.

db-backup.sh
#!/bin/bash

DATE=$(date +%d-%m-%Y)
BACKUP_DIR="/backups"

# Ideally you would want to keep these variables outside of the script
# and add them to your .bashenv/.zshenv file
DB_HOST=<remote-hostname or local-hostname>
DB_PASSWORD=securepassword
DB_NAME=postgres
DB_PORT=5432
APP_NAME=myawesomeapp

# List of tables you want to backup
tables=("table1" "table2" "table3")

# Go through each table and pull the records
for table in ${tables[@]};
do
	echo "Downloading records for the table: $table"
	# Run pg_dump command against each table and pull the records
	pg_dump -d postgresql://postgres:$DB_PASSWORD@$DB_HOST:$DB_PORT/$DB_NAME --column-inserts --data-only --table="$table" > $BACKUP_DIR/$APP_NAME-$table-$(date +%d-%m-%Y).sql

	echo "Downloading finished for the table: $table"
done

# To delete files older than 30 days
find $BACKUP_DIR/* -mtime +30 -exec rm {} \;

Make sure you give executable permissions to this script. You can do this by using chmod command.

chmod +x ./db-backup.sh

Bonus

Running the script manually is nice, but why do it manually when you can automate this right? So let's just do that.

We will use a similar setup that we saw in the backup home directory using tar blog

We will set up a cron job to run this script automatically. We will enter into crontab using

crontab -e

Inside it paste the following lines at the bottom.

0 12 * * 5 /path/to/script/db-backup.sh

This will run this script every Friday at 12:00. To learn more about how to configure these values, I would highly recommend using crontab.guru.

Hope you found this helpful, see you in the next one.

Updates straight in your inbox!

A periodic update about my life, recent blog posts, TIL (Today I learned) related stuff, things I am building and more!

No spam - unsubscribe at any time!

Share with others

Liked it?