Introduction
So just like me, you must also be having some databases running on the cloud for which you want to create backup snapshots periodically.
Most of the cloud providers would already have some service (paid or free) for having automatic backups, but today we will learn how to create local backups of your Postgres database using a Bash script.
I am using this script myself for taking backups from Postgres Database hosted on Supabase for some self-hosted services that I use. So let’s get started.
Only here for the script? Skip to the final script section
Bash variables
Let’s first check what are the few variables you need to run the script and register them.
BACKUP_DIR
- The directory where your backups would be storedDB_HOST
- Database hostname, this would be the hostname that you would get from your cloud instance, also you can set a local database hostname as well if you want to create a backup for the local database.DB_PASSWORD
- Database passwordDB_PORT
- Ideally for the Postgres database, it would be 5432.APP_NAME
- Application name, we will use this to append in the backup file name.tables
- List of tables you want to backup.
We will have a DATE
variable as well which again would be used in the filename.
So your initial bash script should start like this
Backup database
We will use the pg_dump ↗️ utility for creating a backup of the database.
We can dump the entire database in a single .sql
file however I like to create a separate file per table.
Personal Preference
The reason I prefer to do this way is that for my application I like to do some post-processing on the insert statements because of which the database restoration becomes 50x faster on large datasets. More details on this can be part of another blog.
So here we would now loop over all the tables we have mentioned before and start dumping our database into a local file.
The pg_dump command would look like this
To break down the command, let’s go through the set of flags we are using.
-d
- the database we want to connect.--column-inserts
- This would dump the data asINSERT
commands with explicit column names likeINSERT INTO table (column, ...) VALUES ...
. This will make restoration very slow. it is mainly useful for making dumps that can be loaded into non-PostgreSQL databases. Any error during restoring will cause only rows that are part of the problematic INSERT to be lost, rather than the entire table contents.--data-only
- Dump only the data, not the schema (data definitions). Table data, large objects, and sequence values are dumped.--table
- Dump only tables with names matching patterns.
These are the flags I use, feel free to add/remove any based on your preferences.
Caution
As quoted here “This will make restoration very slow” for —column-inserts, this indeed would be the case because the dump would have multiple INSERT statements which would be very slow to restore. There are multiple approaches using which you can speed up the restoration. However, you can skip this flag if you don’t want your database dump to be in the format of column inserts.
So this would allow us to a backup single table, but what we want is to backup multiple tables and store them in a separate file, for this we would loop over the tables variable and redirect the output from stdout
to a file.
The updated bash script would look like this
Purge old backups
This is an optional step, but what we can do additionally is to remove old backups which are older than let’s say 15 or 30 days and purge them whenever we run our script.
Caution
Always verify the commands which remove any data before running them, same goes for here as well.
Final Bash Script
After merging all the steps we saw, our final bash script should look like this.
Make sure you give executable permissions to this script. You can do this by using chmod
command.
Bonus
Running the script manually is nice, but why do it manually when you can automate this right? So let’s just do that.
We will use a similar setup that we saw in the backup home directory using tar blog
We will set up a cron job to run this script automatically. We will enter into crontab using
Inside it paste the following lines at the bottom.
This will run this script every Friday at 12:00. To learn more about how to configure these values, I would highly recommend using crontab.guru ↗️.
Hope you found this helpful, see you in the next one.