Database backups in S3

While I work on a Go backend for RunHedgie, I'm still hosting the database locally on a Raspberry Pi. That requires me to be at the office to play with the data, which is sub-optimal.

So I'd like to make a database backup each day and upload it to AWS's S3 storage service. That way I can download a copy of the data and use it on my computer whenever I want, without depending on an external machine or server.

Database backup

To make a DB dump with PostgreSQL we'll need pg_dump:

# Create a temporary file
tmp_file=$(mktemp -t backup.XXXXXX)
# Dump the db (pg_dump reads the password from PGPASSWORD; -p is the port)
PGPASSWORD=$DBPASS pg_dump -p 5432 -h $DBHOST -U $DBUSER $DBNAME > $tmp_file
# Compress it (gzip replaces the file with $tmp_file.gz)
gzip $tmp_file

mktemp is a great utility for generating temporary files with random names. Hard-coded temporary file paths are usually a bad idea, since they can collide with other processes.

If you're on Linux you should take a look at ionice. It lowers the dump's I/O scheduling priority so it goes easy on the disk (at the cost of a somewhat longer backup). I prefer the -c2 option, which uses the "best effort" scheduling class, so the dump runs as smoothly as possible without starving other processes.
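As a rough sketch, wrapping the dump looks like this (the -n7 level is my own choice; 7 is the lowest priority within the best-effort class):

# Run the dump with low I/O priority: best-effort class, lowest level
ionice -c2 -n7 pg_dump -p 5432 -h $DBHOST -U $DBUSER $DBNAME > $tmp_file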

File upload

To upload the file to S3, we'll need the AWS command line interface (CLI). With it, uploading is a one-liner:

aws s3 cp $tmp_file.gz s3://my_bucket/any_format  
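For completeness, here's a sketch of the reverse trip, pulling a backup down and restoring it locally. The bucket, prefix, and file name are placeholders, and it assumes a local PostgreSQL with the target database already created:

# Download a compressed dump from S3 (placeholder key)
aws s3 cp s3://my_bucket/any_format/backup.gz .
# Decompress it (produces a file named "backup")
gunzip backup.gz
# Load the plain-SQL dump into a local database
psql -h localhost -U $DBUSER -d $DBNAME -f backup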

Running it periodically

Finally, let's put it in a script (I'll call it backup.sh and put it in the $HOME folder):

#!/bin/bash
# Create a temporary file for the dump
tmp_file=$(mktemp -t backup.XXXXXX)

# Dump the db with low I/O priority (password comes from PGPASSWORD)
PGPASSWORD=$DBPASS ionice -c2 pg_dump -p 5432 -h $DBHOST -U $DBUSER $DBNAME > $tmp_file

gzip $tmp_file

aws s3 cp "$tmp_file.gz" s3://$BUCKET_NAME/$FOLDER_FORMAT/

rm "$tmp_file.gz"
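The script assumes the connection and bucket variables are already set. As a minimal sketch (the values are placeholders; a date-based folder keeps each day's backup under its own key), they could be defined at the top of the script or in the environment:

DBHOST=localhost
DBUSER=myuser
DBPASS=mypassword
DBNAME=mydb
BUCKET_NAME=my_bucket
# e.g. 2016/01/31, so backups are grouped by day in the bucket
FOLDER_FORMAT=$(date +%Y/%m/%d)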

Then configure crontab to run it every day at 1 p.m. UTC:

$ chmod +x backup.sh
$ crontab -e
$ # Put the following line:
$ # 0 13 * * * ~/backup.sh &> /dev/null
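One caveat to watch out for: cron jobs run with a minimal environment, so the aws binary may not be on the PATH and the database variables won't be set unless the script defines them. A quick fix (the path depends on how the AWS CLI was installed) is to set it explicitly at the top of backup.sh:

# Make sure cron can find the aws binary (adjust to your install location)
export PATH="/usr/local/bin:/usr/bin:/bin"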

And that's it.
