Rsync Incremental Backup Script (v2)

A better Rsync Incremental Back-up script…

Recently, we published our first Rsync Incremental Backup Script, which worked well but left room for improvement. There was no support for setting intervals, and the retention period was a function of the number of increments. A daily backup going back one year would therefore require 365 increments, which is far from optimal.

This newer script supports defining your own intervals (e.g. daily, weekly, 4hourly, etc.) and the number of increments to keep per interval (e.g. 7 daily versions per week). The script itself then determines which interval is next to back up.

This will hopefully be the last back-up script you'll need in a long time. It's the only one I'm using anyway.

TL;DR: The entire script

#!/bin/bash
# Title: Perfacilis Incremental Back-up script
# Description: Create back-ups of dirs and dbs by copying them to Perfacilis' back-up servers
# We strongly recommend to put this in /etc/cron.hourly/backup
# Author: Roy Arisse <support@perfacilis.com>
# See: https://www.perfacilis.com/blog/systeembeheer/linux/rsync-daily-weekly-monthly-incremental-back-ups.html
# Version: 0.12
# Usage: bash /etc/cron.hourly/backup

readonly BACKUP_LOCAL_DIR="/backup"
readonly BACKUP_DIRS=($BACKUP_LOCAL_DIR /home /root /etc /var/www)

readonly RSYNC_TARGET="username@backup.perfacilis.com::profile"
readonly RSYNC_DEFAULTS="-trlqpz4 --delete --delete-excluded --prune-empty-dirs"
readonly RSYNC_EXCLUDE=(tmp/ temp/)
readonly RSYNC_SECRET='RSYNCSECRETHERE'

readonly MYSQL="mysql --defaults-file=/etc/mysql/debian.cnf"
readonly MYSQLDUMP="mysqldump --defaults-file=/etc/mysql/debian.cnf -E -R --max-allowed-packet=512MB -q --single-transaction -Q --skip-comments"

# Amount of increments per interval and duration per interval resp.
readonly -A INCREMENTS=([hourly]=24 [daily]=7 [weekly]=4 [monthly]=12 [yearly]=5)
readonly -A DURATIONS=([hourly]=3600 [daily]=86400 [weekly]=604800 [monthly]=2419200 [yearly]=31536000)

# ++++++++++ NO CHANGES REQUIRED BELOW THIS LINE ++++++++++

set -e
export LC_ALL=C

log() {
MSG=`echo $1`
logger -p local0.notice -t `basename $0` -- $MSG

# Interactive shell
if tty -s; then
echo $MSG
fi
}

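check_only_instance() {
# Assign file handle Bob to this script, because we can
exec 808<$0

# Test flock inside the if: with "set -e" a failing flock on its own line
# would abort the script before "Already running" is logged
if ! flock -n 808; then
log "Already running"
exit 0
fi
}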

prepare_local_dir() {
[ -d $BACKUP_LOCAL_DIR ] || mkdir -p $BACKUP_LOCAL_DIR
}

prepare_remote_dir() {
local TARGET="$1"
local RSYNC_OPTS=$(get_rsync_opts)
local EMPTYDIR=$(mktemp -d)
local DIR TREE

if [ -z "$TARGET" ]; then
echo "Usage: prepare_remote_dir remote/dir/structure"
exit 1
fi

# Remove options that delete empty dir
RSYNC_OPTS=$(echo "$RSYNC_OPTS" | sed -E 's/--(delete|delete-excluded|prune-empty-dirs)//g')

for DIR in ${TARGET//\// }; do
TREE="$TREE/$DIR"
rsync $RSYNC_OPTS $EMPTYDIR $RSYNC_TARGET/${TREE/#\//}
done

rm -rf $EMPTYDIR
}

get_last_inc_file() {
local PERIOD="$1"

if [ -z "$PERIOD" ]; then
echo "Usage: ${FUNCNAME[0]} daily"
exit 1
fi

echo "$BACKUP_LOCAL_DIR/last_inc_$PERIOD"
}

get_next_increment() {
local PERIOD="$1"
local LIMIT="${INCREMENTS[$PERIOD]}"
local LAST NEXT INCFILE

if [ -z "$PERIOD" -o -z "$LIMIT" ]; then
echo "Usage: get_next_increment period"
echo "- period = 'hourly', 'daily', 'weekly', 'monthly', 'yearly'"
exit 1
fi

INCFILE=$(get_last_inc_file $PERIOD)
if [ -f "$INCFILE" ]; then
LAST=$(cat "$INCFILE" | tr -d "\n")
fi

if [ -z "$LAST" ]; then
echo 0
return
fi

NEXT=$(($LAST+1))
if [ "$NEXT" -ge "$LIMIT" ]; then
echo 0
return
fi

echo $NEXT
}

# Return biggest interval to backup
get_interval_to_backup() {
local NOW=$(date +%s)
local LAST PERIOD INCFILE DURATION DIFF
local TODO=""

# Sort associative array: biggest first
for PERIOD in "${!DURATIONS[@]}"; do
echo "${DURATIONS["$PERIOD"]} $PERIOD"
done | sort -rn | while read DURATION PERIOD; do
# Skip disabled intervals
if [[ ${INCREMENTS[$PERIOD]} -eq 0 ]]; then
continue;
fi

LAST=0
INCFILE=$(get_last_inc_file $PERIOD)
if [ -f "$INCFILE" ]; then
LAST=$(date -r "$INCFILE" +%s)
fi

DIFF=$(($NOW - $LAST))
if [ $DIFF -ge $DURATION ]; then
echo "$PERIOD"
break
fi
done
}

get_rsync_opts() {
local EXCLUDE=`dirname $0`/rsync.exclude
local SECRET=`dirname $0`/rsync.secret
local OPTS="$RSYNC_DEFAULTS"

if [ ! -z "$RSYNC_EXCLUDE" ]; then
if [ ! -f $EXCLUDE ]; then
printf '%s\n' "${RSYNC_EXCLUDE[@]}" > $EXCLUDE
chmod 600 $EXCLUDE
fi

OPTS="$OPTS --exclude-from=$EXCLUDE"
fi

if [ ! -z "$RSYNC_SECRET" ]; then
if [ ! -f $SECRET ]; then
echo $RSYNC_SECRET > $SECRET
chmod 600 $SECRET
fi

OPTS="$OPTS --password-file=$SECRET"
fi

echo "$OPTS"
}

backup_packagelist() {
local TODO=$(get_interval_to_backup)

if [ -z "$TODO" ]; then
return
fi

log "Back-up list of installed packages"
dpkg --get-selections > $BACKUP_LOCAL_DIR/packagelist.txt
}

backup_mysql() {
local TODO=$(get_interval_to_backup)
local DB

if [ -z "$TODO" ]; then
return
fi

if [ -z "$MYSQL" -o -z "$MYSQLDUMP" ]; then
log "MySQL not set up, skipping database backup."
return
fi

log "Back-up mysql databases:"
for DB in `$MYSQL -e 'show databases' | grep -v 'Database'`; do
if [ $DB = 'information_schema' -o $DB = 'performance_schema' ]; then
continue
fi

log "- $DB"
$MYSQLDUMP $DB | gzip > $BACKUP_LOCAL_DIR/$DB.sql.gz
done
}

backup_folders() {
local RSYNC_OPTS=$(get_rsync_opts)
local DIR TARGET INC INCDIR
local VANISHED='^(file has vanished: |rsync warning: some files vanished before they could be transferred)'
local PERIOD=$(get_interval_to_backup)

if [ -z "$PERIOD" ]; then
log "No intervals to back-up yet."
exit
fi

INC=$(get_next_increment $PERIOD)
log "Moving $PERIOD back-up to target: $INC"

prepare_remote_dir "current"

for DIR in ${BACKUP_DIRS[@]}; do
TARGET=${DIR/#\//}
TARGET=${TARGET//\//_}

# Make path absolute if target is not RSYNC profile
# Also remove "user@server:" for SSH setups
INCDIR="/$PERIOD/$INC/$TARGET"
if [ -z "$RSYNC_SECRET" ]; then
INCDIR="${RSYNC_TARGET##*:}$INCDIR"
fi

log "- $DIR"
rsync $RSYNC_OPTS --backup --backup-dir=$INCDIR \
$DIR/ $RSYNC_TARGET/current/$TARGET 2>&1 | (egrep -v "$VANISHED" || true)
done
}

signoff_increments() {
local STARTTIME="$1"
local PERIOD=$(get_interval_to_backup)
local INC INCFILE

INC=$(get_next_increment $PERIOD)
INCFILE=$(get_last_inc_file $PERIOD)
echo $INC > "$INCFILE"
touch -t "$STARTTIME" "$INCFILE"
}

cleanup() {
rm -f `dirname $0`/rsync.exclude
rm -f `dirname $0`/rsync.secret
}

main() {
starttime=$(date +%Y%m%d%H%M.%S)

log "Back-up initiated at `date`"

trap "cleanup" EXIT

check_only_instance
prepare_local_dir

backup_packagelist
backup_mysql
backup_folders

signoff_increments $starttime

log "Back-up completed at `date`"
}

main

How it works

Setting increments

The INCREMENTS variable stores the number of increments to save per period. The DURATIONS variable stores how long a period is, in seconds; it only needs changing if you want to alter a duration or add new periods.

In INCREMENTS, you can set the number to "0" to exclude a period. For every increment you include, a folder is created automatically at the back-up target location.

Keep in mind that both variables are associative arrays, so make sure the formatting is right. If you're interested, Andy Balaam's blog post is a great explanation. If you're not, just look at the current formatting and change it as you wish.
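
As a minimal sketch of that formatting, assume you want the "4hourly" interval mentioned in the introduction. The key name and the values 6 and 14400 are illustrative assumptions, not defaults shipped with the script, and list_enabled_periods is a made-up helper. Both arrays need a matching key, and the helper prints only the enabled periods, biggest duration first, much like the sort in get_interval_to_backup:

```bash
#!/bin/bash
# Illustrative configuration: the "4hourly" key and its values are assumptions.
declare -A INCREMENTS=([4hourly]=6 [daily]=7 [weekly]=4 [monthly]=0)
declare -A DURATIONS=([4hourly]=14400 [daily]=86400 [weekly]=604800 [monthly]=2419200)

# Print "duration period" for every enabled period, biggest duration first.
list_enabled_periods() {
  local period
  for period in "${!DURATIONS[@]}"; do
    # An increment count of 0 disables the period.
    if [ "${INCREMENTS[$period]:-0}" -eq 0 ]; then
      continue
    fi
    echo "${DURATIONS[$period]} $period"
  done | sort -rn
}

list_enabled_periods
```

With these values the monthly period is skipped, because its increment count is 0.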

Installation

Copy the contents of the entire script into a file named "backup" and store it in "/etc/cron.hourly":

sudo nano /etc/cron.hourly/backup
sudo chmod +x /etc/cron.hourly/backup

Don't forget, if you've created an hourly or even shorter period, the script needs to be called more often. Save the file somewhere else and call it accordingly from /etc/crontab (or any other method you like).

The following variables probably need changing:

  • BACKUP_LOCAL_DIR : Folder to keep required tracking files;
  • BACKUP_DIRS : The folders you want to have back when your computer or server dies; don't remove $BACKUP_LOCAL_DIR ;
  • RSYNC_TARGET : Where the actual back-up should be stored: the remote, possibly off-site, location;
  • RSYNC_SECRET : Optional, only needed if the Rsync profile on the remote server requires a secret;
  • MYSQL : Either leave as is or replace --defaults-file=/etc/mysql/debian.cnf with -uUSERNAME -pPASSWORD parameters;
  • MYSQLDUMP : Same as the MYSQL variable.

The following variables only need checking:

  • INCREMENTS : Change the number of increments you want to save, in addition to the full back-up, per period.
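
To sanity-check these settings, the retention window per period is roughly the number of increments times the period's duration. The sketch below uses the default values from the script. The function name retention_days is made up for this example and is not part of the script:

```bash
#!/bin/bash
# Default values copied from the script above.
declare -A INCREMENTS=([hourly]=24 [daily]=7 [weekly]=4 [monthly]=12 [yearly]=5)
declare -A DURATIONS=([hourly]=3600 [daily]=86400 [weekly]=604800 [monthly]=2419200 [yearly]=31536000)

# Hypothetical helper: approximate retention of one period, in whole days.
retention_days() {
  local period="$1"
  echo $(( ${INCREMENTS[$period]} * ${DURATIONS[$period]} / 86400 ))
}

for period in hourly daily weekly monthly yearly; do
  echo "$period: $(retention_days "$period") days"
done
```

Note that DURATIONS defines a month as 2419200 seconds (28 days), so 12 monthly increments cover about 336 days rather than a full calendar year.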

Full and Incremental Back-ups

The first time the script runs, it creates one full back-up and stores it in the  current  folder.

On the next run, the first increment, it uses folder 1, e.g. daily/1. The previous versions of files modified since the last run are moved to this folder, and the latest copies are stored in the "current" folder. Every following run creates a new increment, e.g. daily/2, daily/3, etc., up to the number of increments configured for that period.

Finally, when the configured number of increments has been reached, the counter starts over and the increment folders are reused one by one. The "current" folder always holds the latest full back-up.

To make it easier to find files modified before a certain date or time, each folder's timestamp is updated to match the time the back-up ran.
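
The way the increment counter cycles can be sketched in isolation. The function below is a stand-alone re-implementation of the logic in get_next_increment, written for illustration only: it takes the last counter value and the limit as arguments instead of reading the tracking file, and the name next_increment is an assumption.

```bash
#!/bin/bash
# Stand-alone sketch of the counter logic, not the script's own function.
next_increment() {
  local last="$1" limit="$2" next
  # No previous run recorded: start at 0.
  if [ -z "$last" ]; then
    echo 0
    return
  fi
  next=$(( last + 1 ))
  # Wrap around once the limit is reached, so the folders get reused.
  if [ "$next" -ge "$limit" ]; then
    echo 0
    return
  fi
  echo "$next"
}

# Simulate ten daily runs with a limit of 7 increments.
last=""
sequence=""
for run in 1 2 3 4 5 6 7 8 9 10; do
  last=$(next_increment "$last" 7)
  sequence="$sequence$last "
done
echo "daily increments used: $sequence"
```

With a limit of 7, the simulated runs use folders 0 through 6 and then start again at 0.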

Back-up to a local folder or USB disk instead of a remote Rsync server

The RSYNC_TARGET variable dictates the remote, preferably off-site, location for the back-up. To use a local disk instead, point it at the disk's mount point rather than at the device itself. For example, if your back-up USB disk /dev/sdc1 is mounted at /media/backup (an example path; use lsblk to find out where yours is mounted), change it as follows:

readonly RSYNC_TARGET="/media/backup"
readonly RSYNC_DEFAULTS="-trlqz4 --delete --delete-excluded --prune-empty-dirs"
readonly RSYNC_EXCLUDE=(/tmp /temp)
readonly RSYNC_SECRET=""

Don't forget to empty the RSYNC_SECRET variable, to ensure it all works as it should.
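
When backing up to a removable disk, it can help to verify that the disk is actually mounted before rsync runs, because otherwise the files would land on the local disk underneath the mount point. The guard below is a hypothetical addition, not part of the original script. It assumes the mountpoint utility is installed (it ships with util-linux on most Linux distributions), and /media/backup is an assumed example path.

```bash
#!/bin/bash
# Hypothetical guard, not part of the original script.
target_is_mounted() {
  local target="$1"
  # mountpoint exits 0 only when the given path is a mount point.
  mountpoint -q "$target"
}

# /media/backup is an assumed example path; use your own mount point.
if target_is_mounted "/media/backup"; then
  echo "Target is mounted, safe to run the back-up."
else
  echo "Target is not mounted, skipping the back-up."
fi
```

A check like this could be called from main before backup_folders, though where exactly to hook it in is a design choice left to you.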

What's BACKUP_LOCAL_DIR for?

The back-up script keeps track of which increment it last completed by storing a file per period in the BACKUP_LOCAL_DIR folder, e.g. "last_inc_hourly", "last_inc_daily", etc. The timestamp of these files is used to determine when that increment was created, and thus whether the period has elapsed. If you remove these files, or the dir entirely, the script will start with "current" as explained above.

This ensures that if a run was missed, because your laptop was powered off or your server was rebooting, the next increment is created as soon as it powers on again, though not sooner than the given period duration.

Finally, this folder contains a local copy of all created database dumps.
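
The elapsed-time check on the tracking files can be tried out safely in a temporary directory. The sketch below mirrors the comparison done in get_interval_to_backup; the function name period_is_due is an assumption made for this example. It relies on GNU date and GNU touch (touch -d with a relative time), so it may need adjusting on BSD systems.

```bash
#!/bin/bash
# Returns 0 (true) when the tracking file is missing or older than the duration.
period_is_due() {
  local incfile="$1" duration="$2"
  local now last=0
  now=$(date +%s)
  if [ -f "$incfile" ]; then
    # Modification time of the tracking file, in seconds since the epoch.
    last=$(date -r "$incfile" +%s)
  fi
  [ $(( now - last )) -ge "$duration" ]
}

# Demo in a throw-away directory instead of the real BACKUP_LOCAL_DIR.
demo_dir=$(mktemp -d)
touch -d '2 hours ago' "$demo_dir/last_inc_hourly" "$demo_dir/last_inc_daily"

period_is_due "$demo_dir/last_inc_hourly" 3600 && echo "hourly: due"
period_is_due "$demo_dir/last_inc_daily" 86400 || echo "daily: not due yet"
period_is_due "$demo_dir/last_inc_weekly" 604800 && echo "weekly: due (no tracking file yet)"

rm -rf "$demo_dir"
```

A missing tracking file counts as "never backed up", which is why removing the files makes every enabled period due again.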

Conclusion

Is this the final back-up script we'll ever create? Probably not, there's always room for improvement. This script lets you create proper back-ups you can rely on, spanning long retention periods, without requiring an unhealthy amount of disk space. In our opinion, it's a healthy mix between incremental and full back-ups, allowing for proper disaster recovery.

The current script will only function on Linux systems, or at least systems running Bash. It's unknown if it will run using the Windows Subsystem for Linux on Windows 10. Therefore we made a Windows PowerShell Rsync backup script instead.

Finally, the script is probably not suitable for the less tech-savvy among us, but that, in my humble opinion, might be a pretty good user filter in itself: if you can't get it running, don't use it.

Perfacilis Back-up Service

The Perfacilis Back-up service alerts you when a back-up didn't finish in the scheduled time-frame, or if a lot of data changes at once (which might indicate an encrypter virus). You only pay for the amount of space your back-ups use.


Changelog

2021-02-01
Bump version to 0.7.1
Removed $RSYNC_TARGET from --backup-dir=$RSYNC_TARGET/$PERIOD/$INC/$TARGET
2021-04-06
Removed log/ logs/ *.log from exclusion list, it's good practice to back-up log files as well.
2021-04-25
Bump version to 0.8.
Removed current dir per increment folder, only one current dir inside the root directory will be created.
2021-05-26
Bump version to 0.8.1.
Added check to enforce only one running instance, using pidof check.
2021-09-24
Bump version to 0.8.2.
Added --single-transaction, as suggested by the mysqldump manpage and ClusterEngine's article on dumping live MySQL databases.
Using shorthand options -E, -R, -q and -Q for the mysqldump command, because it gets too long.
Remove trailing slash after $EMPTYDIR in rsync $RSYNC_OPTS $EMPTYDIR/ $RSYNC_TARGET/${TREE/#\//} , ensuring folder timestamps are properly set.
2022-03-07
Bump version to 0.9 code name Doug.
No longer overwrite current dir if increment is 0, so we have true incremental backups.
Fixed setting LAST_INC_XXX files modified time to ensure intervals are more closely met.
Added -p argument to rsync command to make sure file permissions are set.
Created GitHub repository to keep track of changes.
Special thanks to Doug for his suggestions!
2022-03-17
Bump version to 0.9.1
Check if $MYSQL or $MYSQLDUMP are empty; if so, don't make MySQL backups.
2022-03-29
Bump version to 0.9.2 code name Gabriel.
Fixed checking for empty $RSYNC_SECRET
Special thanks to Gabriel for his suggestions!
2022-05-09
Bump version to 0.9.3
Making touch BSD-compatible, using xave's suggestions.
2022-06-03
Bump version to 0.10
Using flock instead of pidof for more BSD-compatibility
2022-06-17
Bump version to 0.10.1
More BSD compatibility, now for date command.
Bump version to 0.10.2
Fix for local backups: Force increment dirs on target.
2022-06-27
Bump version to 0.11
Fixed some off by one errors, backup updates only one interval per run.
Thanks to mgoerens for his suggestion
2022-06-28
Bump version to 0.11.1
Magic to sort the associative array, to be sure the biggest, most important, interval is backed up