I recently deleted a folder in Syncthing by accident. Since I had set up an hourly incremental backup of Syncthing some time ago, I wanted to use it to restore the deleted data.
Unfortunately, I discovered that the automatic backup had not been working for more than half a year, because a firewall change had blocked the SSH connection to the backup server.
This led me to add my backup server (set up as described in the article Backups using Borgbackup) to my checkmk monitoring. The backups are now monitored permanently, with a warning after 28 hours without a backup and a critical alert after 3 days without one.
The monitoring is set up directly on the backup server as a checkmk local check. Every 30 minutes, the checkmk agent runs a bash script that reports the current status of the configured repositories to checkmk.
The script is located in /usr/lib/check_mk_agent/local/1800/borg (the directory name 1800 is the interval in seconds) and contains the following content. Remember to replace the path /mnt/borgbackup/ with the path to your backup repositories.

    #!/bin/bash
    # checkmk Borgbackup check
    # Author: Thomas Bella
    # Source: https://blog.bella.network/monitor-borgbackup-with-checkmk/
    # Version 1.0

    set -o nounset
    export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

    STATE_OK=0
    STATE_WARNING=1
    STATE_CRITICAL=2
    STATE_UNKNOWN=3

    debug(){ ([ "${verbose}" -gt 1 ] && echo "$*") || return 0; }
    verbose(){ ([ "${verbose}" -gt 0 ] && echo "$*") || return 0; }
    error(){ echo "UNKN - $*"; exit "${STATE_UNKNOWN}"; }

    # define warning and critical states of backup age
    crit='3 days ago'
    warn='28 hours ago'
    verbose=0

    : "${BORG:=borg}"
    command -v "${BORG}" >/dev/null 2>&1 \
        || error "No command '${BORG}' available."
    : "${DATE:=date}"
    command -v "${DATE}" >/dev/null 2>&1 \
        || error "No command '${DATE}' available."

    # convert values to seconds to enable comparison
    sec_warn="$(${DATE} --date="${warn}" '+%s')"
    sec_crit="$(${DATE} --date="${crit}" '+%s')"

    # check warning and critical values
    if [ "${sec_crit}" -gt "${sec_warn}" ]; then
        error "Warning value has to be a more recent timepoint than critical."
    fi

    # iterate over the repository definitions (a glob avoids parsing ls output)
    for ENV in /mnt/borgbackup/*.env
    do
        # skip the unexpanded pattern if no .env file exists
        [ -e "${ENV}" ] || continue
        source "${ENV}"
        export BORG_PASSPHRASE BORG_REPO

        # get timestamp of last backup; on failure report UNKNOWN for this
        # repository only, so the remaining repositories are still checked
        if ! last="$(${BORG} list --sort timestamp --last 1 --format '{time}')"; then
            echo "${STATE_UNKNOWN} \"BORG: ${NAME}\" - Cannot list repository archives. Repo locked?"
            unset BORG_PASSPHRASE
            continue
        fi

        # repository size in MiB and number of archives
        size="$(du -sm "${BORG_REPO}" | awk '{ print $1 }')"
        num="$(${BORG} list | wc -l)"

        # an empty repository is critical, but must not abort the loop
        if [ -z "${last}" ]; then
            echo "${STATE_CRITICAL} \"BORG: ${NAME}\" - no archive in repository"
            unset BORG_PASSPHRASE
            continue
        fi

        sec_last="$(${DATE} --date="${last}" '+%s')"

        # compare the time of the last backup against the thresholds
        if [ "${sec_crit}" -gt "${sec_last}" ]; then
            state="${STATE_CRITICAL}"
        elif [ "${sec_warn}" -gt "${sec_last}" ]; then
            state="${STATE_WARNING}"
        else
            state="${STATE_OK}"
        fi

        echo "${state} \"BORG: ${NAME}\" size=${size}|number=${num};5:;3: last backup made on ${last}"
        unset BORG_PASSPHRASE
    done

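To put the script in place, a minimal sketch along the following lines may help. It assumes GNU install is available and that the agent uses the local check directory from the path mentioned above; the helper name install_local_check and its parameters are hypothetical and only serve as illustration. The numeric subdirectory is the cache interval in seconds, so 1800 yields the 30-minute schedule.

```shell
#!/bin/bash
# Sketch (hypothetical helper): install a local check into an interval directory.
# Usage: install_local_check <script> [local_dir] [interval_seconds]
install_local_check() {
    local src="$1"
    local local_dir="${2:-/usr/lib/check_mk_agent/local}"
    local interval="${3:-1800}"
    local dest
    dest="${local_dir}/${interval}/$(basename "${src}")"
    # -D creates missing parent directories, -m 0755 makes the copy executable
    install -D -m 0755 "${src}" "${dest}"
}
```

Run as root, `install_local_check ./borg` would copy the script to /usr/lib/check_mk_agent/local/1800/borg. Calling check_mk_agent afterwards should show the new lines in the local section of its output.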
The above script scans the target directory /mnt/borgbackup/ for files with the extension .env. These files contain the environment variables NAME, BORG_REPO and BORG_PASSPHRASE, which are used to access the backup repository. Every repository is then checked for the last backup and the result is reported to checkmk as a single line like:

    0 "BORG: homecontrol.bella.pm" size=25630|number=24;5:;3: last backup made on Sun, 2023-01-01 06:26:24

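The line above follows the local check format: a status code (0 OK, 1 WARN, 2 CRIT, 3 UNKNOWN), the quoted service name, the metrics separated by |, and a free-text summary. As a small sketch, the line could be assembled by a helper like the one below; the function name local_check_line is hypothetical and not part of the original script. A single dash stands in when there are no metrics.

```shell
#!/bin/bash
# Sketch (hypothetical helper): build one checkmk local check line.
# Usage: local_check_line <state> <service name> <metrics or empty> <text>
local_check_line() {
    local state="$1"
    local name="$2"
    local metrics="${3:--}"   # "-" means: no metrics
    local text="$4"
    printf '%s "%s" %s %s\n' "${state}" "${name}" "${metrics}" "${text}"
}

# Reproduces the example line from the article
local_check_line 0 "BORG: homecontrol.bella.pm" \
    "size=25630|number=24;5:;3:" \
    "last backup made on Sun, 2023-01-01 06:26:24"
```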
Such an .env file contains the following content:

    NAME='homecontrol.bella.pm'
    BORG_REPO='/mnt/borgbackup/homecontrol.bella.pm'
    BORG_PASSPHRASE='mysecretpassword'

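Because such a file holds the repository passphrase in plain text, it is sensible to restrict it to its owner (for example with chmod 600) and to verify that all three variables are set before relying on it. The following is only a sketch of such a check; the function name check_env_file is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Sketch (hypothetical helper): succeed only if the file defines NAME,
# BORG_REPO and BORG_PASSPHRASE with non-empty values.
check_env_file() {
    local file="$1"
    # Load the file in a subshell so the passphrase never reaches the caller
    (
        unset NAME BORG_REPO BORG_PASSPHRASE
        . "${file}" || exit 1
        [ -n "${NAME:-}" ] && [ -n "${BORG_REPO:-}" ] && [ -n "${BORG_PASSPHRASE:-}" ]
    )
}
```

A call such as `check_env_file /mnt/borgbackup/homecontrol.bella.pm.env || echo "incomplete"` would flag a file that is missing one of the variables.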
This monitoring gives me an overview within checkmk of the number of backups, their disk usage and whether they are up to date. If a backup is no longer up to date, I receive a notification and can check what went wrong.
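The rule that decides when a backup counts as outdated can be illustrated in isolation. The sketch below assumes GNU date and uses the thresholds from the script (28 hours and 3 days) as defaults; the function name backup_state is hypothetical. It prints the checkmk status code for a given timestamp of the last backup.

```shell
#!/bin/bash
# Sketch (hypothetical helper): map the time of the last backup to a status code.
# Usage: backup_state <timestamp> [warn threshold] [crit threshold]
backup_state() {
    local last="$1"
    local warn="${2:-28 hours ago}"
    local crit="${3:-3 days ago}"
    local sec_last sec_warn sec_crit
    # An unparsable timestamp is reported as UNKNOWN (3)
    sec_last="$(date --date="${last}" '+%s' 2>/dev/null)" || { echo 3; return 0; }
    sec_warn="$(date --date="${warn}" '+%s')"
    sec_crit="$(date --date="${crit}" '+%s')"
    if [ "${sec_crit}" -gt "${sec_last}" ]; then
        echo 2   # older than the critical threshold
    elif [ "${sec_warn}" -gt "${sec_last}" ]; then
        echo 1   # older than the warning threshold
    else
        echo 0   # recent enough
    fi
}
```

The relative expressions are converted to Unix timestamps, so the comparison is a plain integer test: a backup is critical if it was made before the moment "3 days ago" and a warning if it was made before "28 hours ago".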
PS: I was eventually able to recover the deleted Syncthing data by restoring a VM backup. The only disadvantages were that it was not an hourly backup and that I had to boot the VM separately, which took a bit more time.