It’s so fluffy!

(Or: Backblaze B2 cloud backups from a Proxmox Virtual Environment)

Backups are one of those things that have a tendency to become unexpectedly expensive – at least through the eyes of a non-techie: Not only do you need enough space to store several generations of data, but you want at least twice that, since you want to protect your information not only from accidental deletion or corruption, but also from the kind of accidents that can render both the production data and the backup unreadable. Ultimately, you’ll also want to spend the resources to automate as much of the process as possible, because anything that requires manual work will be forgotten at some point, and by some perverse law of the Universe, that’s when it would have been needed.

In this post I’ll describe how I’ve solved it for full VM/container backups in my lab/home environment. It’s trivial to adapt the information from this post to apply to regular file system backups. Since I’m using a cloud service to store my backups, I’m applying a zero trust policy to them at the cost of increased storage (and network) requirements, but my primary dataset is small enough that this doesn’t really worry me.

Backblaze currently offers 10 GB of B2 object storage for free. This doesn’t sound like a lot today, but it will comfortably fit several compressed and encrypted copies of my reverse proxy, and my mail and web servers. That’s Linux containers for you.

First of all, we’ll need an account at Backblaze. Save your Master Application Key in your password manager! We’ll need it soon. Then we’ll want to create a Storage Bucket. In my case I gave it the wonderfully inventive name “pvebackup”.

Next, we shall install a program called rclone on our Proxmox server. The version in the apt repository as I write this seems to have a bug vis à vi B2, that will require us to use the Master Application Key rather than a more limited Application Key specifically for this bucket. Since we’re encrypting our cloud data anyway, I feel pretty OK with this compromise for home use.

# apt install rclone

Now we’ll configure the program:

# rclone config --config /etc/rclone.conf
Config file "/etc/rclone.conf" not found - using defaults
No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config

Type n to create a new remote configuration. Name it b2, and select the appropriate number for Backblaze B2 storage from the list: In my case it was number 3.

The Account ID can be viewed in the Backblaze portal, and the Application Key is the master key we saved in our password manager earlier. Leave the endpoint blank and save your settings. Then we’ll just secure the file:

# chown root. /etc/rclone.conf && chmod 600 /etc/rclone.conf

We’ll want to encrypt the file before sending it to an online location. For this we’ll use gpg, for which the default settings should be enough. The command to generate a key is gpg –gen-key, and I created a key in the name of “proxmox” with the mail address I’m using for notification mails from my PVE instance. Don’t forget to store the passphrase in your password manager, or your backups will be utterly worthless.

Next, we’ll shamelessly steal and modify a script to be used for hooking into the Proxmox VE backup process (I took it from this github repository and repurposed it for my needs):

#!/usr/bin/perl -w
# VZdump hook script for offsite backups to Backblaze B2 storage
use strict;

print "HOOK: " . join (' ', @ARGV) . "\n";

my $phase = shift;

if ($phase eq 'job-start' ||
        $phase eq 'job-end'  ||
        $phase eq 'job-abort') {

        my $dumpdir = $ENV{DUMPDIR};

        my $storeid = $ENV{STOREID};

        print "HOOK-ENV: dumpdir=$dumpdir;storeid=$storeid\n";

        if ($phase eq 'job-end') {
                        # Delete backups older than 8 days
                        system ("/usr/bin/rclone delete -vv --config /etc/rclone.conf --min-age 8d b2:pvebackup") == 0 ||
                                die "Deleting old backups failed";
        }
} elsif ($phase eq 'backup-start' ||
        $phase eq 'backup-end' ||
        $phase eq 'backup-abort' ||
        $phase eq 'log-end' ||
        $phase eq 'pre-stop' ||
        $phase eq 'pre-restart' ||
        $phase eq 'post-restart') {
        my $mode = shift; # stop/suspend/snapshot
        my $vmid = shift;
        my $vmtype = $ENV{VMTYPE}; # lxc/qemu
        my $dumpdir = $ENV{DUMPDIR};
        my $storeid = $ENV{STOREID};
        my $hostname = $ENV{HOSTNAME};
        # tarfile is only available in phase 'backup-end'
        my $tarfile = $ENV{TARFILE};
        my $gpgfile = $tarfile . ".gpg";
        # logfile is only available in phase 'log-end'
        my $logfile = $ENV{LOGFILE};
        print "HOOK-ENV: vmtype=$vmtype;dumpdir=$dumpdir;storeid=$storeid;hostname=$hostname;tarfile=$tarfile;logfile=$logfile\n";
        # Encrypt backup and send it to B2 storage
        if ($phase eq 'backup-end') {
                system ("/usr/bin/gpg -e -r proxmox $tarfile") == 0 ||
                        die "Encrypting tar file failed";
                system ("/usr/bin/rclone copy -v --config /etc/rclone.conf $gpgfile b2:pvebackup") == 0 ||
                        die "Copying encrypted file to B2 storage failed";
        }
        # Copy backup log to B2
        if ($phase eq 'log-end') {
                system ("/usr/bin/rclone copy -v --config /etc/rclone.conf $logfile b2:pvebackup") == 0 ||
                        die "Copying log file to B2 storage failed";
        }
} else {
      die "got unknown phase '$phase'";
}
exit (0);

Store this script in /usr/local/bin/vzclouddump.pl and make it executable:

# chown root. /usr/local/bin/vzclouddump.pl && chmod 755 /usr/local/bin/vzclouddump.pl

The last cli magic for today will be to ensure that Proxmox VE actually makes use of our fancy script:

# echo "script: /usr/local/bin/vzclouddump.pl" >> /etc/vzdump.conf

To try it out, select a VM or container in the PVE web interface, select Backup -> Backup now. I use Snapshot as my backup method and GZIP as my compression method. Hopefully you’ll see no errors in the log, and the B2 console will display a new file with a name corresponding to the current timestamp and the machine ID.

Conclusion

The tradeoffs with this solution compared to, for example, an enterprise product from Veeam are obvious, but so is the difference in cost. For a small business or a home lab, this solution should cover the needs to keep the most important data recoverable even if something bad happens to the server location.

Back on (tunnelled) IPv6

On principle, I dislike not being able to present my Internet-facing services over IPv6. The reasoning is simple: Unless services exist that use IPv6, ISPs have no reason to provide it. I’m obviously microscopic in this context, but I’m doing my thing to help the cause.

As mentioned earlier, I first experimented with Hurricane Electric’s tunneling service, which caused issues with Netflix because of silly geofencing rules.

After that I tried Telia, who at the time did not provide IPv6 natively, but who have a 6rd service, which generates a /64 subnet for you based on your (DHCP-issued) IPv4 address. For home use, I could accept that, but when I got my fibre connection, I moved away from that ISP. Unfortunately, neither the ISP nor their service provider do IPv6 in my area, so then I didn’t have access to Telia’s 6rd service, and for practical reasons I couldn’t route client traffic from my home over HE’s tunnel service.

PfSense and Proxmox VE to the rescue: I set up the Hurricane Electric tunnel as per the regular pfSense instructions, but I assigned that network to a separate internal NIC on my firewall instead of routing it to the regular LAN.

I then set up a new network bridge in Proxmox VE, assigning a hitherto unused NIC to it, and connected the two ports. Voìla! I now have a trouble-free client network where Netflix and similar services work well, and I also have an IPv6 capable server network to which I’ve added relevant machines.

In other words, while a functioning native IPv6 solution is not available to me, I now have a workaround for IPv6 server connectivity until my service providers get with the times…

Fixing lack of console video in Proxmox on HP MicroServer Gen7

After my latest experiment I encountered an issue where the current lack of video output from Proxmox on my N54L-based HP Microserver Gen 7 became a serious issue: I would see the Grub menu, then the screen would turn blank and enter power-save mode, I’d see the disk activity light blink a few times, but the system wouldn’t start up.

Naturally, without seeing the error message I couldn’t do anything about the issue, but I had seen a similar symptom earlier, namely when installing Proxmox for the first time: The installer USB image behaves the same way on this computer, and the workaround there is simply to press Enter to enter the graphical install environment after which the screen is visible.

This time I re-created a Proxmox 5.2 USB stick, booted the server from it, and correctly assumed that arrow down followed by Enter would likely get me into some sort of rescue environment. Sure enough I was soon greeted by a root prompt. At this stage, the ZFS modules weren’t loaded, so again I guessed that pressing Ctrl+D to exit the rescue environment would start the install environment where I know ZFS is available, and pressing the Abort button there luckily got me back to a shell.

From here I mounted my ZFS environment and chrooted into it:

# zpool import -f -a -N -R /mnt
# zfs mount rpool/ROOT/pve-1
# zfs mount -a
# mount --rbind /dev /mnt/dev
# mount --rbind /proc /mnt/proc
# mount --rbind /sys /mnt/sys
# chroot /mnt /bin/bash --login

I could now confirm that I was within my regular Proxmox file system, and so I got to work on the Grub configuration:

In /etc/default/grub, I found the commented out line #GRUB_GFXMODE=640×480 and changed that portion of the file so it now looks like this:

GRUB_GFXPAYLOAD_LINUX="keep"
GRUB_GFXMODE=1024x768
GRUB_CMDLINE_LINUX_DEFAULT="nomodeset"

I then ran update-grub and rebooted the server, after which I could see the boot process including the issue that prevented the system from booting fully: It turns out my ZFS pool didn’t want to mount after replacing the drive. I quickly ran zpool import -f, and exited the shell, and then the system successfully booted the rest of the way. An additional reboot confirmed that the system was functional.

Summary

Troubleshooting gets a lot harder when you’re blind. The solution is to attempt to become less blind.

Replacing ZFS system drives in Proxmox

Running Proxmox in a root-on-zfs configuration in a RAID10 pool results in an interesting artifact: We need a boot volume from which to start our system and initialize the elements required to recognize a ZFS pool. In effect, the first mirror pair in our disk set will have (at least) two partitions: a regular filesystem on the first partition and a second partition to participate in the ZFS pool.

To see how it all works together, I tried failing a drive and replacing it with a different one.

Happy-case

If the drives would have had identical sector sizes, the operation would have been simple. In this case, sdb is the good mirror volume and sda is the new, empty drive. We want to copy the working partition table from the good drive to the new one, and then randomize the UUID of the new drive to avoid catastrophic confusion on the part of ZFS:

# sgdisk /dev/sdb -R /dev/sda
# sgdisk -G /dev/sda

After that, we should be able to use gdisk to view the partition table, to identify what partition does what, and simply copy the contents of the good partitions from the good mirror to the new drive:

# gdisk /dev/sda
GPT fdisk (gdisk) version 1.0.1

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.

Command (? for help): p
Disk /dev/sda: 5860533168 sectors, 2.7 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 5860533134
Partitions will be aligned on 8-sector boundaries
Total free space is 0 sectors (0 bytes)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34            2047   1007.0 KiB  EF02  
   2            2048      5860516749   2.7 TiB     BF01  zfs
   9      5860516750      5860533134   8.0 MiB     BF07  

Command (? for help): q
# dd if=/dev/sdb1 of=/dev/sda1
# dd if=/dev/sdb9 of=/dev/sda9

Then we would add the new disk to our ZFS pool and have it resilvered:

# zpool replace rpool /dev/sda2

To view the resilvering process:

# zpool status -v
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Sep  1 18:48:13 2018
	2.43T scanned out of 2.55T at 170M/s, 0h13m to go
	1.22T resilvered, 94.99% done
config:

	NAME             STATE     READ WRITE CKSUM
	rpool            DEGRADED     0     0     0
	  mirror-0       DEGRADED     0     0     0
	    replacing-0  DEGRADED     0     0     0
	      old        UNAVAIL      0    63     0  corrupted data
	      sda2       ONLINE       0     0     0  (resilvering)
	    sdb2         ONLINE       0     0     0
	  mirror-1       ONLINE       0     0     0
	    sdc          ONLINE       0     0     0
	    sdd          ONLINE       0     0     0
	logs
	  sde1           ONLINE       0     0     0
	cache
	  sde2           ONLINE       0     0     0

errors: No known data errors

The process is time consuming on large drives, but since ZFS both understands the underlying disk layout and the filesystem on top of it, resilvering will only occur on blocks that are in use, which may save us a lot of time, depending on the extent to which our filesystem is filled.

When resilvering is done, we’ll just make sure there’s something to boot from on the new drive:

# grub-install /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.

Real life intervenes

Unfortunately for me, the new drive I tried had the modern 4 KB sector size (“Advanced Format / 4Kn”), while my old drives were stuck with the older 512 B standard. This led to the interesting side effect that my new drive was too small to fit volumes according to the healthy mirror drive’s partition table:

# sgdisk /dev/sdb -R /dev/sda
Caution! Secondary header was placed beyond the disk's limits! Moving the header, but other problems may occur!

In the end, what I ended up doing was to use gdisk to create a new partition table with volume sizes for partitions 1 and 9 as similar as possible to those of the healthy mirror (but not smaller!), entirely skipping the steps involving the sgdisk utility. The rest of the steps were identical.

The next problem I encountered was a bit worse: Even though ZFS in the Proxmox VE installation managed 4Kn drives just fine, there was simply no way to get the HP MicroServer Gen7 host to boot from one, so back to the old 3 TB WD RED I went.

Conclusion

Running root-on-zfs in a striped mirrors (“RAID10”) configuration complicates the replacement of any of the drives in the first mirror pair slightly compared to a setup where the ZFS pool is used for storage only.

Fortunately the difference is minimal, and except for the truly dangerous syntax and unclear documentation of the sgdisk command, replacing a boot disk really boils down to four steps:

  1. Make sure the relevant partitions exist.
  2. Copy non-ZFS-data from the healthy drive to the new one.
  3. Resilver the ZFS volume.
  4. Install GRUB.

In a pure data disk, the only thing we have to think about is step 3.

On the other hand, running too new hardware components in old servers doesn’t always work as intended. Note to the future me: Any meaningful expansion of disk space will require newer server hardware than the N54L-based MicroServer.

IKEv2 IPsec VPN with pfSense and Apple devices

Part 2: Apple VPN clients

(Part 1)

In the first part, we configured the pfSense firewall to allow clients to establish secure VPN connections to it. Now we’ll look at what needs to be done to get the clients to actually connect.

Specifically, we’ll create an Apple configuration profile that we can deliver to devices that we want to use as VPN clients.

We’ll start by getting the necessary certificates.

CA and Server certificates

As usual with a PKI-based solution, we need to trust the Root certificate to trust any certificates signed by the Root. Then we need a copy of the Server certificate’s public key to be able to establish an encrypted connection to it from the client. The VPN host in this case already has the client’s public key since we generated the client key-pair locally on the host.

In System – Cert. manager, select the “CAs” tab. Next to the “mydomain VPN-root-CA [year-month]” certificate we created earlier, there’s a row of blue icons. We’re interested in the middle one that represents a round seal. Press it, and your browser will download a .crt file; named something akin to “mydomain+VPN-root-CA+[year-month].crt

Then select the “Certificates” tab and do the same for the server certificate we created earlier. You will now have an additional file called “mydomain+VPN-server+[year-month].crt” in your Downloads directory.

Now for the only bit of shell magic we’ll need to do:

Client certificate

In System – Cert. manager, select the “Certificates” tab. This time download both the certificate (represented by the round seal icon” and the private key (represented by a key icon). This will store “mydomain+VPN-client+[year-month.crt]” and “mydomain+VPN-client+[year-month].key” in your Downloads directory.

Open a Terminal and run the following two commands:

$ cd ~/Downloads
$ openssl pkcs12 -export \
-in mydomain+VPN-client+[year-month].crt \ 
-inkey mydomain+VPN-client+[year-month].key \
-out mydomain+VPN-client+[year-month].p12

You will be asked for an export passphrase. Generate a secure one and store it in your password manager along with the certificate files.

Create an Apple Configuration Profile

This step requires a Mac with Apple Configurator 2 installed.

Start the program and create a new profile. Store it as “[year-month]-mydomain.tld-VPN.mobileconfig

General

Name: mydomain.tld VPN
Identifier: [Reverse FQDN of the VPN gateway, e.g. “tld.mydomain.vpn”
[The rest of the fields are optional]

Certificates

Using the “+” button, add the Root CA certificate (“mydomain+VPN-root-CA+[year-month].crt“), the Server certificate (“”mydomain+VPN-server+[year-month].crt“), and the client certificate bundle we generated earlier (“mydomain+VPN-client+[year-month].p12“). When adding the latter, we also need to enter the export pass phrase.

VPN

Connection Name: mydomain.tld VPN
Connection Type: IKEv2
Always-on VPN: Unchecked
Server: [The Common Name from the Server certificate]
Remote Identifier: [The Common Name from the Server certificate]
Local Identifier: [The Common Name from the Client certificate]
Machine Authentication: Certificate
Certificate Type: RSA
Server Certificate Issuer Common Name: [The Common Name from the Root CA]
Server Certificate Common Name: [The Common Name from the Server certificate]
Enable EAP: Checked
Disconnect on Idle: Optional – I have it set to Never
EAP Authentication: Certificate
Identity Certificate: Select your Client certificate
Dead Peer Detection Rate: Medium
Disable redirects: Unchecked
Disable Mobility and Multihoming: Unchecked
Use IPv4 / IPv6 Internal Subnet Attributes: Unchecked
Enable perfect forward secrecy: Unchecked
Enable certificate revocation check: Unchecked
[Note: The following checkboxes may be changed depending on requirements, but that is outside the scope for this article]
Disable redirects: Unchecked
Disable Mobility and Multihoming: Unchecked
Use IPv4 / IPv6 Internal Subnet Attributes: Unchecked
Enable perfect forward secrecy: Unchecked
Enable certificate revocation check: Unchecked

Select the “IKE SA Params” tab and fill in the following:
First set the Integrity Algorithm to SHA2-384
Then set the Encryption Algorithm to AES-256-GCM
Diffie-Hellman Group: 20
Lifetime In Minutes: 720
Proxy Setup: [Optional]

Select the “Child SA Params” and fill in the following:
First set the Integrity Algorithm to SHA2-256
Then set the Encryption Algorithm to AES-256-GCM
Diffie-Hellman Group: 20
Lifetime In Minutes: 60
Proxy Setup: [Optional]

Save the .mobileconfig.

Using the profile

macOS

The profile can be installed on a Mac by double-clicking the file and entering administrative credentials to allow it to install. When installed, System Preferences – Network will contain a new “network device” called mydomain.tld VPN, with a padlock as an icon. It’s possible to start the VPN connection from here. It’s also possible to check the “Show VPN status in menu bar” checkbox, and manage the VPN by clicking the resulting icon.

iOS

The simplest way to install the profile on an iOS device is by mailing it and tapping the file from within Mail. After providing the device password to allow system changes, there will be a new “mydomain.tld VPN” profile in Settings – VPN. Select it and change Status to Connected.

Conclusion

We have enabled a simple and secure way to reach our home network and to reach the Internet via a known and trusted gateway from our Apple devices even when on the move.
With the proper client configuration, the same principles should be applicable to a client running any modern operating system.

IKEv2 IPsec VPN with pfSense and Apple devices

Part 1: pfSense configuration

For a long time I’ve been content running a simple SSH gateway into my network, since I was severely bandwidth-limited.

The connection was secured in a number of ways I consider a sort of best practice: no remote login for the root account, key based (as opposed to password based) logon, and a custom port which doesn’t add any security per se, but which let me avoid the most common hammering from Asian botnets looking for a way in.

However now that I have a good connection, I have some use for accessing more bandwidth-hungry services from home – or, for that matter, to redirect my Internet traffic via my home when surfing the web over insecure Internet connections.

Here’s the first part of a howto that works with pfSense 2.4, macOS High Sierra (10.13), and iOS 11:

Certificates

The first thing we need is a set of certificates to for mutual identification and encryption between the clients and the VPN endpoint. We’ll start the process on the pfSense box:

CA Certificate

In SystemCert. manager, choose the “CAs” tab and Add a CA certificate.

Descriptive name: mydomain VPN-root-CA [year-month]
Method: Create an internal Certificate Authority
Key length: 2048
Digest algorithm: SHA256
Lifetime (Days): 3650
[Fill in everything down to but not including Common Name]
Common Name: mydomain.tld-vpnrootca

Save the certificate.

Server certificate

In System – Cert. manager, choose the “Certificates” tab and Add/Sign a Server certificate.

Method: Create an internal Certificate
Descriptive name: mydomain VPN-server [year-month]
Certificate Authority: mydomain.tld-vpnrootca
Key length: 2048
Digest algorithm: SHA256
Lifetime (Days): 3650
[….]
Common name: [FQDN of VPN gateway]
Certificate type: Server certificate
Alternative names: Type: FQDN or Hostname Value: [FQDN of VPN gateway]

NB! Do not forget to add an Alternative name even if it’s identical to the Common name!

Save the certificate.

Client certificate

In System – Cert. manager, choose the “Certificates” tab and Add/Sign a User certificate.

Method: Create an internal Certificate
Descriptive name: mydomain VPN-client [year-month]
Certificate Authority: mydomain.tld-vpnrootca
Key length: 2048
Digest algorithm: SHA256
Lifetime (Days): 3650
[….]
Common name: vpnclient.mydomain.tld
Certificate type: User certificate
Alternative names: Type: FQDN or Hostname Value: vpnclient.mydomain.tld

NB! Do not forget to add an Alternative name even if it’s identical to the Common name!

Save the certificate.

VPN configuration

Mobile Client settings

In VPN – IPsec, choose the “Mobile clients” tab and fill in the following values:

IKE Extensions: Enable IPsec mobile client support – checked
User Authentication: Source: Local Database
Group Authentication: Source: system
Virtual Address Pool: Provide a virtual IP address to clients – checked
Network configuration for Virtual Address Pool: 10.200.250.0/24
Provide a virtual IPv6 address to clients: Unchecked
Provide a list of accessible networks to clients:
Unchecked
Allow clients to save Xauth passwords (Cisco VPN client only).: Unchecked
Provide a default domain name to clients: Checked
Specify domain as DNS Default Domain: mydomain.tld
Provide a list of split DNS domain names to clients.: Unchecked
Provide a DNS server list to clients: Checked
[Fill in your DNS servers]
Provide a WINS server list to clients: Unchecked
Provide the Phase2 PFS group to clients: Unchecked
Provide a login banner to clients: Unchecked

Save the settings.

Phase 1 settings

In VPN – IPsec, choose the “Tunnels” tab and Add P1.

Disabled: Unchecked
Key Exchange version: IKEv2
Internet Protocol: IPv4
Interface: WAN
Description: IKEv2 Phase 1
Authentication Method: EAP-TLS
My identifier: Distinguished Name; [Common Name of your Server certificate]
Peer identifier: Any
My Certificate: [Descriptive Name of your Server certificate]
Peer Certificate Authority: [Descriptive Name of your CA certificate]
Encryption Algorithm: AES256-GCM
Key length: 128 bits
Hash: SHA384
DH Group: 20 (nist ecp384)
Lifetime (Seconds)28800
Disable rekey: Unchecked
Margintime (Seconds): 20
Disable Reauth: Unchecked
Responder Only: Checked
MOBIKE: Enable
Split connections: Unchecked
Dead Peer Detection: Checked
Delay: 10
Max failures: 5

Save the settings.

Phase 2 settings

In VPN – IPsec, choose the “Tunnels” tab, Show Phase 2 Entries, and Add P2.

Disabled: Unchecked
Mode: Tunnel IPv4
Local Network: Type: Network
Address: 0.0.0.0/0
NAT/BINAT translation: None
Description: IKEv2 Phase 2
Protocol: ESP
Encryption Algorithms: Check AES256-GCM/128 bits only
Hash Algorithms: Check SHA256 only
PFS key group: 20 (nist ecp384)
Lifetime: 3600
Automatically ping host: [empty]

Save the settings.

Firewall settings

In Firewall – Rules, choose the “IPsec” tab and Add a rule. In this case we’re not interested in limiting traffic, so it will be an “allow all” type rule:

Action: Pass
Disabled: Unchecked
Interface: IPsec
Address Family: IPv4
Protocol: Any
Source: Any
Destination: Any
Log: Unchecked
Description: Allow all VPN traffic to anywhere.

Save the firewall rule.

This is it for the firewall configuration. In the next part (Part 2) we’ll export the certificates and set up an Apple Configurator config for iOS and macOS devices.

SMS notifications from Zabbix

To increase visibility of critical alerts, I’ve set up Zabbix to send messages to the phones of certain technicians via a GSM modem. This post documents what I had to do to get things running.

  • Zabbix server
  • MOXA ethernet-to-serial gateway
  • Siemens GSM modem

The first thing to do is to get the driver for the MOXA gateway running. The documentation is alrightish and covers some of the most used Linux distributions. Dissappointingly, after running through the checklist in the README file, the MOXA gateway will only work until after the next reboot.

The problem here, is that the init script for the MOXA driver removes and then recreates the device nodes on startup, and the scripts being called (mxmknod and mxrmnod) contain references to current dir (./) rather than their absolute path (/usr/lib/npreal2/driver). Edit those scripts, and you’ll have a working driver even after the Zabbix server reboots.

It should now be possible to test the modem by running, for example, socat - /dev/ttyr00 and entering some AT commands like in the olden days.

The next step is to install the gsm-utils package, which will create the possibility to easily send messages. Without any other options set, the gsmsendsms command expects there to be a device called /dev/mobilephone. We can create that one by making a symlink with that name to /dev/ttyr00, or we can tell the command which device to use with the -d argument.

Try sending a message: gsmsmssend -d /dev/ttyr00 <phone_number> "Hello World!"

This does not immediately solve how to send messages from Zabbix, though. Recent versions of Zabbix do have an SMS media type that can be called on, but it doesn’t seem to work when not speaking directly to one of the two modems on the compatibility list. Fortunately it’s very easy to create a new media type:

To prepare for this, create the folder /etc/zabbix/alertscripts, then modify the AlertScriptsPath definition in /etc/zabbix/zabbix-sever.conf to point at this directory, and restart the Zabbix Server service.

Now create the script /etc/zabbix/alertscripts/sendsms with the following contents:

#!/bin/bash
serial=`date +%s | cut -c 6-`
/usr/bin/gsmsendsms -d /dev/ttyr00 -b 115200 -c $serial $1 "$2"

The -c $serial argument is used to create a unique ID when serializing messages longer than 160 characters, as per the SMS standard. $1 will be populated with the contact phone number, and $2 will be populated by the actual message from Zabbix.

Finally let’s create our Media Type definition in Zabbix:

Under Administration -> Media Types -> Create Media Type define the following:

Name: SMS via MOXA
Type: Script
Script Name: sendsms
Script Parameters:
{ALERT.SENDTO}
{ALERT.MESSAGE}

To be able to alert users, Zabbix needs to know how to contact them using this new media type:

Administration -> Users -> <username> -> Media -> Add

Type: SMS via Moxa
Send to: <phonenumber>
When active: schedule during which to receive alerts
Use if severity: Severity levels to send via SMS
Enabled:

And finally you need to create Actions that define when to send messages via this solution, under Configuration -> Actions.

Voilà: Zabbix just got a bit noisier, in a good way.