IPv6 guests in KVM

I’ve been experimenting with IPv6 at home, and spent some time trying to get it working in my virtual machines.

The first symptom I got was that VMs got a “Network unreachable” error when trying to ping6 anything but their own address. The cause was a complete brainfart on my side: the guests were missing a loopback interface definition for IPv6 in /etc/network/interfaces:

auto lo
iface lo inet loopback
iface lo inet6 loopback

The second problem took a bit more digging to understand: I would get an IPv6 address, and I could ping stuff both on my own network and on the Internet from the VM, but no other computers could reach the virtual machine over IPv6.

According to this discussion, QEMU/KVM has support for multicast (required for proper IPv6 functioning), but it’s turned off by default. Remedy this by running virsh edit [vm-name] and adding trustGuestRxFilters='yes' to the appropriate network interface definition:

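Roughly, the interface element in the domain XML ends up looking something like this (a sketch only; the MAC address, bridge name and model are placeholders for whatever the VM already uses):

<interface type='bridge' trustGuestRxFilters='yes'>
  <mac address='52:54:00:aa:bb:cc'/>
  <source bridge='br0'/>
  <model type='virtio'/>
</interface>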

As usual, when you understand the problem the solution is simple.

Frustrations in Ubuntu 18.04

My first frustration with Ubuntu 18.04 came yesterday. I created a template VM with my basic toolkit that any machine in my network should have. I then deployed the VM and asked vSphere to set the hostname to the value of the VM name. Strangely, this didn’t happen: The new machine booted up alright, but its name remained that of the template.

Remember the old way to manually change the name of a machine in Linux? It went something like this:

  1. Add the new hostname to your /etc/hosts so sudo doesn’t get confused.
  2. Replace the old hostname in /etc/hostname with the new one.
  3. Reboot the computer or restart all affected services.

The new way goes like this:

  1. Add the new hostname to your /etc/hosts so sudo doesn’t get confused.
  2. Replace the old hostname in /etc/hostname with the new one.
  3. Reboot the computer.
  4. Notice that the hostname is the same as it was before you attempted to change it.
  5. Web search “change hostname ubuntu 18.04”.
  6. Discover that there’s a new utility, hostnamectl, which has a command, set-hostname, that takes the new hostname as an argument.
  7. Run hostnamectl set-hostname [newname]
  8. Run hostnamectl without any arguments to confirm that “Static hostname” has the correct value.
  9. Log off and back on again and be happy that everything seems to be working.
  10. Reboot the computer after doing some changes.
  11. Notice that the hostname is back to what it was.
  12. Run hostnamectl set-hostname [newname] again, and check /etc/hostname just to see that it actually did change the file to contain the new hostname.
  13. Check in /etc/hosts and see that the new name appears there too.
  14. Scour the web some more for additional information.
  15. Find some mention of cloud-init.
  16. Read up on it and see the point of it – but also that it doesn’t apply to my current environment.
  17. Run sudo apt remove cloud-init
  18. Reboot the server and see that it works as expected again.
  19. (In the future: Learn more about cloud-init and re-evaluate whether it should be implemented in my environment as a complement to Ansible; a possible middle ground is sketched below).
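
If cloud-init is kept, its preserve_hostname setting should stop it from resetting the hostname on every boot. A minimal sketch, assuming a drop-in file name of my own choosing:

# /etc/cloud/cloud.cfg.d/99-preserve-hostname.cfg (hypothetical file name)
# Stop cloud-init from resetting the hostname on every boot
preserve_hostname: true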

Transport security with Postfix

I had a “Face: Meet Palm” moment today, and as usual when that happens, I learned something new:

What happened was that I noticed that mail from a Postfix server I use for sending mail from a couple of domains was marked with the red “no encryption” label rather than the expected grey “standard encryption” icon when I looked at the message details in Gmail. I was sure that I had set the server to use what they call “opportunistic TLS”; that is: Attempt to use TLS but fall back to no encryption if that’s unavailable.

Reading the Postfix documentation, however, I saw the problem: there are two sets of TLS rules in the main.cf configuration file: those starting with “smtpd_”, which deal with how the server responds to its clients, and those starting with “smtp_”, which deal with how Postfix acts when working in client mode towards other servers.

So now I have the following two lines in my /etc/postfix/main.cf:

smtp_tls_security_level = may
smtpd_tls_security_level = may
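
To double-check which values are actually in effect, postconf can print them back (output abridged to the two relevant parameters):

# postconf smtp_tls_security_level smtpd_tls_security_level
smtp_tls_security_level = may
smtpd_tls_security_level = may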

Resizing the system volume on a Linux VM

Background

With LVM, the preferred way of adding storage space to a computer running a Linux-based operating system seems to be to add disks, judging by my search results. Naturally, this is a great way of minimizing disruption in a physical machine, but what if you’re running your machines virtually? Adding virtual disks tends to get messy after a while, and hypervisors allow you to simply grow the vdisk, so why not do that?

Problem is, the old way I used to do it (using partprobe after growing the partition) required a system reboot to see the entire available new space if I attempted it on the system volume. Documented below is a better way.

The process

Start by confirming the current disk size so we know our baseline.

# fdisk -l

Disk /dev/sda: 26.8 GB, 26843545600 bytes, 52428800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000ba3e8

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     2099199     1048576   83  Linux
/dev/sda2         2099200    52428799    25164800   8e  Linux LVM

Disk /dev/mapper/ol-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

OK, so we have slightly less than 27 GB of disk space. Let’s grow the disk image in the hypervisor, and then re-scan the device.

# ls /sys/class/scsi_device/
1:0:0:0 2:0:0:0
# echo 1 > /sys/class/scsi_device/1\:0\:0\:0/device/rescan
# fdisk -l

Disk /dev/sda: 80.5 GB, 80530636800 bytes, 157286400 sectors
(...)

Now that we have the disk space available, let’s perform the steps to grow our file system.

# fdisk /dev/sda


Command (m for help): p

Disk /dev/sda: 80.5 GB, 80530636800 bytes, 157286400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000ba3e8

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     2099199     1048576   83  Linux
/dev/sda2         2099200    52428799    25164800   8e  Linux LVM

Command (m for help): d
Partition number (1,2, default 2): 
Partition 2 is deleted

Command (m for help): n
Partition type:
 p primary (1 primary, 0 extended, 3 free)
 e extended
Select (default p): 
Using default response p
Partition number (2-4, default 2): 
First sector (2099200-157286399, default 2099200): 
Using default value 2099200
Last sector, +sectors or +size{K,M,G} (2099200-157286399, default 157286399): 
Using default value 157286399
Partition 2 of type Linux and of size 74 GiB is set

Command (m for help): t
Partition number (1,2, default 2): 
Hex code (type L to list all codes): 8e
Changed type of partition 'Linux' to 'Linux LVM'

Command (m for help): w
The partition table has been altered!

The above statement is followed by what used to be a problem:

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.

Partprobe won’t help us here, and kpartx for some reason doesn’t consistently catch the entire new disk size. The correct way, then, is the following:

# partx -u /dev/sda2

The result?

# partx -s /dev/sda
NR   START       END   SECTORS SIZE NAME UUID
 1    2048   2099199   2097152   1G
 2 2099200 157286399 155187200  74G

Now let’s finish extending everything up to the actual file system:

# pvresize /dev/sda2
 Physical volume "/dev/sda2" changed
 1 physical volume(s) resized / 0 physical volume(s) not resized
# lvextend -l 100%VG /dev/mapper/ol-root
 Size of logical volume ol/root changed from <17.00 GiB (4351 extents) to <72.00 GiB (18431 extents).
 Logical volume ol/root successfully resized.
# xfs_growfs /dev/mapper/ol-root
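
As a side note, had the root file system been ext4 rather than XFS, resize2fs would play the same role here (same logical volume, different tool):

# resize2fs /dev/mapper/ol-root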

And finally let’s check that everything worked out as we expected:

# df -h
Filesystem           Size  Used Avail Use% Mounted on
(...)
/dev/mapper/ol-root   72G   17G   56G  24% /
(...)

Conclusion

The Windows family of operating systems has had the ability to grow any volume on the fly since Server 2008. I couldn’t imagine that Linux would lack this ability, but I didn’t know how to do it the right way. Now I do.

Test whether a git pull is needed from within a batch script

Just a quick hack I did to avoid having to sync a couple of scripts unnecessarily when deploying my load balancers. Underlying idea stolen from a post by Neil Mayhew on Stack Overflow.

Shell session script:

#!/bin/bash
UPSTREAM=${1:-'@{u}'}
LOCAL=$(git rev-parse @)
REMOTE=$(git rev-parse "$UPSTREAM")
BASE=$(git merge-base @ "$UPSTREAM")
if [ "$LOCAL" = "$REMOTE" ]; then
    GIT_STATUS=nochange
elif [ "$LOCAL" = "$BASE" ]; then
    GIT_STATUS=changed
    git pull
fi
# Export so that commands run later in the same session (like ansible-playbook) can read it
export GIT_STATUS

Ansible playbook:

---
    vars:
        version_status: "{{ lookup('env', 'GIT_STATUS') }}"

    tasks:
    -   name: Update HAProxy scripts
        copy:
            src: "{{ config_root }}/etc/haproxy/scripts"
            dest: "/etc/haproxy"
        when: version_status == "changed"
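
The two pieces fit together by running the playbook from the same shell session, so it sees GIT_STATUS in its environment; roughly like this, with made-up file names:

$ . ./check-git.sh && ansible-playbook -i inventory update-haproxy-scripts.yml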

Environment variables for services in a systemd-based system

My current config deployment automation project has required me to set up a dev/staging environment for my load balancers, since I don’t want to break stuff by deploying untested configurations.
This environment is functionally identical to a single load balancer and can be used along with a hosts file on a client to not only develop configurations and make sure they are syntactically correct, but also to fully test the functionality of the load balancer rules.

As part of this, I naturally need to change the listener addresses in the dev/staging HAProxy environment compared to the production environment that I keep version controlled in my git repository.
My first instinct was to use a script to pull the latest versions, modify the necessary lines using sed, and copy the config to the correct location. I didn’t really like this concept, since it would by definition mean that the configs in the dev/staging environment were no longer identical to those in the production environment and the git repository.
If I used environment variables, the version controlled configuration could be kept fully identical across all instances.
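
As an illustration, a listener in haproxy.cfg can pick up its bind address from an environment variable; LB_LISTEN_ADDR and the other names below are placeholders rather than lines from my actual config:

frontend fe_www
    # The address comes from the service's environment, so the same file works in production and staging
    bind "${LB_LISTEN_ADDR}":443 ssl crt /etc/haproxy/certs/site.pem
    default_backend bk_www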

The first mistake I made took me a while to grasp: HAProxy parsed what was obviously a fully valid configuration, but intermittently presented a certificate I didn’t expect, and web services intermittently failed to reply to requests.
It turns out Linux daemons don’t inherit even system-wide environment variables.
So how do we check what environment variables a service does see?
First get the PID(s) of the service:

$ pgrep haproxy
1517
1521
1523
$ 

In the case of my HAProxy installation, I got a list of three processes, so I chose the last one and checked out its environment:

# cat /proc/1523/environ

This results in a list of its current environment variables, and naturally the ones I thought I’d added were nowhere to be seen.
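
Since environ is NUL-separated, it is a bit easier on the eyes when run through tr:

# tr '\0' '\n' < /proc/1523/environ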

So why did HAProxy start without complaining? Since the environment variables weren’t defined in this context, they simply expanded to empty strings, and so HAProxy figured I wanted to listen on the assigned ports on all interfaces.

How do we assign environment variables to a service in a modern, systemd-based Linux, then?
On a command-prompt, run systemctl edit servicename. This starts your default editor. A valid config looks like this:

[Service]
Environment=ENVVAR1=value
Environment=ENVVAR2=value

On Ubuntu, this file is stored in /etc/systemd/system/servicename.service.d/override.conf, but naturally the file can be renamed to something more descriptive. The systemctl edit command doesn’t do anything magical; it’s just a shortcut.
After the file is in place, run systemctl daemon-reload to make the configuration active. The HAProxy service then needs to be restarted, not just reloaded, for the changes to apply.

Of course, I want this config too to be deployable through Ansible.
The relevant lines from my playbook:

---
    -   name: Update environment variables for HAProxy service
        copy:
            src: "{{ config_root }}/etc/systemd/system/haproxy.service.d/10-listeners.conf"
            dest: "/etc/systemd/system/haproxy.service.d/"
        register: ha_envvar_status
    
    -   name: Reload systemd service configuration
        systemd:
            daemon_reload: yes
        when: ha_envvar_status|changed
...
    -   name: Reload HAProxy configuration
        service:
            name: haproxy
            state: reloaded
        when: ha_envvar_status|skipped and haproxy_cfg_status|changed
        register: reloaded_haproxy

    -   name: Restart HAProxy daemon
        service:
            name: haproxy
            state: restarted
        when: ha_envvar_status|changed or (haproxy_cfg_status|changed and not reloaded_haproxy|skipped)

Key lines:
The register line stores the result of the task in a variable. Thanks to that, I can use the when keyword to only reload or restart services when something has actually changed.

Summary

Linux daemons don’t automatically inherit environment variables.
In systemd-based distros (which today means pretty much anyone with corporate backing), environment variables can be added using the Environment keyword in the Service section of a file in /etc/systemd/system/servicename.service.d/.

Continuous Deployment of Load Balancer Configurations

I thought I’d describe some optimizations I’ve made to my load balancers at work, both for the good of the older me, and in case someone would benefit from some of my ideas.

Background

The load balancers are based on four software packages that integrate to create a powerful whole:
Keepalive Daemon provides a common set of virtual IP addresses and ensures that failover happens to a Backup server if the Master would cease responding.
HAProxy does most of the actual load balancing and mangles network traffic when required.
SNMPD throws SNMP trap events from keepalived whenever a failover occurs.
The Zabbix Agent enumerates current configuration and system state for detailed system monitoring.

Now, all of these components get the occasional configuration change, except for HAProxy, which pretty much sees changes on at least a weekly basis.
The procedure for updating the configuration must cover the following steps:

  1. Run a pre-check to confirm that both load balancers in the pair work; we don’t want to initiate an automated update that could kill off service availability completely.
    On the Backup load balancer node:
  2. Backup the current configuration.
  3. Deploy the new configuration.
  4. Reload services.
  5. Run a post-op check on the secondary node to confirm that the new config hasn’t broken anything important.
  6. Fail over operations from the Master load balancer node to the Backup node and repeat steps 2-5 on the Master node.
  7. Perform a final check on the load balanced services to confirm functionality hasn’t been lost.

From experience, this procedure is tedious to say the least. In addition there’s always the risk of introducing a change to an active load balancer and forgetting to deploy the same change to the backup one; something that may not become obvious until after the next major configuration update when the last change disappears and functionality breaks.

These are just the most obvious arguments for an automated and version controlled deployment procedure. So how do we go about that?

Version control

In my case, I use Git connected to a GitLab server for version control, and Ansible for automation.

Configuration changes are prepared in a development environment, from which the relevant files are committed to a git repository.

Other components in the load balancer config, such as Lua scripts or tools made by our developers, are stored in other repositories and can be pulled by git before a new deployment.

Ansible structure

For each load balancer pair, I’ve built a directory structure containing a playbook directory for the Ansible YAML scripts, and a filesystem directory that mirrors the movable parts of the load balancer, where the relevant parts exist in the etc directory tree.

Automation

Deployment is initialized by a shell script that git-pulls the latest versions of dependencies we have and then ensures that the Ansible playbooks can work on remote computers by wrapping them in an ssh-agent environment.
The execution of Ansible playbooks happens from within a session script called by the ssh-agent.
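
In rough outline, and with hypothetical file names, the wrapper looks something like this:

#!/bin/bash
# deploy.sh (sketch): refresh dependencies, then run the playbook session
# under a throw-away ssh-agent so key handling stays out of the playbooks
git pull
ssh-agent ./ansible-session.sh

where ansible-session.sh would ssh-add the deployment key and then call ansible-playbook.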

Ansible-specific tips

The key to ensuring that the production environment doesn’t break lies in the header of the playbook:

---

-   name: Update PRODUCTION load balancer configuration
    hosts: lb_hadmzprod
    serial: true
    any_errors_fatal: true

The serial keyword makes the script work on one server at a time rather than executing in parallel.
The any_errors_fatal parameter is combined with relevant service checks interspersed among the deployment tasks to ensure that the script fails fast and loudly if a backend web service stops responding while deployment is underway, so that we don’t break both servers in a pair. Note that this requires some thought on the part of the person running the scripts, so they fix the problem before re-attempting to run the script, or fecal matter will hit the fan quickly enough.

The most basic of tests just ensures I can reach the statistics page of my load balancer:

    -   name: Fail task if lb1 is unavailable
        uri: 
            url: https://lb1.domain.com:1936

A typical file copying task:

    -   name: Update Keepalived configuration
        copy:
            src: "{{ config_root }}/etc/keepalived/{{ item }}"
            dest: "/etc/keepalived/"
            mode: 0600
        with_items:
        -   keepalived-master.conf
        -   keepalived-slave.conf

As a side note: Since I don’t want the script to have to care about which server is which, I’ve created one config file for the keepalived master instance and one for the slave. On the actual servers, a symlink points to the correct configuration for the instance.

Reloading the HAProxy service, rather than restarting it, means existing sessions are not lost even though the configuration gets updated. As a bonus, in the Ansible service module, the reloaded state request also starts the service if it wasn’t started before.

    -   name: Reload HAProxy configuration
        service:
            name: haproxy
            state: reloaded

With way less than a day’s worth of work, a workflow has been introduced for the deployment process that is repeatable and that mitigates some of the risks involved in letting humans tamper with production systems.

Load Balancing Exchange 2016 behind HAProxy

I recently started the upgrade to Exchange 2016 at work. A huge benefit over Exchange 2010 is that REST-based client connections are truly stateless. In effect this means that if a server goes down, clients shouldn’t really notice any issues as long as something redirects them to a working server. In my system, this something is HAProxy.

The guys at HAProxy have their own excellent walkthroughs for setting up their load balancer for Exchange 2013, which can pretty much be lifted verbatim to Exchange 2016, but I want to add a few key points to think about:

Service health checks

Each Exchange web service has a virtual file, HealthCheck.htm, that reports the state of that service. Let HAProxy use the contents of this file for the server health check. That way it’ll know to redirect clients if one of the services is down, even though the Exchange server in question may still be listening on port 443.

Example config:

    option httpchk GET /owa/HealthCheck.htm
    http-check expect string 200\ OK
    server Exchange1 192.168.1.13:443 maxconn 10000 ssl ca-file cacert.pem weight 20 check

This example shows a test of the Outlook Web Access service state. Naturally the config can be set to test each of the REST services each Exchange server presents.
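
As a hypothetical illustration, a backend for EWS can check its own health file in exactly the same way (the backend name and server line are placeholders mirroring the example above):

    backend bk_exchange_ews
        option httpchk GET /EWS/HealthCheck.htm
        http-check expect string 200\ OK
        server Exchange1 192.168.1.13:443 maxconn 10000 ssl ca-file cacert.pem weight 20 check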

Exchange server default firewall rules

Our design puts the load balancer in a DMZ outside of our server networks. Clients connecting through the load balancer will be dropped by Windows firewall rules generated by Exchange; specifically the edge traversal rules for the POP3 and IMAP protocols. Make sure you allow edge traversal for these protocols, letting the network firewall take care of limiting external client connections to them. Also note that there are multiple firewall rules for IMAP and POP3 traffic; only the ones concerned with client traffic are relevant for this change. There’s no point in kicking open holes in your firewall for no good reason.

Exchange and Outlook suck at IMAP

We use IMAP for an internal order management system. Outlook and Exchange aren’t the best tools for this protocol, but unfortunately we have to live with them due to sins committed long ago. I spent quite some time troubleshooting our IMAP connections:
No matter how I configured Outlook I couldn’t get it to open an IMAP connection to the Exchange servers. Error messages varied depending on the client settings, but essentially I couldn’t log on, couldn’t establish a secure connection, or couldn’t synchronize my folders.

I would get the regular banner when telnetting from the client machine, so I knew traffic was getting through all the way from Exchange via the load balancer.
Mozilla Thunderbird could connect perfectly well and sync accounts, both using STARTTLS on port 143 and over a TLS encrypted connection on port 993. After mulling it over, I turned on debug logging in Outlook and quickly saw that the client was trying and failing to perform an NTLM logon to Exchange. Using the error messages as search terms, I found others who had experienced the same issue. Their solution had been to turn off NTLM authentication for the IMAP protocol on the Exchange server. This seems to be a regression in Exchange Server 2016 from an earlier bug in Exchange 2013.
The command in the Exchange Management Shell:

Set-IMAPSettings -EnableGSSAPIAndNTLMAuth $false

After this, Outlook is still incapable of logging on using TLS over port 993, but at least it consistently manages to run STARTTLS over port 143, which is good enough for my use case.

All in all, the most complicated part here wasn’t to make HAProxy do its magic, but to get Exchange and Outlook do what they should.

SFTP revelations

I got myself into a situation where I had to copy some files from my computer to a server that presented sftp but not scp. Since I’ve never needed to use the sftp protocol from a CLI-only machine, I haven’t really thought about how it works in non-interactive mode. Batch mode allows you to create a batch file of sftp commands to execute on the server, but what if you just want to do a single operation?

Pipes to the rescue:

$ echo put filename.tgz | sftp -i private.key -b - username@hostname.domain.com

Putting a dash after the -b option causes the command to take batch input from stdin. Piping text to the command, then, means that text is swallowed by the sftp client. Nice and simple.
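
The same trick scales to a few commands without creating a batch file; a here-document works just as well as a pipe (file names made up for the example):

$ sftp -i private.key -b - username@hostname.domain.com <<'EOF'
put filename.tgz
put anotherfile.tgz
EOF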

Securing an Internet accessible server – Part 3

This post is part of a series. Part 1, Part 2.

In the last part I briefly mentioned load balancers and proxies. After thinking about it for a while, I realized I see no reason not to run one, since it simplifies things a bit when setting up secure web services. In this part, we will be setting up a HAProxy server which won’t actually load balance anything, but which will act as a kind of extensible gatekeeper for our web services. In addition, the HAProxy instance will act as the TLS termination point for secure traffic between clients on the Internet and services hosted on our server(s).

This article is written from the perspective of running HAProxy on a separate virtual machine. That’s just for my own convenience, though. If you’re running pfSense for a firewall, you already have HAProxy as a module. It is also possible to run HAProxy directly on your web server, just logically putting it in front of whatever web server software you’re running.

Let’s get started. This post will be a rather long one.
