IPv6 guests in KVM

I’ve been experimenting with IPv6 at home, and spent some time trying to get it working in my virtual machines.

The first symptom was that VMs got a “Network unreachable” error when trying to ping6 anything but their own address. The cause was a complete brainfart on my side: we need a loopback interface network definition for IPv6 in /etc/network/interfaces, along these lines (the inet6 line is the part I had missed):
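
    auto lo
    iface lo inet loopback
    iface lo inet6 loopback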

The second problem took a bit more digging to understand: I would get an IPv6 address, and I could ping stuff both on my own network and on the Internet from the VM, but no other computers could reach the virtual machine over IPv6.

According to this discussion, QEMU/KVM has support for multicast (required for proper IPv6 functioning), but it’s turned off by default. Remedy this by running virsh edit [vm-name] and adding trustGuestRxFilters='yes' to the appropriate network interface definition, which should end up looking something like this (the bridge name and MAC address are examples; yours will differ):
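
    <interface type='bridge' trustGuestRxFilters='yes'>
      <mac address='52:54:00:aa:bb:cc'/>
      <source bridge='br0'/>
      <model type='virtio'/>
    </interface>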

As usual, when you understand the problem the solution is simple.

Frustrations in Ubuntu 18.04

My first frustration with Ubuntu 18.04 came yesterday. I created a template VM with my basic toolkit that any machine in my network should have. I then deployed the VM and asked vSphere to set the hostname to the value of the VM name. Strangely, this didn’t happen: The new machine booted up alright, but its name remained that of the template.

Remember the old way to manually change the name of a machine in Linux? It went something like this:

  1. Add the new hostname to your /etc/hosts so sudo doesn’t get confused.
  2. Replace the old hostname in /etc/hostname with the new one.
  3. Reboot the computer or restart all affected services.

The new way goes like this:

  1. Add the new hostname to your /etc/hosts so sudo doesn’t get confused.
    [Image: The Doctor, exasperated. My initial reaction to Ubuntu 18.04 host name management.]
  2. Replace the old hostname in /etc/hostname with the new one.
  3. Reboot the computer.
  4. Notice that the hostname is the same as it was before you attempted to change it.
  5. Web search “change hostname ubuntu 18.04”.
  6. Discover that there’s a new utility, hostnamectl, which has a command, set-hostname, that takes the new hostname as an argument.
  7. Run hostnamectl set-hostname [newname]
  8. Run hostnamectl without any arguments to confirm that “Static hostname” has the correct value.
  9. Log off and back on again and be happy that everything seems to be working.
  10. Reboot the computer after doing some changes.
  11. Notice that the hostname is back to what it was.
  12. Run hostnamectl set-hostname [newname] again, and check /etc/hostname just to see that it actually did change the file to contain the new hostname.
  13. Check in /etc/hosts and see that the new name appears there too.
  14. Scour the web some more for additional information.
  15. Find some mention of cloud-init.
  16. Read up on it and see the point of it – but also that it doesn’t apply to my current environment.
  17. Run sudo apt remove cloud-init (a gentler alternative is sketched after this list).
  18. Reboot the server and see that it works as expected again.
  19. (In the future: Learn more about cloud-init and re-evaluate whether it should be implemented in my environment as a complement to Ansible).
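
For reference, the gentler alternative: instead of removing cloud-init outright, it can be told to leave the hostname alone. A minimal sketch, assuming a stock /etc/cloud/cloud.cfg:

    # /etc/cloud/cloud.cfg
    # Stop cloud-init from (re)setting the hostname on every boot
    preserve_hostname: true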


DNS/DHCP issues in modern Windows versions

Static IP addresses are a solid way to configure machines if you have few enough of them to manage manually. But the more you want to be able to change things on the fly, the more limiting such a configuration scheme becomes.

Unfortunately I’ve had severe problems getting servers with DHCP leases (or even DHCP reservations) to have their names stick in DNS over time. Suddenly, after a reboot, a machine would seemingly drop off the network even though it had the same IP address as before. Another reboot, or manually re-registering its DNS record, would fix it for the moment, but that was no acceptable solution to the underlying issue.

I found a discussion that gave a few pointers on how to get these things working in Windows, and I’ve shamelessly ripped the relevant points to present them here:

Step one: Define a user in whose context the DHCP server will run

Simply add a domain user with no special rights, and give it a properly strong password. Then open the DHCP management console, right-click the protocol you want to change (IPv4 or IPv6), and select Properties and the Advanced tab. Click Credentials and enter the relevant information for the account.

Step two: Tell DHCP to always attempt to update DNS records

In the same properties window, select the DNS tab. Ensure the following choices are ticked:

  • Enable DNS dynamic updates (…) -> Always dynamically update DNS records
  • Dynamically update DNS records for DHCP clients that do not request updates

Step three: Ensure DHCP server AD group membership

The DHCP server(s) should exist in the group DnsUpdateProxy. No other user or computer accounts may exist in this group.
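
For those who prefer to script these three steps, something like the following PowerShell should be equivalent. This is a sketch, not a tested runbook: the server name, account name and domain are placeholders, and it assumes the DhcpServer and ActiveDirectory modules are available.

    # Step one: run the DHCP server's DNS updates as a dedicated account
    Set-DhcpServerDnsCredential -ComputerName "dhcp01" -Credential (Get-Credential "CONTOSO\svc-dhcp-dns")

    # Step two: always update DNS records, even for clients that don't request it
    Set-DhcpServerv4DnsSetting -ComputerName "dhcp01" -DynamicUpdates "Always" -UpdateDnsRRForOlderClients $true

    # Step three: add the DHCP server's computer account to DnsUpdateProxy
    Add-ADGroupMember -Identity "DnsUpdateProxy" -Members "DHCP01$"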

Other tips

Make sure DHCP leases are longer than 24 hours, or bad things are likely to happen. A concrete example given in the discussion is that Microsoft KMS servers have a 24-hour update cycle.


SMS notifications from Zabbix

To increase the visibility of critical alerts, I’ve set up Zabbix to send messages to the phones of certain technicians via a GSM modem. This post documents what I had to do to get things running with the following components:

  • Zabbix server
  • MOXA ethernet-to-serial gateway
  • Siemens GSM modem

The first thing to do is to get the driver for the MOXA gateway running. The documentation is alrightish and covers some of the most common Linux distributions. Disappointingly, after running through the checklist in the README file, the MOXA gateway will only work until the next reboot.

The problem here is that the init script for the MOXA driver removes and then recreates the device nodes on startup, and the scripts it calls (mxmknod and mxrmnod) refer to the current directory (./) rather than their absolute path (/usr/lib/npreal2/driver). Edit those scripts, and you’ll have a working driver even after the Zabbix server reboots.

It should now be possible to test the modem by running, for example, socat - /dev/ttyr00 and entering some AT commands like in the olden days.
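
A minimal smoke test could look like the session below (the port name follows the MOXA driver’s naming; OK is the modem’s standard Hayes response, and seeing it means the serial path works):

    $ socat - /dev/ttyr00
    AT
    OK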

The next step is to install the gsm-utils package, which makes it easy to send messages. Without any other options set, the gsmsendsms command expects there to be a device called /dev/mobilephone. We can create that by making a symlink with that name to /dev/ttyr00, or we can tell the command which device to use with the -d argument.

Try sending a message: gsmsendsms -d /dev/ttyr00 <phone_number> "Hello World!"

This does not immediately solve how to send messages from Zabbix, though. Recent versions of Zabbix do have a built-in SMS media type, but it doesn’t seem to work unless it’s speaking directly to one of the two modems on the compatibility list. Fortunately it’s very easy to create a new media type:

To prepare for this, create the folder /etc/zabbix/alertscripts, then modify the AlertScriptsPath definition in /etc/zabbix/zabbix_server.conf to point at this directory, and restart the Zabbix server service.

Now create the script /etc/zabbix/alertscripts/sendsms, and make it executable (chmod +x). Its contents should be along these lines (a sketch reconstructed from the argument description below):
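
    #!/bin/bash
    # $1 = recipient phone number, $2 = message text (both provided by Zabbix)
    # Use the current Unix time as a unique ID for concatenated messages
    serial=$(date +%s)
    gsmsendsms -d /dev/ttyr00 -c "$serial" "$1" "$2"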

The -c $serial argument is used to create a unique ID when serializing messages longer than 160 characters, as per the SMS standard. $1 will be populated with the contact phone number, and $2 will be populated by the actual message from Zabbix.

Finally let’s create our Media Type definition in Zabbix:

Under Administration -> Media Types -> Create Media Type, define the following (the name is arbitrary; in Zabbix 3.0 and later the script parameters are given explicitly and end up as $1 and $2 in the script):
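
    Name:               SMS via GSM modem
    Type:               Script
    Script name:        sendsms
    Script parameters:  {ALERT.SENDTO}
                        {ALERT.MESSAGE}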

To be able to alert users, Zabbix needs to know how to contact them using this new media type:

Administration -> Users -> <username> -> Media -> Add

And finally you need to create Actions that define when to send messages via this solution, under Configuration -> Actions.

Voilà: Zabbix just got a bit noisier, in a good way.

Transport security with Postfix

I had a “Face: Meet Palm” moment today, and as usual when that happens, I learned something new:

What happened was that I noticed that mail from a Postfix server I use for sending mail from a couple of domains was marked with the red “no encryption” label rather than the expected grey “standard encryption” icon when I looked at the message details in Gmail. I was sure that I had set the server to use what they call “opportunistic TLS”; that is: Attempt to use TLS but fall back to no encryption if that’s unavailable.

Reading the Postfix documentation, however, I saw the problem: there are two sets of TLS rules in the main.cf configuration file: those starting with “smtpd_”, which deal with how the server responds to its clients, and those starting with “smtp_”, which deal with how Postfix acts when working in client mode towards other servers.

So now I have the following two lines in my /etc/postfix/main.cf:
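
    # Opportunistic TLS towards our clients (server mode) and towards
    # other servers (client mode); "may" means try TLS, fall back to plain
    smtpd_tls_security_level = may
    smtp_tls_security_level = may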

Resizing the system volume on a Linux VM

Background

With LVM, the preferred way of adding storage space to a computer running a Linux-based operating system seems to be to add disks, judging by my search results. Naturally, this is a great way of minimizing disruption in a physical machine, but what if you’re running your machines virtually? Adding virtual disks tends to get messy after a while, and hypervisors allow you to simply grow the vdisk, so why not do that?

Problem is, the old way I used to do it (using partprobe after growing the partition) required a system reboot to see the entire available new space if I attempted it on the system volume. Documented below is a better way.

The process

Start by confirming the current disk size so we know our baseline.
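
For example (the device name and mount point here are assumptions; adjust them to your layout):

    # Check the mounted root file system and the underlying disk
    df -h /
    lsblk /dev/sda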

OK, so we have slightly less than 27 GB of disk space. Let’s grow the disk image in the hypervisor, and then re-scan the device.
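
A sketch of the re-scan, assuming the system disk is a SCSI device named sda:

    # Make the kernel re-read the size of the (now larger) disk
    echo 1 | sudo tee /sys/class/block/sda/device/rescan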

Now that we have the disk space available, let’s perform the steps to grow our file system, starting with the partition itself.
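
Sketched here with parted (assuming the LVM partition is number 2 on sda; fdisk works too, as long as the recreated partition starts at the same sector):

    # Grow partition 2 to use all available space on the disk
    sudo parted /dev/sda resizepart 2 100%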

This is the point where the old problem used to appear: the kernel would refuse to re-read the partition table of the system disk while it was in use, so the new partition size wasn’t visible.

Partprobe won’t help us here, and kpartx for some reason doesn’t consistently catch the entire new disk size. The correct way, then, is the following:
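
    # Update the kernel's view of the partition sizes even though the
    # disk is in use; assumes the system disk is sda
    sudo partx -u /dev/sda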

The result? The kernel now sees the whole new partition size, with no reboot required.

Now let’s finish extending everything up to the actual file system:
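
    # Grow the LVM physical volume to fill the partition
    sudo pvresize /dev/sda2
    # Grow the logical volume with all free space in the volume group
    # (volume group and LV names are placeholders; adjust to your system)
    sudo lvextend -l +100%FREE /dev/ubuntu-vg/root
    # Finally, grow the (assumed ext4) file system to fill the logical volume
    sudo resize2fs /dev/ubuntu-vg/root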

And finally let’s check that everything worked out as we expected:
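
    df -h /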

Conclusion

The Windows family of operating systems has had the ability to grow any volume on the fly since Server 2008. I couldn’t imagine that Linux would lack this ability, but I didn’t know how to do it the right way. Now I do.

When the French attack…

A consultant working with our Alcatel phone system encountered a weird issue that caused us some problems the other day. When attempting to install an Open Touch Media Server (used for receiving fax, for example), the entire vCenter client environment froze, and a reload of the page resulted in the following error message:

503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x0000…] _serverNamespace = / action = Allow _pipeName =/var/run/vmware/vpxd-webserver-pipe)

A lot of searching the web led me nowhere – there were a bunch of solutions, but none whose symptoms agreed with what I was experiencing; I had not changed the IP address of the vCenter Appliance, nor had I changed its name, and I did not have an issue with logs reporting conflicting USB device instances.

What I did have, though, was a new OpenTouch server on one of my ESXi hosts, which did not have a network assigned to its network interface, and this, apparently, is not a configuration that vCenter was written to take into consideration.

Logging on to the local web client on the specific ESXi host where the machine was running (after identifying that…), and selecting the machine in question, I got a warning message specifying the network problem, and a link to the Action menu. Simply selecting a valid network and saving the machine configuration was enough to allow me to ssh to the vCenter Appliance and start the vmware-vpxd service:

# service-control --start vmware-vpxd

We’ll just have to see how we proceed from here…

Manually removing ghost vVols from IBM SVC-based storage

As part of my evaluation of presenting vVols to vCenter from an IBM FlashSystem V9000, I decided to start from scratch after learning a bit about the benefits and limitations of the system. That is: I like vVols a lot, but I learned some things in my tests that I wanted to do differently in actual production.

Unfortunately, once I had migrated my VMs off the vVol datastores, I still couldn’t detach the relevant storage resources from the storage service in Spectrum Control Base. The error message was clear enough: I’m not allowed to remove a storage resource that still has vVols on it. My frustration stemmed from the fact that vCenter showed no VMs nor files on any of the vVol datastores, but I could clearly see them (labeled as “volume copies”) in the “Volumes by Pool” section of the SVC webUI on the V9000.

At least as of version 7.6.x of the SVC firmware, there is no way of manually removing vVols from the GUI, and as usual in such cases, we turn to the CLI:
I connected to the V9000 using ssh, taking care to log on as my VASA user. All virtual disks on the V9000 can be listed using the lsvdisk command. The first two columns are their ID and name, and any of these parameters can be fed to the rmvdisk command to manually remove a volume.
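
In other words, something like this, where the placeholder is the ID or name reported by lsvdisk (and heed the warning below before running rmvdisk):

    lsvdisk
    rmvdisk <vdisk_id_or_name>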

Just to be clear: the rmvdisk command DELETES stuff. Do not use it unless you really mean it! With that warning out of the way: once I had removed the volumes and waited a couple of minutes for the change to propagate to Spectrum Control Base, detaching storage resources from storage services was a breeze.

FTP server on IIS

I recently had cause to set up an FTP server for internal use on a Windows server, and I bumped into an issue that took me a little while to figure out, since I had never done it before:
I wanted to give a domain user account an isolated directory, and started out by creating the user directory inside the inetpub\ftproot folder. However, when I tried to log on with a client, it would give me the error message “530 User cannot log in, home directory inaccessible”.

Turns out the answer was simply to create a subdirectory to the ftproot with the name of the domain, and then move the user directory inside that one.
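
So, with user isolation, the directory layout for a domain user ends up looking something like this (domain CONTOSO and user jdoe are placeholders):

    C:\inetpub\ftproot\CONTOSO\jdoe\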

Siri on the Original Apple Watch (Series 0)

A few days ago I wrote about how I was gradually growing less satisfied with the responsiveness of my Apple Watch under the added bloat of newer systems, especially with the Hey Siri command, which, besides actually looking at the watch, has been my most commonly used interaction with it. Today I stumbled upon a setting that made a huge difference in Siri responsiveness: for some reason I held my phone in my hand as I summoned Siri to set a timer for me, and I realized that the phone rather than the watch responded to the command. I had activated Hey Siri on the phone a while ago out of curiosity and never turned it off.

This time I did turn off Hey Siri on the phone, and sure enough, the watch’s response to the command became almost instantaneous. It still takes a while for the watch to process the uttered command, but I once again trust that it will happen in most cases, which has returned my Apple Watch to its status of “significantly more useful than a regular watch”.

As to what caused the issue, my current hypothesis is that the faster CPU in the iPhone 6S recognized “Hey Siri” first, after which the watch chimed in and tried to claim the action since it was the active device at the time, and this whole negotiation process was what made Siri on the watch unbearably slow to use.