Restoring a really old domain controller from backups

I had an interesting experience this week, where I was faced with the need to restore an entire Active Directory environment from backups that were more than a year old.

The company whose servers I was restoring had been using an older version of Veeam Backup & Replication, which always simplifies matters a lot: the entire thing was delivered to me over sneakernet, on a 2.5″ USB drive containing several restore points for each machine.

The restore was uneventful, as expected, and most machines simply started up in their new home. Unfortunately, one of the Active Directory controllers would bluescreen on boot, with a C00002E2 error message.

After some reading up, I realized the machine had passed the Active Directory tombstone lifetime: as I wrote, the backups were taken over a year ago. Since I had one good domain controller, I figured I would simply cheat with the local time on the failing DC. It would boot successfully into Directory Services Recovery Mode, so I could set the local clock there. However, anybody with a bit of experience with VMware's virtualization products knows that by default the guest's system clock is synchronized with the ESXi host in a few situations, a reboot among them.
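A quick way to tell whether a backup has outlived the tombstone lifetime is to compare its age against that limit. The sketch below assumes GNU date and a 180-day tombstone lifetime, which is the usual default for forests created on Windows Server 2003 SP1 or later; older forests may use 60 days and the value can be changed, so treat the number as an assumption. The function names and the dates are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: is a backup older than the AD tombstone lifetime?
# Assumes GNU date (for -d) and a 180-day tombstone lifetime; check the
# actual value in your own forest before relying on it.

TOMBSTONE_DAYS="${TOMBSTONE_DAYS:-180}"

# backup_age_days BACKUP_DATE [REFERENCE_DATE]
# Prints the whole number of days between the two dates (UTC).
backup_age_days() {
    local backup_epoch ref_epoch
    backup_epoch=$(date -u -d "$1" +%s) || return 1
    ref_epoch=$(date -u -d "${2:-now}" +%s) || return 1
    echo $(( (ref_epoch - backup_epoch) / 86400 ))
}

# past_tombstone BACKUP_DATE [REFERENCE_DATE]
# Succeeds (exit 0) when the backup is older than the tombstone lifetime.
past_tombstone() {
    local age
    age=$(backup_age_days "$@") || return 2
    [ "$age" -gt "$TOMBSTONE_DAYS" ]
}

# Example with made-up dates: a backup from 2017-01-15 checked on 2018-03-01.
if past_tombstone 2017-01-15 2018-03-01; then
    echo "backup is past the tombstone lifetime"
fi
```

Leaving out the second argument compares against the current date, which is what you would normally want before deciding how far back to set the clock.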

Fortunately, VMware has a knowledge base article covering how to disable all time synchronization between guests and hosts. A total of eight advanced settings must be set to False, with the guest turned off:

tools.syncTime
time.synchronize.continue
time.synchronize.restore
time.synchronize.resume.disk
time.synchronize.shrink
time.synchronize.tools.startup
time.synchronize.tools.enable
time.synchronize.resume.host

The procedure is documented in KB1189.
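If you would rather edit the VM's configuration file than click through the advanced settings dialog, something like the following could do it. This is a minimal sketch that assumes the eight settings map one-to-one onto `key = "FALSE"` lines in the .vmx file and that you can reach that file from a shell; the function name is hypothetical, and the demo runs against a throwaway file rather than a real VM.

```shell
#!/usr/bin/env bash
# Sketch: write the eight time-sync overrides into a .vmx file.
# Only run this against a powered-off VM, and keep a copy of the original.

# disable_time_sync VMX_FILE
disable_time_sync() {
    local vmx="$1" tmp key
    tmp=$(mktemp) || return 1
    for key in \
        tools.syncTime \
        time.synchronize.continue \
        time.synchronize.restore \
        time.synchronize.resume.disk \
        time.synchronize.shrink \
        time.synchronize.tools.startup \
        time.synchronize.tools.enable \
        time.synchronize.resume.host
    do
        # Drop any existing line for exactly this key, then append the override.
        awk -F= -v k="$key" \
            '{ name = $1; gsub(/[[:space:]]/, "", name); if (name != k) print }' \
            "$vmx" > "$tmp"
        printf '%s = "FALSE"\n' "$key" >> "$tmp"
        # Write back through the original path so ownership and mode are kept.
        cat "$tmp" > "$vmx"
    done
    rm -f "$tmp"
}

# Demo on a throwaway file standing in for a real .vmx
demo_vmx=$(mktemp)
printf 'displayName = "old-dc"\ntools.syncTime = "TRUE"\n' > "$demo_vmx"
disable_time_sync "$demo_vmx"
cat "$demo_vmx"
rm -f "$demo_vmx"
```

Running the function twice leaves one line per setting, so it is safe to re-run if you are unsure whether it was applied.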

After setting these properties on the machine, I started it back up with the system time set well back before the tombstone cutoff date, and let it rest for a while so all services could realize everything was alright. Then I set the time forward to the current date, waited a bit longer, and restarted the VM. After this, the system started working as intended.

Managing Windows servers with Ansible

Although I get to play with the fun stuff at work to a large degree, much of our environment still consists of Windows servers, and that will not be changing for a long time. As I’ve mentioned in earlier posts, I try to script my way around individual Windows servers using PowerShell whenever it makes sense, but when a set of changes needs to be performed across groups of servers, especially if it’s something recurring, my tool of choice really is Ansible.

The Ansible management server (which has to be running a Unix-like system) needs to be able to communicate securely with the Windows hosts. WinRM, which is the framework used under the hood, allows for a number of protocols for user authentication and transfer of commands. I personally like to have my communications TLS-secured, and so I’ve opted for using CredSSP, which defaults to an HTTPS-based communications channel.

A huge gotcha: I tried running the tasks below from an Ubuntu 16.04 LTS server, and there was nothing I could do to get the Python 2.7-dependent Ansible version to correctly verify a TLS certificate from our internal CA. When I switched to running Ansible through Python 3, the exact same config worked flawlessly. The original code has been updated to reflect this state of things.

Enable CredSSP WinRM communications in Windows

Our production domain has a local Certificate Authority, which simplifies some operations. All domain members request their computer certificates from this CA, and the resulting certs have subject lines matching their hostname. The following PowerShell script will allow us to utilize the existing certificates to secure WinRM communications, along with enabling the necessary listener and firewall rules.

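$hostname = hostname
# Get the latest valid machine certificate, i.e. one that has not expired yet.
# (The original compared against an undefined $today, so expired certs slipped through.)
$cert = Get-ChildItem -Path Cert:\LocalMachine\My -Recurse | Where-Object { ($_.Subject -match $hostname) -and ($_.NotAfter -gt (Get-Date)) } | Sort-Object NotAfter | Select-Object -Last 1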
# Enable Windows Remote Management over CredSSP
Enable-WSManCredSSP -Role Server -Force
# Set up an HTTPS listener with the machine certificate’s thumbprint
New-Item -Path WSMan:\LocalHost\Listener -Transport HTTPS -Address * -CertificateThumbPrint $cert.Thumbprint -Force
# Allow WinRM HTTPS traffic through the firewall
New-NetFirewallRule -DisplayName 'Windows Remote Management (HTTPS-In)' -Name 'Windows Remote Management (HTTPS-In)' -Direction Inbound -Protocol TCP -LocalPort 5986 -RemoteAddress LocalSubnet

Depending on your desired security level you may want to change the RemoteAddress property of the firewall rule to only allow management traffic from a single host or similar. It is a bad idea to allow remote management from untrusted networks!

Enable CredSSP WinRM communications from Ansible

To enable Ansible to use CredSSP on an Ubuntu server, we’ll install a couple of packages:

sudo apt install libssl-dev
pip3 install pyOpenSSL
pip3 install 'pywinrm[credssp]'

We then need to ensure that the Ansible server trusts the certificates of any Windows servers:

sudo chown root our-ca.crt
sudo chmod 644 our-ca.crt
sudo mv our-ca.crt /usr/local/share/ca-certificates/
sudo update-ca-certificates
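One thing that can trip this step up: update-ca-certificates only picks up PEM-encoded files whose names end in .crt, so a DER export or a .pem/.cer file name is silently ignored. A small pre-flight check along these lines may save some head-scratching. It is a sketch only; the function name is hypothetical, and it merely looks for the PEM header rather than fully parsing the certificate.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a CA file before handing it to update-ca-certificates.

# check_ca_file PATH
# Exit 0 if the file looks usable, non-zero with a message on stderr otherwise.
check_ca_file() {
    local file="$1"
    case "$file" in
        *.crt) ;;
        *) echo "$file: name must end in .crt" >&2; return 1 ;;
    esac
    if [ ! -r "$file" ]; then
        echo "$file: not readable" >&2
        return 1
    fi
    # PEM files carry this textual header; DER files are binary and do not.
    if ! grep -q -- '-----BEGIN CERTIFICATE-----' "$file"; then
        echo "$file: no PEM header found (DER-encoded?)" >&2
        return 1
    fi
}
```

Typical use would be `check_ca_file our-ca.crt` before the `mv`. If the check complains about encoding, `openssl x509 -inform der -in our-ca.cer -out our-ca.crt` converts a DER export to PEM (the input file name here is just an example).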

And finally we’ll tell Ansible how to connect to our Windows servers – including where to find the CA-file – by adding the following to the group_vars for the server group:

ansible_user: "username@domain.tld"
ansible_password: "YourExcellentPasswordHere"
ansible_connection: winrm
ansible_port: 5986
ansible_winrm_transport: credssp
ansible_winrm_ca_trust_path: /etc/ssl/certs

Naturally, if we’re storing credentials in a file, it should be protected as an Ansible vault.
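Since it is easy to forget to re-encrypt a file after editing it, a guard like the one below can be handy, for example in a pre-commit hook. It relies on the fact that vault-encrypted files start with a `$ANSIBLE_VAULT;` header line. The function name is hypothetical, and this is a rough sketch rather than a replacement for `ansible-vault` itself.

```shell
#!/usr/bin/env bash
# Sketch: detect whether a file has been encrypted with ansible-vault
# by looking at its first line.

# is_vaulted FILE
# Exit 0 if the first line carries the vault header, 1 otherwise.
is_vaulted() {
    local first=""
    # read fails on EOF without a trailing newline, so also accept a partial line
    IFS= read -r first < "$1" || [ -n "$first" ] || return 1
    case "$first" in
        '$ANSIBLE_VAULT;'*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A call such as `is_vaulted group_vars/windows.yml || echo "credentials file is NOT encrypted"` would flag a plaintext file; the path is only an example and depends on how your inventory is laid out.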

Finally we can try our config out. Note, as mentioned at the beginning of this article, that I had to resort to running Ansible through Python 3 to correctly validate my CA cert. It’s time to get with the times, folks. 🙂

python3 $(which ansible) windowsserver.domain.tld --ask-vault-pass -m win_ping
Vault password: 
windowsserver.domain.tld | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

To ensure that playbooks targeting Windows servers run using Python 3, add the following to the Windows server group_vars:

ansible_python_interpreter: /usr/bin/python3  

Happy server management!