Exchange 2013 DAG quorum lost

Today, some maintenance had to be done on an Exchange 2013 mailbox server that was part of a 2-node cluster using a file server share as witness.

The Exchange server in question was disabled on our load balancer to drain connections. Next, the StartDagServerMaintenance.ps1 script was used to prevent new sessions and to fail over the mailbox databases to the other Exchange server.

These actions completed without problems and the server was ready to be shut down for maintenance. After it was shut down, the mailbox databases on the second Exchange server dismounted and could not be mounted anymore. Uh-oh..

The mailbox databases could not be mounted because quorum was lost. I saw the following error when opening Microsoft Failover Cluster Manager:

The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

The strange thing was that the file server hosting the witness share was fine and reachable.
Because the offline Exchange server could not be brought back online within minutes, I had to override the quorum safety and bring the Cluster Service back online using the ForceQuorum switch:

net start clussvc /fq

I got this command from the following Microsoft TechNet Article:

After running the command, the cluster was back online and the mailbox databases could be mounted again. Just before maintenance on the Exchange server was completed, and before booting it up again, I disabled the Cluster Service on the secondary server, because that server was running in ForceQuorum state. This was to prevent data loss or corruption.

When the server was booted up again, I started the Cluster Service on both servers and everything returned to normal.

The lost quorum was probably caused by the Cluster Service being configured with “Node Majority”, which isn’t a setting you want with 2 nodes =)
Tomorrow we will investigate whether “Node and File Share Majority” is a better choice, which it probably is, since we are using a file server share as witness.
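
To see why “Node Majority” is a poor fit for two nodes, it helps to count votes. The sketch below only illustrates the majority rule (more than half of all configured votes must be online). The function name Test-QuorumMajority and its parameters are made up for this post and are not part of the FailoverClusters module.

```powershell
# Illustration only: Test-QuorumMajority is a hypothetical helper, not a cluster cmdlet.
# A cluster keeps quorum while more than half of all configured votes are online.
function Test-QuorumMajority {
    param(
        [int]$TotalVotes,   # number of nodes, plus 1 if a witness has a vote
        [int]$VotesOnline   # votes that are currently reachable
    )
    $votesNeeded = [math]::Floor($TotalVotes / 2) + 1
    return ($VotesOnline -ge $votesNeeded)
}

# Node Majority, 2 nodes, 1 node shut down: 1 of 2 votes is not a majority.
Test-QuorumMajority -TotalVotes 2 -VotesOnline 1   # False

# Node and File Share Majority, 2 nodes + witness, 1 node shut down: 2 of 3 votes.
Test-QuorumMajority -TotalVotes 3 -VotesOnline 2   # True
```

On a real cluster node, Get-ClusterQuorum should show which quorum type is currently configured.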

Merging a 140GB Hyper-V 2008 R2 snapshot

Last week I was notified that one of the production LUNs of a customer using Hyper-V 2008 R2 was filling up. The cause was a ‘deleted’ snapshot of a production system.

Deleting snapshots in Hyper-V 2008 R2 requires a shutdown of the VM in order to completely remove the snapshot (AVHD file) from your storage system. Just removing the snapshot/checkpoint using the Virtual Machine Manager is not sufficient. The AVHD file will still exist and keep growing until you shut the VM down. This is a feature according to Microsoft.

This had been going on for a few weeks and the AVHD file had reached a size of 140GB. We made a rough estimate that the storage system would sustain a minimum of 15 MB/s throughput; with the size we had to process, the merge would take 2 to 3 hours. That meant 2 to 3 hours of downtime for this particular VM.
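
The estimate is simple arithmetic, sketched here so you can plug in your own numbers. Get-MergeEstimateHours is a hypothetical helper name, and the calculation assumes 1 GB = 1024 MB and a constant throughput, which a real merge will not have.

```powershell
# Illustration only: Get-MergeEstimateHours is a hypothetical helper for the arithmetic above.
function Get-MergeEstimateHours {
    param(
        [double]$SizeGB,          # size of the AVHD file to merge
        [double]$ThroughputMBps   # assumed sustained throughput in MB/s
    )
    $seconds = ($SizeGB * 1024) / $ThroughputMBps
    return [math]::Round($seconds / 3600, 1)
}

# 140 GB at 15 MB/s
Get-MergeEstimateHours -SizeGB 140 -ThroughputMBps 15   # 2.7
```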

Some people on the net were arguing about whether extra space is required on the Cluster Shared Volume to merge the snapshot. In our case, it was not.

Just to be sure, I created a backup of the VM just before starting the merge / shutting down the VM. After office hours I shut down the VM and kept an eye on the merge progress using the following PowerShell command:

Get-WmiObject -Namespace "root\virtualization" -Query "select * from Msvm_ConcreteJob" | Where {$_.ElementName -eq 'Merge in Progress'}

The merge started within 5 minutes after shutting the VM down and within 15 minutes it reached about 5 percent. In just 90 minutes the merge was completed and the VM was booted back up to restore functionality!

So, snapshotting in Hyper-V 2008 R2 is still shit. It still requires downtime, but not as much as calculated. Hyper-V 2012 removes this ‘feature’ and automatically cleans up after itself 🙂

Buggy DNS resolution using Microsoft ForeFront TMG 2010

I was experiencing very weird DNS issues with a Windows Server 2008 R2 machine.
While resolving external domain names, it would sometimes come back with a response and sometimes with a timeout.

I tested this using nslookup and using the server parameter to point to the Google public DNS server. I am trying to resolve



DNS request timed out.
timeout was 2 seconds.
*** Request to timed-out

DNS request timed out.
timeout was 2 seconds.
DNS request timed out.
timeout was 2 seconds.
*** Request to timed-out

DNS request timed out.
timeout was 2 seconds.
Non-authoritative answer:

As you can see, 1 out of 4 requests succeeded. Something was corrupting my DNS query.

In this scenario, Microsoft ForeFront Threat Management Gateway 2010 (TMG 2010) was used.
The client, in this case a DNS server, was placed in the internal network and was NAT’d through the external interface of the TMG, which was an interface with public IP addresses.

Somehow, the query was not arriving at the external DNS server.
Running the same queries directly from the TMG showed no issues.

It had to be related to the internal-to-external NAT translation and specific to DNS traffic, because HTTP/S traffic was working without any trouble.

After some investigation, it turned out that NIS (Network Inspection System, part of the Intrusion Prevention System) was interfering with the queries. In our case, NIS was dropping them.
We added our DNS server to the NIS exclusion list and the resolution issue was gone!

Since we are already preparing to implement an alternative to TMG, we didn’t see the need to research this issue further.

Hopefully this will help some people resolve DNS issues with their clients behind TMG.

We will add NIS exclusions for all of our internal DNS servers to prevent DNS issues from arising in the future.

Windows Server 2012 DFS not replicating all files

In my test lab, some files were not replicating between two Windows Server 2012 file servers with the DFS Namespaces and DFS Replication roles installed.

This was caused by files that had the temporary attribute set. DFS Replication does not replicate such files. Some applications set this attribute, and it can also end up on files you download from the internet.
You can check whether this is the case by using the PowerShell command below with your own path name.

Get-ChildItem "C:\FolderX" -Recurse | ForEach-Object -process {if(($_.attributes -band 0x100) -eq 0x100) {write-output $_}}

The command will show you all files with the temporary attribute, per folder.

Next, if you want to remove the attribute from these files, use the command below.

Get-ChildItem "C:\FolderX" -Recurse | ForEach-Object -process {if (($_.attributes -band 0x100) -eq 0x100) {$_.attributes = ($_.attributes -band 0xFEFF)}}

After modifying my files, replication kicked in within a second and all files were replicated.
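
Both commands above rely on a bit mask, which is easier to follow with plain numbers. The sketch below does not touch any files; the example attribute value (Archive plus Temporary) is an assumption chosen for illustration.

```powershell
# 0x100 is the Temporary flag in System.IO.FileAttributes.
$temporary = [int][System.IO.FileAttributes]::Temporary    # 256 (0x100)
$archive   = [int][System.IO.FileAttributes]::Archive      # 32  (0x20)

# Example value (assumed): a file that is both Archive and Temporary.
$attributes = $archive -bor $temporary                     # 0x120

# Detection: the bit is set when masking with 0x100 returns 0x100.
$isTemporary = ($attributes -band 0x100) -eq 0x100         # True

# Removal: 0xFEFF has every one of the low 16 bits set except 0x100,
# so -band keeps the other common attributes and clears Temporary.
$cleaned = $attributes -band 0xFEFF                        # 32 (0x20), Archive only
```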

Exchange 2013 Management Shell from Windows 8

I’m testing out the new functionality of Exchange 2013 in my test lab and getting familiar with the product, as we are probably going to use it in production.

While testing, I was wondering if it would be possible to manage Exchange 2013 remotely from my Windows 8 client.
Of course you can use the ECP (Exchange Control Panel), but managing your environment with PowerShell is ‘more compliant’ with the way Microsoft sees management (and it’s cooler).

Installing the Exchange Management Shell on Windows 8 is not going to work (unless you are in the same AD domain as the Exchange server, correct me if I’m wrong).

So here I am, wanting to remotely manage Exchange 2013 and having a Windows 8 client in a workgroup.

Back in the office, my Exchange-guru colleague saw me stumble and mumble and quietly sent me a mail message with a script included. He created it on the fly and was hoping my cranky face would turn into a happy face =)
He succeeded! And that with only 11 lines of code.. What a boss!

With all credits going to my colleague Jens Giessler, I am posting his script on my blog, hoping other people’s faces will turn into happy ones.
Replace the user name and the Exchange server FQDN in the script with your own, and save it as a .ps1 file to run it with PowerShell.

Before running the script, you also need to enable Basic Authentication on the PowerShell virtual directory, using the ECP (Servers menu, virtual directories tab).

Oh, and as with all my previous and future script postings: use them at your own risk.

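Function Query-Credentials {
    # Prompt for the password of the given account and keep the credential for later use
    $Global:Cred = Get-Credential -Credential domain\user
}

Function Connect-Exchange {
    # Open a remote session to the Exchange PowerShell virtual directory and import its cmdlets
    $Session = New-PSSession -ConfigurationName Microsoft.Exchange -ConnectionUri https://exchangeserverfqdn/powershell/ -Credential $Cred -Authentication Basic -AllowRedirection
    Import-PSSession $Session
}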

#Establish connection