This isn’t really a blog post where you will get a recipe on how to implement VMware Virtual SAN (VSAN) or InfiniBand technologies, but more a short account of the trouble I experienced yesterday with my infrastructure. I published a picture yesterday on Twitter that didn’t look too good.
Cause: The network infrastructure transporting the VSAN traffic became unavailable for 5-6 minutes.
Issue: All VMs froze, as all reads and writes were blocked. I powered off all the VMs. Each VM became an Unidentified object, as seen above.
Remediation: Restarted all VSAN hosts at the same time, and let the infrastructure stabilize for about 10 minutes before restarting the first VM.
I got myself into this state because I was messing with the core networking infrastructure in my lab. This was not a VSAN product error, but a side effect of the network loss. After publishing the tweet and picture, I had a dinner that lasted a few hours. When I got home, I simply decided to restart the four VSAN nodes at the same time, let the infrastructure simmer for 10 minutes while I watched the host logs, and then restarted my VMs.
Since the beginning of December 2013, I have been running all my VMs directly from my VSAN datastore; no other iSCSI/NFS repository is used. If VSAN goes down, everything goes down (including Domain Controllers, SQL Server and vCenter).
As some of you know, the VSAN traffic in my lab is transported over InfiniBand. Each host has two 20Gbps connections to the InfiniBand switches. My InfiniBand switches are described in my LonVMUG presentation about using InfiniBand in the Lab. An InfiniBand fabric needs a Subnet Manager to control the various entries. I got lucky with my first InfiniBand switch purchase: I got myself a Silverstorm 9024-CU24-ST2 model from 2005.
Yet the latest firmware can still be found on Intel’s 9000 Edge Managed Series website, and that latest firmware, 188.8.131.52.1 from Jul 2012, now adds a hardware Subnet Manager. This is simply awesome for a switch created in 2005.
Okay, I digress here… bear with me. Not all InfiniBand switches come with a Subnet Manager; actually, only a select few of the more expensive switches have this feature. What can you do when you have an InfiniBand switch without a management stack? Well, you run the software version, the Open Subnet Manager (OpenSM), directly on the ESXi host or on a dedicated Linux node.
Yesterday, I was validating a new build of the OpenSM daemon compiled by Raphael Schitz (@Hypervisor_fr) that has some improvements. I had placed the new code on each of my VSAN nodes, and shut down the hardware Subnet Manager to use only the software Subnet Manager. It worked well enough; I only saw a simple 2 second RDP interruption to the vCenter.
It was only when I attempted to fake the death of the Master OpenSM on my esx13.ebk.lab host that I created enough fluctuation in the InfiniBand fabric to cause an outage, which I estimate lasted between 3 and 5 minutes. But as the InfiniBand fabric is used to transport all my VSAN traffic at high speed, all my VMs froze and all I/O was suspended, leaving me only the option to connect to the hosts directly with the vSphere C# Client and wait to see if things would stabilize. Unfortunately, that did not seem to be the case after 10 minutes, so I powered off the running VMs.
Yet each of my hosts was now disconnected from the other VSAN nodes, and the vsanDatastore was not showing its usual 24TB, but 8TB. A bit of panic set in, and I tweeted about a Shattered VSAN Cluster.
When I came home a few hours later, I simply restarted all four of my VSAN nodes (3 Storage+Compute and 1 Compute-Only), let some synchronization take place, and was able to restart my VMs.
These recommendations apply only if you use VSAN with an InfiniBand backbone to replicate the storage objects across nodes. If you have an InfiniBand switch that supports a hardware Subnet Manager, use it. If you have an unmanaged InfiniBand switch, you need to ensure that the Subnet Manager is kept stable and always available.
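Keeping the Subnet Manager "always available" is easier to reason about if you can at least measure when it was not. Here is a minimal sketch of that idea, not something I ran in my lab: it assumes you already collect periodic samples of whether a master Subnet Manager answered (for example from a dedicated Linux node wrapping a tool such as `sminfo`; that wiring is not shown and is an assumption). The names `Sample` and `outage_windows` are hypothetical, made up for this illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical sample type: (timestamp in seconds, did a master SM answer?)
Sample = Tuple[float, bool]


def outage_windows(samples: Iterable[Sample],
                   min_seconds: float = 0.0) -> List[Tuple[float, float]]:
    """Return (start, end) spans during which no master SM answered.

    Spans shorter than min_seconds are ignored, so a brief election
    blip can be separated from a real fabric outage.
    """
    windows: List[Tuple[float, float]] = []
    start = None    # timestamp of the first failed sample in the current outage
    last_ts = None  # timestamp of the most recent sample seen
    for ts, ok in samples:
        last_ts = ts
        if not ok and start is None:
            start = ts                      # outage begins
        elif ok and start is not None:
            if ts - start >= min_seconds:
                windows.append((start, ts))  # outage ended at this sample
            start = None
    # Still down when the samples end: report the outage up to the last sample.
    if start is not None and last_ts - start >= min_seconds:
        windows.append((start, last_ts))
    return windows


if __name__ == "__main__":
    # Invented sample data for illustration only.
    demo = [(0, True), (10, False), (20, False), (250, True),
            (260, True), (270, False), (272, True)]
    for begin, end in outage_windows(demo, min_seconds=5):
        print(f"No master SM from t={begin}s to t={end}s ({end - begin}s)")
```

With a threshold of a few seconds, something like the 2 second interruption I saw when switching Subnet Managers would be filtered out, while a multi-minute loss like yesterday's would be reported.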
If you use InfiniBand as the network backbone for vMotion or other IP over IB traffic, the impact of a software Subnet Manager election is not the same (HA reactivity).
I don’t have a better answer yet, but I know Raphael Schitz (@Hypervisor_fr) has some ideas, and we will test new OpenSM builds for this kind of issue.
Your comments are welcome…