Monday, February 11, 2013

VMware ESXi Host Slow Boot Times

Recently, I had to create a couple of VMs for use with Microsoft's clustering services. As a result, these VMs were created with RDMs (raw device mappings) pointing straight at LUNs on our EMC VNX SAN. As the number of RDMs went up, I noticed severe performance symptoms, including:

  • Slow ESXi host boot times
    • boot screen stuck at: vmw_vaaip_cx loaded successfully
    • boot screen stuck at: Running usbarbitrator start
  • Slow HBA rescan times



After some investigation, it appears that this is a known issue in vSphere 5.0 (and other versions as well). During boot, an ESXi 5 host attempts to discover all storage devices presented to it. Since these RDM LUNs are likely to be in use by another host, which holds a persistent SCSI reservation on them, the booting host cannot scan the LUNs and has to wait for the scan to time out. More details can be found here.

Based on my experience, the boot time delay depends not on the size of the RDM LUNs but on how many of them there are. In my case, boot was delayed by approximately 5 minutes for each RDM LUN. Also, although the VMware knowledge base article does not mention slow HBA rescan times, it has been my experience that RDM LUNs negatively affect them as well.

The solution is very simple. For ESXi 5.0, SSH into your ESXi host and run the following command:

esxcli storage core device setconfig -d <naa.id> --perennially-reserved=true


"naa.id" can typically be found on the SAN under the properties of the RDM LUN. Run this command for every RDM LUN in the environment and the slow boot/HBA rescan times will be resolved.
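
If there are many RDM LUNs, typing the command once per device gets tedious. Below is a minimal sketch of how the step could be wrapped in a small shell helper on the host. The function names and the DRY_RUN switch are made up for this example and are not part of esxcli. The check relies on the "Is Perennially Reserved" line that, as far as I know, "esxcli storage core device list" prints for each device.

```shell
#!/bin/sh
# Sketch only: mark a list of RDM LUNs as perennially reserved on this host.
# DRY_RUN is a hypothetical switch added for this example. It defaults to 1,
# so the commands are only printed until you explicitly set DRY_RUN=0.

mark_perennially_reserved() {
    for dev in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run: show what would be executed, change nothing.
            echo "esxcli storage core device setconfig -d $dev --perennially-reserved=true"
        else
            # Same command as above, run once per device ID.
            esxcli storage core device setconfig -d "$dev" --perennially-reserved=true
        fi
    done
}

# Succeeds (exit status 0) if the device listing reports the flag as set.
is_perennially_reserved() {
    esxcli storage core device list -d "$1" | grep -q "Is Perennially Reserved: true"
}
```

After loading these functions into the shell, you would call mark_perennially_reserved with your own device IDs as arguments, review the printed commands, then repeat with DRY_RUN=0 and confirm each device with is_perennially_reserved. To my knowledge the flag is stored per host, so the same list would need to be applied on every host that can see the RDM LUNs.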

Update: A script for setting the flag using PowerCLI can be found here.
