Friday, August 7, 2015

Finding VMs Affected by HA Failover for a Specific Cluster

While vSphere HA is a lifesaver when it comes to automatically recovering from host failures, finding out which VMs were impacted is not as easy. To see which VMs were successfully restarted by HA, you have to drill down into the vCenter event logs. This is normally not an issue in small and medium environments, but it can be in larger environments, where the volume of events generated causes the logs to roll over.

Luckily, PowerCLI can be used to get a list of all VMs affected by an HA failover. The original script was obtained from here, and has been modified to allow a specific cluster to be specified.


Pre-requisites:

  • Name of the vCenter Server where the failover occurred
  • Name of the ESXi cluster where the failover occurred
  • hainfo.ps1 script
  • PowerCLI 5.1 or higher installed
  • A minimum of read-only access to the vCenter

Usage:

  1. Open a PowerCLI console
  2. Change the working directory to where the hainfo.ps1 script is located
  3. Run the PowerShell script with the required parameters shown below

    .\hainfo.ps1 -vcenter vcenter_name -cluster cluster_name

    By default, the script above will look at all vCenter Events within the last 7 days up to a maximum of 100,000 events and will search for any HA failover events experienced by the specified cluster.
  4. Observe the output to the console for the list of VMs affected by the HA failover
  5. If there are too many affected VMs listed in the console, the output can be piped to a text file

    .\hainfo.ps1 -vcenter vcenter_name -cluster cluster_name > c:\results.txt
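
Redirecting to a text file saves the table as it was formatted for the console, so long messages may be cut off. A CSV file keeps every field intact. The sketch below shows the idea on two made-up rows that stand in for the script's output; the VM names, the message text and the temp file name are all hypothetical, and [pscustomobject] assumes PowerShell 3.0 or later.

```powershell
# Hypothetical sample rows with the same three properties the script selects.
$rows = @(
    [pscustomobject]@{ CreatedTime = Get-Date; ObjectName = 'vm01'; FullFormattedMessage = 'sample restart message for vm01' },
    [pscustomobject]@{ CreatedTime = Get-Date; ObjectName = 'vm02'; FullFormattedMessage = 'sample restart message for vm02' }
)

# Write the rows to a CSV file in the temp folder (hypothetical file name).
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'ha-results.csv'
$rows | Export-Csv -Path $csvPath -NoTypeInformation

# Read the file back to confirm that every column survived.
Import-Csv -Path $csvPath | Format-Table -AutoSize
```

With the real script, the equivalent would presumably be to pipe its output the same way:

    .\hainfo.ps1 -vcenter vcenter_name -cluster cluster_name | Export-Csv -Path c:\results.csv -NoTypeInformation

The successful and unsuccessful restarts would land in the same file, with the message column telling them apart.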


hainfo.ps1 Script:


param(
    [string]$vcenter = $null,
    [string]$cluster = $null,
    [int]$last = 7,
    [int]$maxEvents = 100000,
    [switch]$help = $false
)

Write-Host "`nScript to generate list of successful and failed VM restart attempts after an HA host failure."

# Show the help text and stop when -help is given
if ($help)
{
    Write-Host "`nRequired parameters:
`n-vcenter server_name connect to vCenter server server_name
`n-cluster cluster_name check for HA failovers in the cluster_name cluster
`n
`nOptional Parameters:
`n-help display this help
`n-last n analyze events from the last n days (default is 7)
`n-maxEvents n analyze a maximum of n events (default is 100,000)
"
    exit
}
else
{
    Write-Host "`n(Use -help for list of parameters)"
}

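# Load PowerCLI. Newer PowerCLI releases ship as modules rather than snap-ins,
# so use the module when it is available and fall back to the legacy snap-in.
if (Get-Module -ListAvailable -Name "VMware.VimAutomation.Core") { Import-Module "VMware.VimAutomation.Core" }
elseif (!(Get-PSSnapin -Name "VMware.VimAutomation.Core" -ErrorAction SilentlyContinue)) { Add-PSSnapin "VMware.VimAutomation.Core" }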

Write-Host "`nConnecting to vCenter server $vcenter ..."
try
{
    # -ErrorAction Stop turns a failed connection into a terminating error so the catch block runs
    Connect-VIServer $vcenter -ErrorAction Stop | Out-Null
    Write-Host "Success: Connected to vCenter Server $vcenter."
}
catch
{
    Write-Host "Error: Unable to connect to vCenter Server $vcenter."
    exit
}


$Date = (Get-Date).AddDays(-$last)


Write-Host "`nList of Successful VM restarts on cluster $cluster"

Get-VIEvent -MaxSamples $maxEvents -Start $Date -Types Warning | Where-Object {($_.EventTypeId -eq "com.vmware.vc.ha.VmRestartedByHAEvent") -and ($_.ComputeResource.Name -eq $cluster)} | Select-Object CreatedTime, ObjectName, FullFormattedMessage

Write-Host "`n`nList of Unsuccessful VM restarts on cluster $cluster"

Get-VIEvent -MaxSamples $maxEvents -Start $Date -Types Warning | Where-Object {($_.FullFormattedMessage -like "vSphere HA stopped trying*") -and ($_.ComputeResource.Name -eq $cluster)} | Select-Object CreatedTime, ObjectName, FullFormattedMessage



Disconnect-VIServer -Force -Confirm:$false
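
The script queries vCenter twice with the same Get-VIEvent arguments and applies a different filter each time. One possible variation, sketched below, is to fetch the events once and run both filters over the same collection. Split-HAEvents is a hypothetical helper that is not part of the original script, and the mock events (cluster names, VM names, message text) are made up so the example runs without a vCenter connection. It assumes PowerShell 3.0 or later for [pscustomobject].

```powershell
# Hypothetical helper: applies both of the script's filters to one event collection.
function Split-HAEvents {
    param(
        [object[]]$Events,
        [string]$Cluster
    )
    # Keep only the events that belong to the requested cluster.
    $inCluster = @($Events | Where-Object { $_.ComputeResource.Name -eq $Cluster })
    [pscustomobject]@{
        # Same test the script uses for successful restarts.
        Restarted = @($inCluster | Where-Object { $_.EventTypeId -eq 'com.vmware.vc.ha.VmRestartedByHAEvent' })
        # Same test the script uses for abandoned restart attempts.
        Failed    = @($inCluster | Where-Object { $_.FullFormattedMessage -like 'vSphere HA stopped trying*' })
    }
}

# Mock events with made-up values, shaped like the properties the script reads.
$mockEvents = @(
    [pscustomobject]@{ EventTypeId = 'com.vmware.vc.ha.VmRestartedByHAEvent'; ObjectName = 'vm01'
                       ComputeResource = [pscustomobject]@{ Name = 'ClusterA' }
                       FullFormattedMessage = 'sample restart message for vm01' },
    [pscustomobject]@{ EventTypeId = 'com.vmware.vc.ha.VmRestartedByHAEvent'; ObjectName = 'vm02'
                       ComputeResource = [pscustomobject]@{ Name = 'ClusterB' }
                       FullFormattedMessage = 'sample restart message for vm02' },
    [pscustomobject]@{ EventTypeId = 'sample.other.event'; ObjectName = 'vm03'
                       ComputeResource = [pscustomobject]@{ Name = 'ClusterA' }
                       FullFormattedMessage = 'vSphere HA stopped trying (sample text) vm03' }
)

$result = Split-HAEvents -Events $mockEvents -Cluster 'ClusterA'
$result.Restarted | Select-Object ObjectName, FullFormattedMessage
$result.Failed    | Select-Object ObjectName, FullFormattedMessage
```

Against a live connection, the mock collection would be replaced by a single Get-VIEvent call using the same -MaxSamples, -Start and -Types values as the script, which should roughly halve the time spent pulling events from vCenter.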



Optional Parameters

Optional parameters and their defaults are shown below:

  • -help: display the help menu
  • -last n: analyze events from the last n days (default is 7)
  • -maxEvents n: analyze a maximum of n events (default is 100,000)
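
To make the effect of -last concrete, the short sketch below repeats the date calculation the script performs, using a hypothetical value of 14 days in place of the default of 7.

```powershell
# Hypothetical value, as if the script had been started with -last 14.
$last = 14

# Same calculation the script uses to set the start of the event window.
$start = (Get-Date).AddDays(-$last)

"Events created on or after $start would be analyzed."
```

A run that looks back 14 days and caps the search at 50,000 events (both values chosen only for illustration) would look like this:

    .\hainfo.ps1 -vcenter vcenter_name -cluster cluster_name -last 14 -maxEvents 50000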