25 Sep 2013

Unveiling Cluster overcommit in SCVMM 2012 / Hyper-V

The story

“Why does my cluster show a warning state claiming a Cluster Overcommit?”
“What the heck does that mean?” you might be asked. “Which resources are in an over-committed state?”

Well, first you have to be aware that the only resource VMM considers for the cluster reserve state is memory. Neither physical CPU resources, the core virtualization ratio, nor storage or network throughput are taken into account. Again: only the physical memory available to the cluster nodes.

Why does memory overcommit matter at all and result in warnings and decreased functionality on the cluster?

The first thing you have to remember is that Hyper-V does not allow memory overcommit. The big difference to VMware ESX is that Hyper-V will never grant more memory to virtual machines than it has physically available for assignment (which is IMHO the right way to do it, BTW). If you want to know more about the differences in memory management between ESX and Hyper-V, I’d recommend this post on Altaro’s Hyper-V Blog.
The only exception to this rule applies when the Dynamic Memory feature is used. In a situation where the startup memory would exceed the available memory on the host but the minimum memory could be served, Hyper-V uses a temporary file on a storage location to get around the physical limitation just for the VM startup process. The file it creates is called the “Smart Paging File”. The location for this file can be defined on a per-VM basis.
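
As a small illustration of the per-VM setting, the Smart Paging location can be read and changed with the Hyper-V PowerShell module. This is only a sketch: it assumes the Hyper-V module that ships with Windows Server 2012, and the VM name and folder are hypothetical placeholders.

```powershell
# Assumes the Hyper-V PowerShell module (Windows Server 2012 or later).
# 'VM01' and the folder path are hypothetical placeholders.
Import-Module Hyper-V

# Show where the VM currently keeps its Smart Paging file
Get-VM -Name 'VM01' | Select-Object Name, SmartPagingFilePath

# Point the Smart Paging file to another folder
Set-VM -Name 'VM01' -SmartPagingFilePath 'D:\SmartPaging'
```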

How does VMM calculate the Cluster reserve state?

This works a bit differently than it did prior to the 2012 version. For more information about the method used in 2008, see here

Basically, VMM calculates whether the cluster would still be able to serve all current memory assignments after losing the number of nodes defined as the “Cluster Reserve” value. In a 4-node cluster with a Cluster Reserve value of 1, this means it treats each node in turn as a single outage and checks whether the remaining nodes can serve the memory used on the failing node.

The values that matter for the calculation are:

  • Current Memory demand of the “largest” VM on the failing host
  • Current Memory demand of all other VMs running on the failing host
  • Extra capacity on each of the remaining hosts (memory available for assignment)
  • Balance between the extra capacity and the used memory on the failing host

A little calculation example

The picture below illustrates the calculation and the resulting state of the cluster. We’ll have a look at case 1, meaning Host1 will be considered as the failing host. (This method is based on the original post by Hilton Lange (MSFT).)

  • We take the virtual machine with the highest memory usage on Host1 (96 GB)
  • The total memory usage of all other VMs on Host1 is 8 GB
  • Now we look at each remaining host (Host2 through Host4) to calculate its extra capacity:
    • (Total Memory) – (Host Reserved) – (Total Used) – (Largest demand on Host1)
    • If the extra capacity value is greater than zero, we add it to a global value called “TotalExtraCapacity”
  • Once all remaining hosts are processed, we calculate the extra capacity balance (ECB) for the failing host with the following formula: (TotalExtraCapacity) – (Usage of other VMs on Host1)
  • If the ECB is negative, we have a Cluster Reserve Over-Commit (as shown in case 1)
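
The steps above can be sketched in a few lines of PowerShell that do not need VMM at all. Only the 96 GB and 8 GB figures for Host1 come from the example. The memory figures for Host2 through Host4 and the function name Get-ExtraCapacityBalance are made up for illustration, so treat this as a rough sketch of the method rather than what VMM does internally.

```powershell
# Sketch of the reserve-1 calculation for a single failing host.
# All values are in GB. The Host2..Host4 figures are hypothetical.
function Get-ExtraCapacityBalance {
    param(
        [double]$LargestVMDemand,   # largest VM on the failing host
        [double]$OtherVMDemand,     # all other VMs on the failing host
        [object[]]$RemainingHosts   # objects with Total, Reserved, Used
    )
    $totalExtraCapacity = 0.0
    foreach ($h in $RemainingHosts) {
        # Room left on this host if it also had to take the largest VM
        $extra = $h.Total - $h.Reserved - $h.Used - $LargestVMDemand
        if ($extra -gt 0) { $totalExtraCapacity += $extra }
    }
    # A negative balance means a cluster reserve overcommit
    return $totalExtraCapacity - $OtherVMDemand
}

$remaining = @(
    [pscustomobject]@{ Name = 'Host2'; Total = 128; Reserved = 4; Used = 100 },
    [pscustomobject]@{ Name = 'Host3'; Total = 128; Reserved = 4; Used = 26 },
    [pscustomobject]@{ Name = 'Host4'; Total = 128; Reserved = 4; Used = 90 }
)

$ecb = Get-ExtraCapacityBalance -LargestVMDemand 96 -OtherVMDemand 8 -RemainingHosts $remaining
"Extra capacity balance: $ecb GB"
if ($ecb -lt 0) { 'Cluster reserve overcommit' } else { 'Cluster reserve OK' }
```

With these made-up figures only Host3 has room for the 96 GB VM (2 GB to spare), so the balance is 2 – 8 = –6 GB and the failure of Host1 would count as an overcommit.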

 

Here’s another example with a 4-node cluster and a Cluster Reserve value of 2. This scenario is far more complex, as you have to run the calculation for each pair of hosts that could fail simultaneously.
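
To get a feeling for how many cases that is, the failure pairs can be listed with a short helper. This sketch only enumerates the combinations. The memory balance for each pair still has to be worked out by hand, and the function name Get-HostPair is a hypothetical name chosen for this example.

```powershell
# Hypothetical helper: list every pair of hosts that could fail together.
function Get-HostPair {
    param([string[]]$HostNames)
    for ($i = 0; $i -lt $HostNames.Count - 1; $i++) {
        for ($j = $i + 1; $j -lt $HostNames.Count; $j++) {
            # The leading comma keeps each pair together as one array
            ,@($HostNames[$i], $HostNames[$j])
        }
    }
}

$pairs = @(Get-HostPair -HostNames 'Host1', 'Host2', 'Host3', 'Host4')
"Failure pairs to check: $($pairs.Count)"
foreach ($p in $pairs) { $p -join ' + ' }
```

For four hosts this yields six pairs, compared to four single-host cases with a reserve value of 1.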

When does VMM calculate or re-calculate the state?

  • Upon a manual cluster refresh
  • The failure or removal of nodes from the host cluster
  • The addition of nodes to the host cluster
  • The discovery of new virtual machines on nodes in the host cluster

How can I quickly get the status information to perform my own calculation?

I’ve created a PowerShell script which reads out the relevant values for each node in the cluster. Additionally, it calculates which node failure might cause a cluster overcommit. This can also be useful if you run into an overcommit warning and want to know which node is currently the likely suspect for high memory usage. At the moment the script only honours a cluster reserve value of 1. All other scenarios with a value of 2 or higher have to be calculated manually, but at least you will have collected the relevant memory usage information to do the maths.

Here’s a sample output…

 

And here’s the code or the copy of the script

<#
.Synopsis
   Calculates Memory Commitment on a Hyper-V Cluster within SCVMM for potential overcommitment
.DESCRIPTION
   The script uses the calculation method described in the following article:
   http://blogs.technet.com/b/scvmm/archive/2012/03/27/system-center-2012-vmm-cluster-reserve-calculations.aspx
   to check if the cluster can still serve all the running memory when a node
   in the cluster fails. The calculation currently supports a node reserve value of 1.
   Scenarios with a higher reserve values must be calculated manually. However the reported
   Memory usage Values can help for the calculation.
.PARAMETER Clustername
  The name of the cluster object within VMM to be checked for memory overcommitment
.EXAMPLE
  Get-ClusterMemoryStatus.ps1 -Clustername acluster.contoso.com
.NOTES
    NAME: GetClusterMemoryStatus.ps1
    VERSION: 1.0
    AUTHOR: Michael Rueefli
    LASTEDIT: 13.09.2013
#>

[CMDLETBinding(SupportsShouldProcess = $False, ConfirmImpact = "None", DefaultParameterSetName = "")]
Param(
    [Parameter(Mandatory=$True,
    ValueFromPipeline=$True)]
    $ClusterName
)

#Set strict mode
Set-StrictMode -Version 3

Function Get-LargestVMOnHost
{
    param(
    [STRING]$nodename
    )
    $vms = Get-SCVirtualMachine -VMHost $nodename
    [ARRAY]$a = @()
    Foreach ($vm in $vms)
    {

        If ($vm.DynamicMemoryMaximumMB)
        {
            $vmmaxmem = [decimal]::round(($vm.DynamicMemoryMaximumMB)/1024)
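            # Keep the demand numeric (GB, one decimal) so later arithmetic does not depend on string parsing or locale
            $vmdemandmem = [math]::Round([double]$vm.DynamicMemoryDemandMB / 1024, 1)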
        }
        Else
        {
            $vmmaxmem =  [decimal]::round(($vm.Memory)/1024)
            $vmdemandmem = $vmmaxmem
        }
        $vmmemstatus = New-Object -TypeName PSObject -Property @{
        VMname=($vm.Name)
        VMMemDemand=$vmdemandmem
        VMMaxMem=$vmmaxmem
        }
        $a += $vmmemstatus
    }
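    # Return a zero-sized placeholder when the host runs no VMs, so that
    # property access in the caller does not fail under strict mode
    If ($a.Count -eq 0)
    {
        Return New-Object -TypeName PSObject -Property @{VMname='';VMMemDemand=0;VMMaxMem=0}
    }
    Return ($a | Sort-Object VMMaxMem -Descending)[0]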
}

Function Get-RemainingVMTotalMemory
{
    param(
    [STRING]$nodename,
    [STRING]$largestVMName
    )
    [INT]$TotalRemainingMemMax=0
    # Decimal instead of INT so fractional GB demand values are not rounded away on every addition
    [DECIMAL]$TotalRemainingMemDemand=0
    $remainingVMs = Get-SCVirtualMachine -VMHost $nodename | ? {$_.Name -ne $largestVMname -and $_.IsHighlyAvailable -eq $true}
    Foreach ($rvm in $remainingvms)
    {
        If ($rvm.DynamicMemoryMaximumMB)
        {
            $vmmaxmem = [decimal]::round(($rvm.DynamicMemoryMaximumMB)/1024)
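            # Keep the demand numeric (GB, one decimal) so later arithmetic does not depend on string parsing or locale
            $vmdemandmem = [math]::Round([double]$rvm.DynamicMemoryDemandMB / 1024, 1)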
        }
        Else
        {
            $vmmaxmem =  [decimal]::round(($rvm.Memory)/1024)
            $vmdemandmem = $vmmaxmem
        }
        $TotalRemainingMemMax += $vmmaxmem
        $TotalRemainingMemDemand += $vmdemandmem
    }

    $result = New-Object -TypeName PSObject -Property @{OtherVMMax=$TotalRemainingMemMax;OtherVMDemand=$TotalRemainingMemDemand}
    return $result
}

## Main Routine ###
If (!(Get-Module VirtualMachineManager))
{
    Import-Module VirtualMachineManager
}

$cluster = Get-SCVMHostCluster $clustername
$Clusterreserve = $cluster.ClusterReserve
# Wrap in @() so .Count also works under strict mode when only one node is returned
$ClusterNodes = @($cluster | Get-SCVMHost)
$NodeCount = $Clusternodes.count
$statusreport = @()

Foreach ($node in $ClusterNodes)
{
    #Get the largest VM on Host
    $largestVMonHost = Get-LargestVMOnHost -nodename $node.Name

    #Get the Total Memory of all remaining VMs except the largest
    $remaingVMMem = Get-RemainingVMTotalMemory -nodename $node.name -largestVMName $largestVMonHost.VMName
    [INT]$otherVMMax = $remaingVMMem.OtherVMMax
    [DECIMAL]$otherVMDemand = $remaingVMMem.OtherVMDemand

    #Create new Host Object
    $hoststatus = new-object -TypeName PSObject -Property @{
    NodeName=($node.Name)
    TotalGB=([decimal]::round((Get-SCVMHost $node.name).TotalMemory/1024/1024/1024))
    ReserveGB=([decimal]::round(($node.MemoryReserveMB)/1024))
    AvailableGB=([decimal]::round($node.AvailableMemory/1024))
    LargestVMName=$largestVMonHost.VMName
    LargestVMDemandGB=$largestVMonHost.VMMemDemand
    LargestVMMaxGB=$largestVMonHost.VMMaxMem
    OtherVMTotalMaxGB=$otherVMMax
    OtherVMTotalDemandGB=$otherVMDemand
    TotalVMUsedGB=$otherVMDemand + $largestVMonHost.VMMemDemand
    }
    $statusreport += $hoststatus 

 }

  #Report Host Memory Summary
 "=============================================================================="
 "Clustername: $clustername"
 "Node Count: $nodecount"
 "Cluster Reserve Count: $clusterreserve"
 "=============================================================================="
 $statusreport | Sort-Object NodeName | Format-Table NodeName,TotalGB,AvailableGB,ReserveGB,TotalVMUsedGB,LargestVMName,LargestVMMaxGB,LargestVMDemandGB,OtherVMTotalMaxGB,OtherVMTotalDemandGB -AutoSize

 Foreach ($obj in $statusreport)
 {
    "Calculating with node failure: $($obj.NodeName)"
    $nodeotherVMTotalDemand = ($obj.OtherVMTotalDemandGB)
    $othernodes = $statusreport | Sort-Object NodeName | ? {$_.NodeName -ne $obj.NodeName}
    # Decimal instead of INT so fractional GB values are not rounded on every addition
    [DECIMAL]$totalextracapacity=0

    Foreach ($element in $othernodes)
    {

        $nodeextracapacity = (($element.TotalGB) - ($element.ReserveGB) - ($element.TotalVMUsedGB) - ($obj.LargestVMDemandGB))

        If ($nodeextracapacity -gt 0)
        {
            $totalextracapacity += $nodeextracapacity
        }

        #Report
        write-verbose "    Balance for Node: $($element.NodeName) on Failure of Node: $($obj.NodeName)"
        If ($nodeextracapacity -gt 0)
        {
            write-verbose "        Extra Capacity Balance (GB) looks ok: $nodeextracapacity"
        }
        Else
        {
            write-verbose "        Negative Extra Capacity Balance (GB): $nodeextracapacity"
        }
        #

    }
    If ($nodeotherVMTotalDemand -gt $totalextracapacity)
    {
        write-host "WARNING! Cluster will be overcommitted upon a failure of node: $($obj.NodeName)" -ForegroundColor red
    }
    Else
    {
        write-host "Cluster can serve all current Memory Resources upon a failure of node: $($obj.NodeName)" -ForegroundColor Green
    }
 }