Perform better Storage Spaces Direct maintenance with these PowerShell functions

Storage Spaces Direct (S2D) is a powerful technology and makes up a large part of the new Azure Stack HCI solution. However, performing maintenance on it can catch out new players.

One of the biggest causes of failures during maintenance on S2D hosts and clusters is that the hosts haven’t been put into maintenance mode correctly, so I set out to simplify the process with three new functions.

The three activities I’ve targeted with these functions are correctly enabling maintenance mode on a host, correctly disabling it again, and checking the current state of a host or cluster.

With all maintenance, it’s a good idea to make sure the cluster is healthy before starting, and that the host you’re going to make changes to is actually in maintenance mode.
To assist with this, I’ve created Get-S2DNodeMaintenanceState. It can be used to query one or more hosts, as well as one or more clusters at a time.

# Query the local host
Get-S2DNodeMaintenanceState

# Query a single host
Get-S2DNodeMaintenanceState -Name S2DHost01

# Query multiple hosts
Get-S2DNodeMaintenanceState -Name 'S2DHost01','S2DHost02'

# Query the state of all hosts in a cluster
Get-S2DNodeMaintenanceState -Cluster S2D-Cluster
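
Before starting work, it can help to reduce the returned state to a simple yes/no answer. The sketch below assumes the objects carry the Name, ClusterState and StorageState properties seen in the original code at the end of this post. Test-S2DNodeReady is a hypothetical helper name, not part of the published script, and the sample objects are made up to stand in for real output.

```powershell
# Hypothetical helper: returns $true only when every node is fully online.
function Test-S2DNodeReady {
    [CmdletBinding()]
    param(
        # Objects shaped like the output of Get-S2DNodeMaintenanceState
        [Parameter(Mandatory, ValueFromPipeline)]
        [psobject[]]$NodeState
    )
    begin { $NotReady = @() }
    process {
        foreach ($Node in $NodeState) {
            # Cast to string so enum values and plain strings compare the same way
            if ("$($Node.ClusterState)" -ne 'Up' -or "$($Node.StorageState)" -ne 'Up') {
                $NotReady += $Node
            }
        }
    }
    end {
        foreach ($Node in $NotReady) {
            Write-Warning "$($Node.Name): cluster=$($Node.ClusterState), storage=$($Node.StorageState)"
        }
        $NotReady.Count -eq 0
    }
}

# Sample data standing in for real output (an assumption, not captured from a cluster)
$Sample = @(
    [pscustomobject]@{ Cluster = 'S2D-Cluster'; Name = 'S2DHost01'; ClusterState = 'Up'; StorageState = 'Up' }
    [pscustomobject]@{ Cluster = 'S2D-Cluster'; Name = 'S2DHost02'; ClusterState = 'Paused'; StorageState = 'InMaintenance' }
)
$Sample | Test-S2DNodeReady
```

Against a live cluster, the equivalent call would be Get-S2DNodeMaintenanceState -Cluster S2D-Cluster | Test-S2DNodeReady.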

Next, I have created a function to put a host into maintenance mode. By default it follows best practice: it checks that all virtual disks are healthy, pauses and drains the cluster node, and then puts the host’s disks into storage maintenance mode.
This function is called Enable-S2DNodeMaintenance and is designed to be run against a single host at a time.

# Enable maintenance mode on the local host
Enable-S2DNodeMaintenance

# Enable maintenance mode on a remote host
Enable-S2DNodeMaintenance -Name S2DHost01

# Pause a host's cluster service but not its storage
Enable-S2DNodeMaintenance -OnlyCluster

# Put a host's disks into Storage maintenance mode
Enable-S2DNodeMaintenance -OnlyStorage

The final action when performing maintenance is to bring a host out of maintenance mode, so we have Disable-S2DNodeMaintenance for that.

Like Enable-S2DNodeMaintenance, Disable-S2DNodeMaintenance is designed to be run against a single host at a time.

# Disable maintenance mode on the local host
Disable-S2DNodeMaintenance

# Disable maintenance mode on a remote host
Disable-S2DNodeMaintenance -Name S2DHost01

# Resume a host's cluster service but not its storage
Disable-S2DNodeMaintenance -OnlyCluster

# Bring a host's disks out of Storage maintenance mode
Disable-S2DNodeMaintenance -OnlyStorage
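
Putting the two together, a maintenance run on a remote host could be wrapped so that the host is always brought back out of maintenance mode afterwards. This is only a sketch: Invoke-S2DNodeMaintenance is a hypothetical name, it assumes the functions from this post are already loaded in the session, and whether you want a host resumed automatically after failed work is a judgment call for your environment.

```powershell
# Hypothetical wrapper: runs a script block while the node is in maintenance mode.
# Assumes Enable-S2DNodeMaintenance and Disable-S2DNodeMaintenance are already loaded.
function Invoke-S2DNodeMaintenance {
    [CmdletBinding()]
    param(
        [string]$Name = $Env:ComputerName,
        [Parameter(Mandatory)]
        [scriptblock]$ScriptBlock
    )
    # Pause the cluster node and put its disks into storage maintenance mode.
    # If this throws (for example, unhealthy volumes) nothing else runs.
    Enable-S2DNodeMaintenance -Name $Name | Out-Null
    try {
        # The maintenance work itself, e.g. installing updates or firmware
        & $ScriptBlock
    }
    finally {
        # Always attempt to bring the node back, even if the work failed
        Disable-S2DNodeMaintenance -Name $Name
    }
}
```

A call might look like Invoke-S2DNodeMaintenance -Name S2DHost01 -ScriptBlock { Restart-Computer -ComputerName S2DHost01 -Wait -For PowerShell -Force }. The wrapper is only useful from a management machine or another node, since a script block that reboots the local host would never reach the finally block.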

All of these new functions have been uploaded to GitHub to make them easier for me to maintain going forward. They live in the comnam90/bcthomas.com-scripts repository.

To get these into your environment, the below script will help you download the latest version of the functions and import them.

# Download location (the folder must already exist)
$FileLocation = 'C:\Scripts\S2D-Maintenance.ps1'
# Download link
$URL = "https://raw.githubusercontent.com/comnam90/bcthomas.com-scripts/master/Powershell/Functions/S2D-Maintenance.ps1"

# Download the file
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
Invoke-WebRequest -Uri $URL -UseBasicParsing -OutFile $FileLocation

# Import Functions for use
. $FileLocation
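
After dot-sourcing, a quick check that the three functions described in this post actually resolve can save some confusion later. The helper below is a sketch; Get-MissingCommand is a hypothetical name and is not part of S2D-Maintenance.ps1.

```powershell
# Hypothetical helper: returns the names that do not resolve to a command.
function Get-MissingCommand {
    param(
        [Parameter(Mandatory)]
        [string[]]$Name
    )
    foreach ($Command in $Name) {
        if (-not (Get-Command -Name $Command -ErrorAction SilentlyContinue)) {
            $Command
        }
    }
}

# The three function names covered in this post
$Missing = Get-MissingCommand -Name 'Get-S2DNodeMaintenanceState', 'Enable-S2DNodeMaintenance', 'Disable-S2DNodeMaintenance'
if ($Missing) {
    Write-Warning "Functions not loaded: $($Missing -join ', ')"
}
```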

So now hopefully everyone has a much easier way of ensuring that their hosts are correctly in maintenance mode when they need it.

Update (2019-05-26):
Additional commands have been added to the S2D-Maintenance.ps1 script since this article was initially posted. These commands will be explained in a follow-up blog series on best practices for patching S2D and Azure Stack HCI clusters.

Original Code

Function Enable-S2DNodeMaintenance {
    [cmdletbinding()]
    Param(
        [alias('ComputerName', 'StorageNodeFriendlyName')]
        [string]$Name = $Env:ComputerName,
        [switch]$OnlyCluster,
        [switch]$OnlyStorage
    )
    begin {
        Write-Verbose "Checking that the required powershell modules are available"
        $AvailableModules = Get-Module -ListAvailable -Verbose:$false
        if ( $AvailableModules.Name -inotcontains "FailoverClusters" -or $AvailableModules.Name -inotcontains "Storage" ) {
            throw "Required modules FailoverClusters and Storage are not available"
        }
    }
    process {
        Write-Verbose "Checking that all volumes are currently healthy before enabling maintenance"
        # Treat a volume as unhealthy if either status is off; @() keeps .Count reliable for a single result
        $UnhealthyDisks = @(Get-VirtualDisk -CimSession $Name | Where-Object {
                $_.HealthStatus -ine "Healthy" -or $_.OperationalStatus -ine "OK"
            })
        if ($UnhealthyDisks.Count -gt 0) {
            throw "Cannot enter maintenance mode as the following volumes are unhealthy`n$($UnhealthyDisks.FriendlyName -join ", ")"
        }
        if ( -not $OnlyStorage ) {
            Write-Verbose "Pausing and draining $Name"
            $ClusterName = Get-Cluster -Name $Name | Select-Object -ExpandProperty Name
            $NodeState = (Get-ClusterNode -Name $Name -Cluster $ClusterName).State
            if ( $NodeState -ieq 'Up' ) {
                Suspend-ClusterNode -Name $Name -Drain -Cluster $ClusterName -Wait | Out-Null
            }
            elseif ($NodeState -ieq 'Paused') {
                Write-Verbose "Skipping $Name as it is already paused"
            }
            else {
                Write-Warning "$Name is currently $NodeState"
            }
        }
        if ( -not $OnlyCluster ) {
            Write-Verbose "Enabling Storage Maintenance Mode on disks in $Name"
            $ScaleUnit = Get-StorageFaultDomain -Type StorageScaleUnit -CimSession $Name | Where-Object { $_.FriendlyName -eq $Name }
            $ScaleUnit | Enable-StorageMaintenanceMode -CimSession $Name | Out-Null
            Start-Sleep -Seconds 5
        }
    }
    end {
        Write-Verbose "Returning current state of host"
        Get-S2DNodeMaintenanceState -Name $Name
    }
}

Function Disable-S2DNodeMaintenance {
    [cmdletbinding()]
    Param(
        [alias('ComputerName', 'StorageNodeFriendlyName')]
        [string]$Name = $Env:ComputerName,
        [switch]$OnlyCluster,
        [switch]$OnlyStorage
    )
    begin {
        Write-Verbose "Checking that the required powershell modules are available"
        $AvailableModules = Get-Module -ListAvailable -Verbose:$false
        if ( $AvailableModules.Name -inotcontains "FailoverClusters" -or $AvailableModules.Name -inotcontains "Storage" ) {
            throw "Required modules FailoverClusters and Storage are not available"
        }
    }
    process {
        if ( -not $OnlyCluster ) {
            Write-Verbose "Disabling Storage Maintenance Mode on disks in $Name"
            $ScaleUnit = Get-StorageFaultDomain -Type StorageScaleUnit -CimSession $Name | Where-Object { $_.FriendlyName -eq $Name }
            $ScaleUnit | Disable-StorageMaintenanceMode -CimSession $Name | Out-Null
            Start-Sleep -Seconds 5
        }
        if ( -not $OnlyStorage ) {
            Write-Verbose "Resuming $Name and moving roles back"
            $ClusterName = Get-Cluster -Name $Name | Select-Object -ExpandProperty Name
            $NodeState = (Get-ClusterNode -Name $Name -Cluster $ClusterName).State
            if ( $NodeState -ieq 'Paused' ) {
                Resume-ClusterNode -Name $Name -Failback Immediate -Cluster $ClusterName | Out-Null
            }
            elseif ($NodeState -ieq "Up") {
                Write-Verbose "Skipping $Name as it is not currently paused"
            }
            else {
                Write-Warning "$Name is currently $NodeState"
            }
        }
    }
    end {
        Write-Verbose "Returning current state of $Name"
        Get-S2DNodeMaintenanceState -Name $Name
    }
}

Function Get-S2DNodeMaintenanceState {
    [cmdletbinding(DefaultParameterSetName = "Node")]
    Param(
        [parameter(ParameterSetName = "Node")]
        [alias('ComputerName')]
        [string[]]$Name = $Env:ComputerName,
        [parameter(ParameterSetName = "Cluster")]
        [string[]]$Cluster
    )
    begin {
        Write-Verbose "Checking that the required powershell modules are available"
        $AvailableModules = Get-Module -ListAvailable -Verbose:$false
        if ( $AvailableModules.Name -inotcontains "FailoverClusters" -or $AvailableModules.Name -inotcontains "Storage" ) {
            throw "Required modules FailoverClusters and Storage are not available"
        }
        if ($PSCmdlet.ParameterSetName -ieq 'Cluster') {
            Write-Verbose "Execution context: Cluster"
            Write-Verbose "Getting Node to Cluster mappings"
            $NodeMappings = @{ }
            $Name = foreach ($S2DCluster in $Cluster) {
                $Nodes = Get-ClusterNode -Cluster $S2DCluster | Select-Object -ExpandProperty Name
                Foreach ($Node in $Nodes) {
                    Write-Verbose "$Node is part of $S2DCluster"
                    $NodeMappings[$Node] = $S2DCluster
                }
                $Nodes
            }
        }
        $results = @()
    }
    process {
        Foreach ($ClusterNode in $Name) {
            Write-Verbose "$ClusterNode - Starting to gather state info"
            Write-Verbose "$ClusterNode - Gather physical disk details"
            $NodeDisks = Get-StorageSubSystem clu* -CimSession $ClusterNode |
                Get-StorageNode | Where-Object { $_.Name -ilike "*$ClusterNode*" } |
                Get-PhysicalDisk -PhysicallyConnected -CimSession $ClusterNode

            Write-Verbose "$ClusterNode - Gather Cluster Name"
            if ($PSCmdlet.ParameterSetName -ieq 'Node') {
                $ClusterName = Get-Cluster -Name $ClusterNode | Select-Object -ExpandProperty Name
            }
            else {
                $ClusterName = $NodeMappings[$ClusterNode]
            }
            Write-Verbose "$ClusterNode - Gather Cluster Node State"
            $ClusterNodeState = Get-ClusterNode -Name $ClusterNode -Cluster $ClusterName | Select-Object -ExpandProperty State
            Write-Verbose "$ClusterNode - Gather Storage Node State"
            $StorageNodeState = switch ($NodeDisks.Where( { $_.OperationalStatus -icontains "In Maintenance Mode" } ).Count) {
                0 { "Up"; Break }
                { $_ -lt $NodeDisks.Count } { "PartialMaintenance"; Break }
                { $_ -eq $NodeDisks.Count } { "InMaintenance"; Break }
                default { "UNKNOWN" }
            }
            Write-Verbose "$ClusterNode - Compile results"
            $Results += [pscustomobject][ordered]@{
                Cluster      = $ClusterName
                Name         = $ClusterNode
                ClusterState = $ClusterNodeState
                StorageState = $StorageNodeState
            }
        }
    }
    end {
        Write-Verbose "Return state details"
        $results
    }
}
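
The StorageState value in Get-S2DNodeMaintenanceState comes from comparing the number of disks reporting "In Maintenance Mode" with the total number of disks physically connected to the node. The standalone sketch below mirrors that switch statement so the mapping can be tried without a cluster. Get-StorageStateLabel and its parameter names are hypothetical and are not part of the script.

```powershell
# Hypothetical helper mirroring the switch statement in Get-S2DNodeMaintenanceState
function Get-StorageStateLabel {
    param(
        [int]$InMaintenance,  # disks reporting "In Maintenance Mode"
        [int]$Total           # all disks physically connected to the node
    )
    switch ($InMaintenance) {
        0 { 'Up'; break }
        { $_ -lt $Total } { 'PartialMaintenance'; break }
        { $_ -eq $Total } { 'InMaintenance'; break }
        default { 'UNKNOWN' }
    }
}

Get-StorageStateLabel -InMaintenance 0 -Total 12    # Up
Get-StorageStateLabel -InMaintenance 5 -Total 12    # PartialMaintenance
Get-StorageStateLabel -InMaintenance 12 -Total 12   # InMaintenance
```

A PartialMaintenance result after enabling or disabling maintenance mode is the case worth investigating, since it means only some of the node's disks changed state.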