If you’ve ever dealt with a SAN or Storage guy before, you’ll know that they usually have a huge passion for cache stats. This is because the secret sauce for accelerating cheap storage has, for years, been to stick a small amount of expensive but super fast flash in front of your slower spinning disk or, in recent years, your cheaper low-endurance SSDs. Because of this, it was always a good idea to keep an eye on how your cache was going, making sure things like cache misses were low and that your write cache wasn’t overallocated.
If you’re running a Storage Spaces Direct (S2D) Cluster, you might have noticed some instability in recent months, specifically when it comes to patching and performing maintenance. Well, you’re in luck, because 5 days ago Microsoft released a new KB article that helps explain why you might have seen issues. The scenario targeted by the Microsoft article is S2D Clusters running the May cumulative update (KB4103723) or later, where you experience Event ID 5120 during patching or maintenance, leading to things like CSV timeouts, VM pauses, or even VM crashes.
If you’ve been anywhere near Twitter or any tech blogs and news sites recently, you would have noticed that Microsoft have dropped their first cut of the next Long-Term Servicing Channel (LTSC) OS, Windows Server 2019, into the Windows Insider ring for people like you and me to start testing. Now most people (like me) don’t have a huge amount of spare hardware sitting around for times like this, especially for testing things like Storage Spaces Direct (S2D).
Hi all, quite often the best information on new technology is found in blog posts rather than in the documentation itself, and while the documentation for Storage Spaces Direct from Microsoft is great, some of the real gems are in the pre-GA blogs they put up. So below, I hope to keep up a list of essential blog posts from both Microsoft and independent bloggers for those of you who wish to really understand what’s happening under the hood!
UPDATE (2017-09-19): Microsoft have officially recognized the bug and have a KB describing the symptoms and workaround much like the below. See here: https://support.microsoft.com/en-us/help/4043361/disks-in-maintenance-mode-status-after-september-cumulative-update-kb
I was patching our dev cluster the other day and came across a new issue when applying the latest September Cumulative Update (KB4038782), and it seems others on the internet have hit this issue as well.
Background
First, a bit of background on the expected behaviour when performing maintenance:
I’ve been deploying a few Storage Spaces Direct (S2D) clusters lately, and I noticed a slight misconfiguration that can occur on deployment. Normally when deploying S2D, the disk types in the nodes are detected and the fastest disk (usually NVMe or SSD) is assigned to the cache, while the next fastest is used for the Performance Tier and the slowest is used for the Capacity Tier. So if you have NVMe, SSD and HDD, you would end up with an NVMe Cache, an SSD Performance Tier and an HDD Capacity Tier.
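To make that rule concrete, here's a minimal Python sketch of the "fastest to slowest" mapping described above. It is a toy model for illustration only, not how S2D detects or assigns disks internally: the `SPEED_RANK` table and the `assign_roles` function are hypothetical names I've made up, and the sketch only covers the three-media-type case mentioned in the text.

```python
# Toy model of the rule described above: fastest media type -> Cache,
# next fastest -> Performance Tier, slowest -> Capacity Tier.
# SPEED_RANK and assign_roles are hypothetical names for illustration;
# this is not S2D's actual detection logic.

# Lower rank means faster media (assumed ordering: NVMe > SSD > HDD).
SPEED_RANK = {"NVMe": 0, "SSD": 1, "HDD": 2}

ROLES = ["Cache", "Performance Tier", "Capacity Tier"]


def assign_roles(media_types):
    """Map the three distinct media types in a node to their expected roles."""
    # Collapse duplicates, then order the distinct types from fastest to slowest.
    present = sorted(set(media_types), key=lambda m: SPEED_RANK[m])
    if len(present) != len(ROLES):
        # The text only describes the three-type case, so refuse to guess.
        raise ValueError("this sketch only models nodes with three media types")
    return dict(zip(ROLES, present))


if __name__ == "__main__":
    disks = ["HDD", "HDD", "SSD", "NVMe", "HDD", "SSD"]
    for role, media in assign_roles(disks).items():
        print(f"{role}: {media}")
```

Comparing the output of something like this against what your cluster actually reports is a quick way to spot the kind of misconfiguration this post is about.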
I’ve used System Center Virtual Machine Manager for a few years now, and I’ve come to like its ups and deal with its downs. It has a lot of great features, like bare-metal deployments and logical networks, which, when executed correctly, are huge time savers and take away a lot of human error. Let’s start with some of the new additions in SCVMM 2016:
Converting ‘Standard Switches’ on hosts to Logical Switches
Too often in the past I’ve retrofitted VMM into a Hyper-V environment and had to wrestle with removing existing Standard Switches and replacing them with Logical Switches, which meant dealing with migrating VMs, losing connectivity, and rolling back.
We’ve all had the case where a volume is running hot on your cluster and you spend ages wrestling with perf counters to try to find the VM that’s causing your storage to burn. Well, let me introduce you to a magical new command in Windows Server 2016: Get-StorageQoSFlow. This miracle command can give you insights on all the VHD(x)s running on your cluster, revealing IOPS, latency and bandwidth stats for them all without the need for large-scale monitoring solutions.
As many of you would have seen, Windows Server 2016 has been officially launched, with evaluation media available and General Availability slated for later this month. One of the great new features in this release is Storage Spaces Direct, a Software-Defined Storage solution. There is already plenty of information available on TechNet on how to get this up and running, but I thought I’d share some of the operational tasks that aren’t so obvious, starting with expanding volumes.
While sitting in LAX Airport with some time to kill, I thought I’d reflect on the last week. I’ve spent it at the Microsoft Ignite conference in Atlanta, and what a crazy but fantastic week it has been. It started with only finding out 2 weeks before the conference that I was invited to speak, continued with frantically trying to sort accommodation and flights, and finally ended with speaking at the conference itself.