VMware VSAN – Why VSAN (for vSphere)?

I don’t really use my blog for product marketing or as a portal for adverts for random products. It’s purely for me to blog about technologies I think are cool and awesome, and why I think they are really worth looking into. On that note, I’ve wanted to write a quick blog post about VMware VSAN ever since the first version was released with vSphere 5.5 a while back, because I was really excited about the technology and what it could do as it went through the typical evolution cycle. But at the same time, I didn’t want to come across as aiding the marketing of a brand new technology that I hadn’t seen performing in real life. So I kinda reined myself in a little from blogging about it, as I wanted to sit back and wait to see how well it performs out in the real world and whether the architecturally sound technology would actually live up to its reputation & potential out in the field.

And guess what? It sure has lived up to it... To be honest, even far better than I thought... and with the most recent release (version 6.1, which shipped with vSphere 6.0 Update 1), it’s grown in its enterprise capabilities significantly as well. The latest features, such as stretched VSAN clusters (adios Metro Clusters for vSphere), a branch office solution (VSAN ROBO), VSAN replication, SMP FT support, Windows failover clustering support and Oracle RAC support, etc. (more details here), have truly made it an enterprise storage solution for vSphere. And with the massive uptake of HCI (Hyper-Converged Infrastructure) solutions by customers, where VSAN is also a key part (think VMware EVO:RAIL), as well as a global base of over 2,500 customers who are already using it in production as their storage solution of choice for vSphere (some of the key ones include Walmart, Air France, BAE, Adobe & a well known, global social media site), it’s about time I started writing something about it, just to give you my perspective!

I will aim to put together a series of articles about VSAN, addressing a number of different aspects of it over the course of the next few weeks, beginning with the obvious, below.

Why VSAN?

I’ve been a traditional SAN storage guy out in the field, where I’ve worked hands on with key enterprise SAN storage tech from NetApp, EMC, HP, etc. for a long time. I’ve worked with these in all aspects, from presales and design through to deployment and ongoing support. They are all very good, I still like (some of) their tech, and they sure do still have a definite place in the datacenter. But they are a nightmare to size accurately, a nightmare to design and implement, and an even bigger nightmare to support when in production use, and that’s just from a techie’s perspective. From a business / commercial perspective, not only are they expensive to buy upfront and maintain, but they typically come with an inevitable vendor lock-in that keeps you on the hook for 2-5 years, during which you have to buy substantially overpriced components for simple capacity upgrades. They are also very expensive to support (support costs are typically 17%-30% of the cost of the SAN), and can be even more expensive once the originally purchased support period runs out, because the SAN vendor would typically make the support renewal cost more than buying a new SAN, forcing you to buy another. I suppose this is how the storage industry has always managed to pay for itself to keep innovating & survive, but many customers and even startup SAN vendors are waking up to this trick and have now started to look at alternative offerings with a different commercial setup.

As an experienced storage guy, I can tell you first hand that the value of enterprise SAN storage is NOT really in the tin (disk drives or the blue / orange lights) but in the software that manages those tin elements. Legacy storage vendors make you pay for that intelligence once when you buy the SAN with its controllers (the brains, where this software lives), and then again every time you add disk shelves, through guaranteed overpriced shelf upgrades (ever heard your sales person tell you to estimate all your storage needs for the next 5 years and buy it all up front with your SAN as it’s cheaper that way??). SAN vendors have been able to overcharge for subsequent shelf upgrades simply because they have managed to get the disk drive manufacturers to inject some special code (proprietary drivers) into the disk firmware, without which their SAN will not recognise the disks. So the customer cannot just go and buy a similar disk elsewhere, even if it is the same disk made by the same end manufacturer (vendor lock-in). This overpricing is how the SAN vendor gets the customer to pay for their software intelligence again, every time capacity is added. I mean, think about it: you’ve already paid for the damn SAN and its software IP when buying the SAN in the first place, so why pay for it again by paying over the odds when adding some more shelves (which, after all, only contain disk drives with no intelligence) to expand its capacity?

To make it even worse, the SAN vendor then comes up with a brand new version of the SAN in a few years’ time (typically in the form of new software that cannot run on the hardware you currently have, or a brand new SAN hardware platform altogether). Your current SAN SW has now been made end of life and is therefore no longer in support (even though it’s still working fine). Now you are stuck in an artificially created scenario (created by the SAN vendor, of course, and forced upon you) where you cannot carry on running your existing version without paying a hefty support renewal fee (often artificially bloated by the vendor to be more expensive than a new HW SAN), nor can you simply upgrade the software on your current hardware platform, as the vendor does not support the new SW on it. And transferring the software license you’ve already bought over to a new set of hardware (new SAN controllers) is strictly NOT allowed either (a carefully orchestrated and very convenient scenario for the SAN vendor, isn’t it?). Enter the phrase “SAN upgrade”: a disruptive, laborious and, worst of all, unnecessary expense where you are indirectly forced by the vendor to pay again, on a different set of hardware (the new SAN), for the same software intelligence that you’ve supposedly already paid for. This is a really good business model for the SAN vendor, and there’s also a whole ecosystem of organisations that benefit massively from this recurring (arguably never ending) procurement cycle, at the expense of the customer.

I see VMware VSAN as one of the biggest answers to this, for the vSphere shared storage use cases. With VMware VSAN, you have the freedom to choose your hardware, including cheaper commodity hardware, so you only pay the true cost of a disk drive based on its capacity, without also paying a surcharge for the software intelligence every time you add a disk drive. VSAN is licensed per CPU socket instead of per capacity unit (MB/GB/TB), so you pay for the software intelligence once, during the initial procurement, irrespective of the actual capacity, and that’s it. For every scale up requirement (adding capacity), you simply buy the disk drives at their true cost and add them to the existing nodes. If you need to scale out (add more nodes), you then pay for the CPU sockets on the additional node(s). That, to me, sounds a whole lot fairer than the traditional SAN vendors’ model of charging for the software upfront and then charging for it again indirectly during every capacity upgrade & SAN upgrade. Unlike with traditional SAN vendors, every time a new version of the (VSAN) software comes out, you upgrade your ESXi version, which is totally free of charge (if you have ongoing support), so you never have to pay for the software intelligence again. Even when ESXi host hardware needs replacing in future, you can reuse the VSAN licensing on the new HW nodes, which is something traditional SAN vendors don’t let you do.
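The difference between scaling up and scaling out under per-socket licensing can be sketched in a few lines of Python. This is only an illustration of the model as I’ve described it: the function names and every price used here are made up for the example and are not VMware list prices or real disk prices.

```python
# Illustrative sketch of the per-socket licensing model described above.
# All prices are hypothetical placeholders, not real VMware or disk pricing.

def scale_up_cost(disks_added: int, disk_price: float) -> float:
    """Adding disks to existing nodes: pay for the disks only, no new licenses."""
    return disks_added * disk_price


def scale_out_cost(nodes_added: int, sockets_per_node: int,
                   license_per_socket: float, node_hw_price: float) -> float:
    """Adding nodes: pay for the node hardware plus one license per CPU socket."""
    licenses = nodes_added * sockets_per_node * license_per_socket
    hardware = nodes_added * node_hw_price
    return licenses + hardware


if __name__ == "__main__":
    # Ten extra disks at a hypothetical 400 each: no license cost involved.
    print(scale_up_cost(10, 400.0))
    # One extra dual-socket node: hypothetical hardware price plus two licenses.
    print(scale_out_cost(1, 2, 2500.0, 8000.0))
```

The point of the sketch is simply that the license term only appears in the scale out function: capacity added to existing nodes never touches it.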

Typically, due to all these reasons, a legacy HW SAN would cost around $7 – $10 per GB, whereas VSAN tends to be around the $1 – $2 per GB mark, based on the data I’ve seen.

A simple example of an upfront cost comparison is below. Note that this only shows the difference in upfront cost (CAPEX) and doesn’t take into account the ongoing cost differences, which make VSAN even more appealing for the reasons explained above.

[Image: upfront cost comparison]
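To put the per-GB figures quoted above into perspective, here is a rough back-of-the-envelope calculation in Python. The per-GB ranges are the ones I mentioned; the 50 TB capacity, the decimal TB-to-GB conversion and the function name are assumptions picked purely for illustration.

```python
# Back-of-the-envelope upfront cost ranges from the per-GB figures in the post.
# The 50 TB example capacity and 1 TB = 1000 GB conversion are assumptions.

GB_PER_TB = 1000

LEGACY_SAN_PER_GB = (7.0, 10.0)  # USD per GB (low, high), as quoted in the post
VSAN_PER_GB = (1.0, 2.0)         # USD per GB (low, high), as quoted in the post


def cost_range(capacity_tb: float, per_gb_range: tuple) -> tuple:
    """Return the (low, high) upfront cost in USD for a given capacity in TB."""
    low, high = per_gb_range
    capacity_gb = capacity_tb * GB_PER_TB
    return (capacity_gb * low, capacity_gb * high)


if __name__ == "__main__":
    print("Legacy SAN:", cost_range(50, LEGACY_SAN_PER_GB))  # (350000.0, 500000.0)
    print("VSAN:      ", cost_range(50, VSAN_PER_GB))        # (50000.0, 100000.0)
```

Treat the output as an order-of-magnitude comparison only, since real quotes depend heavily on the hardware and support options chosen.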

Enough of the commercial & business justification as to why VSAN is better. Let’s look at a few of the technology & operational benefits.

  • It’s flexible
    • VSAN being a software defined storage solution gives the customer the much needed flexibility where you are no longer tied in to a particular SAN vendor.
    • You no longer have to buy expensive EMC or NetApp disk shelves either as you can go procure commodity hardware to design your DC environment as you see fit
  • It’s a technically better storage solution for vSphere
    • Since the VSAN drivers are built into the ESXi kernel itself (the hypervisor), it’s directly in the IO path of the VMs, which gives it superior performance with sub-millisecond latency
    • It is also tightly integrated with other beloved vSphere features such as vMotion, HA, DRS and Storage vMotion, as well as other VMware Software Defined Datacenter products such as vRealize Automation and vSphere Replication.
  • Simple and efficient to manage
    • Simple setup (few clicks) and policy based management, all defined within the same single pane of glass used for vSphere management
    • No need for expensive storage admins to manage and maintain a complex 3rd party array
    • If you know vSphere, you pretty much know VSAN already
    • No need to manage “LUNs” anymore – If you are a storage admin, you know what a nightmare this is, including the overhead of the management of the HW fabric too.
  • Large scale out capability
    • Supports up to 64 nodes currently (the 64 node limit is NOT from VSAN but from the underlying vSphere. This will go up with future versions of vSphere)
    • 6,400 VMs / 7M IOPS / 8.8 petabytes
  • High availability
    • Provides 99.999% availability by default
    • No single point of failure due to its distributed architecture
    • Scaling out (adding nodes) or scaling up (adding disks) does not require downtime ever again.
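
To give a feel for what a “five nines” availability figure means in practice, the percentage can be converted into permitted downtime per year. This is a minimal Python sketch of that arithmetic only; the function name and the use of an average 365.25-day year are my own assumptions, and it says nothing about how the figure itself is achieved or measured.

```python
# Convert an availability percentage into allowed downtime per year.
# Assumes an average year of 365.25 days; purely illustrative arithmetic.

MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes of downtime per year implied by an availability %."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        print(f"{pct}% -> {downtime_minutes_per_year(pct):.2f} minutes/year")
```

For 99.999% this works out to a little over five minutes of downtime a year.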

This list can go on, but before this whole post ends up looking like a product advert on behalf of VMware, I’m going to stop, as I’m sure you get my point here…

VMware VSAN, to me, now looks like a far more attractive proposition for vSphere private cloud solutions than having to buy a 3rd party SAN. Some of the new features that will be coming out in the future (NSX integration, etc.) will no doubt make it an even stronger candidate for most vSphere storage requirements going forward. As a technology it’s sound, backed by one of the most innovative companies on the planet, and designed from the ground up to work without the overhead of a file system (WAFL people might not like this too much, sorry guys!), and I would keep a keen eye on how VMware VSAN eats into a lot of the typical vSphere storage revenue of the legacy hardware SAN vendors over the next few years. Who knows, EMC may well have seen this coming some time ago, which may have contributed to the decision to merge with Dell too.

If you have a new vSphere storage requirement, my advice would be to strongly consider the use of VSAN as your first choice.

In the next post of this series, I will attempt to explain & summarise the VSAN sizing and design guidelines.

Cheers

Chan