Impact from Public Cloud on the storage industry – An insight from SNIA at #SFD12

As a part of the recently concluded Storage Field Day 12 (#SFD12), we traveled to one of the Intel campuses in San Jose to listen to the Intel Storage software team about future of storage from an Intel perspective (You can read all about here).  While this was great, just before that session, we were treated to another similarly interesting session by SNIA – The Storage Networking Industry Association and I wanted to brief everyone on what I learnt from them during that session which I thought was very relevant to everyone who has a vested interest in field of IT today.

The presenters were Michael Oros, Executive Director at SNIA along with Mark Carlson who co-chairs the SNIA technical council.

Introduction to SNIA

SNIA is a non-profit organisation that was formed 20 years ago to deal with inter-operability challenges of network storage by various different tech vendors. Today there are over 160 active member organisations (tech vendors) who work together behind closed doors to set standards and improve inter-operability of their often competing tech solutions out in the real world. The alphabetical list of all SNIA members are available here and the list include key network and storage vendors such as Cisco, Broadcom, Brocade, Dell, Hitachi, HPe, IBM, Intel, Microsoft, NetApp, Samsung & VMware. Effectively, anyone using any and most of the enterprise datacenter technologies have likely benefited from SNIA defined industry standards and inter-operability

Some of the existing storage related initiatives SNIA are working on include the followings.

 

 

Hyperscaler (Public Cloud) Storage Platforms

According to SNIA, Public cloud platforms, AKA Hyperscalers such as AWS, Azure, Facebook, Google, Alibaba…etc are now starting to make an impact on how disk drives are being designed and manufactured, given their large consumption of storage drives and therefore the vast buying power. In order to understand the impact of this on the rest of the storage industry, lets clarify few key basic points first on these Hyperscaler cloud platforms first (for those didn’t know)

  • Public Cloud providers DO NOT buy enterprise hardware components like the average enterprise customer
    • They DO NOT buy enterprise storage systems (Sales people please read “no EMC, no NetApp, No HPe 3par…etc.”)
    • They DO NOT buy enterprise networking gear (Sales people please read “no Cisco switches, no Brocade switches, HPe switches…etc”.)
    • They DO NOT buy enterprise servers from server manufacturers (Sales people please read “no HPe/Dell/Cisco UCS servers…etc.)
  • They build most things in-house
    • Often this include servers, network switches…etc
  • They DO bulk buy disk drives direct from the disk manufacturers & uses home grown Software Defined Storage techniques to provision that storage.

Now if you think about it, large enterprise storage vendors like Dell and NetApp who normally bulk buy disk drives from manufacturers such as Samsung, Hitachi, Seagate…etc would have had a certain level of influence over how their drives are made given the economies of scale (bulk purchasing power) they had. However now, Public cloud providers who also bulk buy, often quantities far bigger than those said storage vendors would have, also become hugely influential over how these drives are made, to the level that their influence is exceeding that of those legacy storage vendors. This influence is growing such that they (Public Cloud providers) are now having a direct input towards the initial design of the said components (i.e disk drives…etc.) and how they are manufactured, simply due to the enormous bulk purchasing power as well as the ability they have to test drive performance at a scale that was not even possible by the drive manufacturers before,  given their global data center footprint.

 

Expanding on the focus these guys have on Software Defined storage technologies to aggregate all these disparate disk drives found in their servers in the data center is inevitably leading to various architectural changes in how the disk drives are required to be made going forward. For example, most legacy enterprise storage arrays would rely on the old RAID technology to rebuild data during drive failures and there are various background tasks implemented in the disk drive firmware such as ECC & IO re-try operations during failures which adds to the overall latency of the drive. However with modern SDS technologies (in use within Public Cloud platforms as well as some new enterprise SDS vendors tech), there are multiple copies of data held on multiple drives automatically as a part of the Software Defined Architecture (i.e. Erasure Coding) which means those specific background tasks on disk drives such as ECC, and re-try mechanism’s are no longer required.

For example, SNIA highlighted Eric Brewer, the VP of infrastructure of Google who talked about the key metrics for a modern disk drive to be,

  • IOPS
  • Capacity
  • Lower tail latency (long tail of latencies on a drive, arguably caused due to various background tasks, typically causes a 2-10x slower response time from a disk in a RAID group which causes a disk & SSD based RAID stripes to experience at least a single slow drive 1.5%-2.2% of the time)
  • Security
  • Lower TCO

So in a nutshell, Public cloud platform providers are now mandating various storage standards that disk drive manufacturers have to factor in to their drive design such that the drives are engineered from ground up to work with Software Defined architecture in use at these cloud provider platforms.  What this means most native disk firmware operations are now made redundant and instead the drive manufacturer provides an API’s through which cloud platform provider’s own software logic will control those background operations themselves based on their software defined storage architecture.

Some of the key results of this approach includes following architectural designs for Public Cloud storage drives,

  • Higher layer software handles data availability and is resilient to component failure so the drive firmware itself doesn’t have to.
    • Reduces latency
  • Custom Data Center monitoring (telemetry), and management (configuration) software monitors the hardware and software health of the storage infrastructure so the drive firmware doesn’t have to
    • The Data Center monitoring software may detect these slow drives and mark them as failed (ref Microsoft Azure) to eliminate the latency issue.
    • The Software Defined Storage then automatically finds new places to replicate the data and protection information that was on that drive
  • Primary model has been Direct Attached Storage (DAS) with CPU (memory, I/O) sized to the servicing needs of however many drives of what type can fit in a rack’s tray (or two) – See the OCP Honey Badger
  • With the advent of higher speed interfaces (PCI NVMe) SSDs are moving off of the motherboard onto an extended PCIe bus shared with multiple hosts and JBOF enclosure trays – See the OCP Lightning proposal
  • Remove the drives ability to schedule advanced background operations such as Garbage collection, Scrubbing, Remapping, Cache flushes, continuous self tests…etc on its own and allow the host to affect the scheduling of these latency increasing drive maintenance operations when it sees fit – effectively remove the drive control plane and move it up to the control of the Public Cloud platform (SAS = SBC-4 background Operation Control, SATA = ACS-4 advanced background operaitons feature set, NVMe = Available through NVMe sets)
    • Reduces unpredictable latency fluctuations & tail latency

The result of all these means Public Cloud platform providers such as Microsoft, Google, Amazon are now also involved at setting industry standards through organisations such as SNIA, a task previously only done by hardware manufacturers. An example is the DePop standard which is now approved at T13 which essentially defines a standard where the storage host will shrink the usable size of the drive by removing the poor performing (slow) physical elements such as drive sectors from the LBA address space rather than disk firmware. The most interesting part is that the drive manufacturers are now required to replace the drives when enough usable space has shrunk to match the capacity of a full drive, without necessarily having the old drive back (i.e. Public cloud providers only pay for usable capacity and any unusable capacity is replenished with new drives) which is a totally different operational and a commercial model to that of legacy storage vendors who consume drives from drive manufacturers.

Another concept that’s pioneered by the Public cloud providers is called Streams which maps lower level drive blocks with an upper level object such as a file that reside on it, in a way that all the blocks making the file object are stored contiguously. This simplifies the effect of a TRIM or a SCSI UNMAP command (executed when the file is deleted from the file system) which reduces delete penalty and causes lowest amount of damage to SSD drives, extending their durability.

 

Future proposals from Public Cloud platforms

SNIA also mentioned about future focus areas from these public cloud providers such as,

  • Hyperscalers (Google, Microsoft Azure, Facebook, Amazon) are trying to get SSD vendors to expose more information about internal organization of the drives
    • Goal to have 200 µs Read response and 99.9999% guarantee for NAND devices
  • I/O Determinism means the host can control order of writes and reads to the device to get predictable response times –
    • Window of reading – deterministic responses
    • Window of writing and background – non-deterministic responses
  • The birth of ODM – Original Design Manufacturers
    • There is a new category of storage vendors called Original Design Manufacturer (ODM) direct who package up best in class commodity storage devices into racks according to the customer specifications and who operate at much lower margins.
    • They may leverage hardware/software designs from the Open Compute Project (OCP) or a similar effort in China called Scorpio, now under an organization called the Open Data Center Committee (ODCC), as well from as other available hardware/software designs.
    • SNIA also mentioned about few examples of some large  global enterprise organisations such as a large bank taking the approach of using ODM’s to build a custom storage platform achieving over 50% cost savings over using traditional enterprise storage

 

My Thoughts

All of these Public Cloud platform introduced changes are set to collectively change the rest of the storage industry too and how they fundamentally operate which I believe would be good for the end customers. Public cloud providers are often software vendors who approaches every solution with a software centric solution and typically, would have highly cost efficient architecture of using cheapest commodity hardware with underpinned by intelligent software. This will likely re-shape the legacy storage industry too and we are already starting to see the early signs of this today through the sudden growth of enterprise focused Software Defined Storage vendors and legacy storage vendors struggling with their storage revenue. All public cloud computing and storage platforms are a continuous evolution for the cost efficiency and each of their innovation in how storage is designed, built & consumed will trickle down to the enterprise data centers in some shape or form to increase overall efficiencies which surely is only a good thing, at least in my view. And smart enterprise storage vendors that are software focused, will take note of such trends and adopt accordingly (i.e. SNIA mentioned that NetApp for example, implemented the Stream commands on the front end of their arrays to increase the life of the SSD media), where as legacy storage / hardware vendors who are effectively still hugging their tin, will likely find the future survival more than challenging.

Also, the concept of ODM’s really interest me and I can see the use of ODM’s increasing further as more and more customers will wake up to the fact that they have been overpaying for their storage for the past 10-20 years in the data center due to the historically exclusive capabilities within the storage industry. With more of a focus on a Software Defined approach, there are large cost savings to be had potentially through following the ODM approach, especially if you are an organisation of that size that would benefit from the substantial cost savings.

I would be glad to get your thoughts, through comments below

 

If you are interested in the full SNIA session, a recording of the video stream us available here and I’d recommend you watch it, especially if you are in the storage industry.

 

Thanks

Chan

P.S. Slide credit goes to SNIA and TFD

Storage Field Day (#SFD12) – Vendor line up

Following on from my previous post about a quick intro to Storage Field Day (#SFD12) that I was invited to attend in San Jose this week as an independent thought leader, I wanted to get a quick post out on the list of vendors we are supposed to be seeing. If you are new to what Tech Field Day / Storage Field Day events are, you’ll also find an intro in my above post.

The event is starting tomorrow and I am currently waiting for my flight to SJC at LHR, and its fair to say I am really looking forward to attending the event. Part of that excitement is due to being given the chance to meet a bunch of other key independent thought leaders, community contributors, Technology evangelists from around the world as well as the chance to meet Stephen Foskett (@SFoskett) and the rest of the #TFD crew from Gestalt IT (GestaltIT.com) at the event. But most of that excitement for me is simply due to the awesome (did I say aaawwwesommmmmmeee?) list of vendors that we are supposed to be meeting with to discuss their technologies.

The full list & event agenda goes as follows

Wednesday the 8th

  • Watch the live streaming of the event @ https://livestream.com/accounts/1542415/events/6861449/player?width=460&height=259&enableInfoAndActivity=false&defaultDrawer=&autoPlay=false&mute=false
  • 09:00 – MoSMB presentation
    • MoSMB is a fully compliant, light weight adaptation of SMB3 made available as proprietory offering by Ryussi technologies. In effect its a BMS3 server on Linux & Unix systems. They are not a technology I had come across before so really looking forward to getting to know more about them and their offerings and their partnership with Microsoft…etc.
  • 10:00 – StarWind Presents
    • Again, new technology to me personally, which appears to be a Hyper-Converged appliance that seem to unify commodity server disks and flash with multiple hypervisors. Hyper-Converged platforms are very much of interest to me and I know the industry leading offerings on this front such as VMware VSAN & Nutanix fairly well. So its good to get to know these guys too and understanding what are their Unique Selling Points / differentiators to the big boys.
  • 13:00 – Elastifile Presents
    • Elastic Loud File System from Elastafile is supposed to be able to provide application level distributed file / object system spanning private cloud and public cloud to provide a hybrid cloud data infrastructure. This one is again new to me so keen to understand more about what makes them different to other similar distributed object / storage solutions such as HedVig / Scality from my perspective. Expect my analysis blog post on this one after I’ve met up with them for my initial take!
  • 16:00 – Excelero Presents (hosted at Excelero office in the Silicon Valley)
    • These guys are a new vendor that is literally due to launch themselves on the same day as we speak to them. Effectively they don’t exists quite yet. So quite exciting to find out who they are what they’ve got to offer us in this increasingly growing, rapidly changing world of enterprise IT.
  • 19:00 – Dinner and Reception (Storage Cocktails?) with presenters and friends at Loft Bar and Bistro in San Jose
    • Good networking event with the presenters from the day for peer to peer networking and further questioning on what we’ve heard from them during the day.

Thursday the 9th of March

  • 08:00 (4pm UK time) – Nimble Storage Presents
    • Nimble are a SAN vendor that I am fairly familiar with and have known them for a fairly long time and I also have few friends that work at Nimble UK. To be fair, I was never a very big fan of Nimble personally as a hybrid SAN vendor as I was  more a NetApp, EMC, HPe 3Par kinda person for hybrid SAN offering which I’ve always thought offer the same if not better tech for roughly a similar price point, with the added benefit of being large established vendors. Perhaps I can use this session to understand where Nimble is heading now as an organisation and what differentiators / USP’s they may have compared to big boys and how they plan to stay relevant in an industry which is generally in decline as a whole.
  • 10:45 – NetApp Presents (At NetApp head office in Silicon Valley)
    • Now I know a lot about NetApp :-). NetApp was my main storage skill in the past (still is to a good level) and I have always been very close to most NetApp technologies, from both presales and deliver perspective and was also awarded as the NetApp partner System Engineer of the Year (2013) for UK & Ireland by NetApp. However since the introduction of cDOT properly to their portfolio, I’ve always felt like they’ve lost a little market traction a little. I’m very keen to listen to NetApp’s current messaging and understand where their heads are at, and how their new technology stack including SolidFire is going to be positioned against other larger vendors such as Dell EMC, HPe 3Par as well as all the disruption from Software Defined storage vendors.
  • 12:45 (20:45 UK time) – Lunch at NetApp with Dave Hitz
    • Dave Hitz  (@DaveHitz) who was the NetApp founder is a legend… Nuff said!
  • 14:00 – Datera Presents
    • Datera is a high performance elastic block storage vendor and is again quite new to me. So looking forward to understanding more about what they have to offer.
  • 19:30 – San Jose Sharks hockey game at SAP Center
    • Yes, its an evening watching a bit of Ice Hockey which, I’ve never done before. To be clear, Ice Hockey is not one of my favourite sports but happy to take part in the event :0).

Friday the 10th of March

  • 09:00 (17:00 UK time) – SNIA Presents (@Intel Head office)
    • The Storage Networking Industry Association is a non profit organisation made up of various technology vendor companies.
  • 10:30 (18:30 UK time) – Intel Presents (@Intel Head office)
    • I don’t think I need to explain / introduce Intel to anyone. If I must, they kinda make some processors :-). Looking forward to visiting Intel office in the valley.

All and all, its an exciting line up of vendors and some old and some new vendors which I’m looking forward to meeting.

Exciting stuff, cant wait…! Now off to board the flight. See you on the other side!

Chan