Cohesity: A secondary storage solution for the Hybrid Cloud?

Background

A key part of my typical day job involves staying on top of new technologies and key developments in the world of enterprise IT, with an aim to spot commercially viable, disruptive technologies that are not just cool tech but also have a good business value proposition with a sustainable use case.

To this effect, I’ve been following Cohesity with interest since its arrival in the mainstream market back in 2015, keeping up to date with their platform developments and feature upgrades such as v2.0, v3.0, etc. SFD15 gave me another opportunity to catch up with them and get an up-to-date view of their latest offerings and future direction. I liked what I heard from them! Their solution now looks interesting, their marketing message is a little sharper than it was a while ago, and I like the direction they are heading in.

Cohesity: Overview


Cohesity claims to be a specialist, software defined, secondary storage vendor focused on modernizing the secondary storage tier within the hybrid cloud. Such secondary storage requirements typically include copies of your primary / tier 1 data sets (such as test & dev VM data and reporting & analytics data) or file shares (CIFS, NFS, etc.). These types of data tend to be quite large and therefore typically cost more to store and process, so storing them on the same storage solution as your tier 1 data can be unnecessarily expensive. I can relate to this, both as an enterprise storage customer and as a channel SE in my past lives, where I was involved in sizing and designing various storage solutions for my customers. Most enterprise customers need separate, dedicated storage solutions to store such data outside of the primary storage cluster, but their only choice is often the same expensive primary storage vendors. Cohesity offers a single, tailor-made secondary data platform that spans both ends of the hybrid cloud to address all these secondary storage requirements. It can also act as a hybrid cloud backup storage target, with some added data management capabilities on top, so that customers can not only store data backups but also do interesting things with that backup data across the full hybrid cloud spectrum.

With what appears to be decent growth last year (600% revenue growth YoY) and some good customers already onboard, it appears that customers may be taking notice too.

Cohesity: Solution Architecture


A typical Cohesity software defined storage (SDS) solution on-premises comes as an appliance and can start with 3 nodes to form a cluster that provides linearly scalable growth. An appliance will typically be a 2U chassis that accommodates 4 nodes, and any commodity or OEM HW platform is supported. Storage itself consists of PCI-e flash (up to 2TB per node) + capacity disk, which is the typical storage architecture of every SDS manufacturer these days. Again, similar to most other SDS vendors, Cohesity uses erasure coding or RF2 data sharding across the Cohesity nodes (within each cluster) to provide data redundancy, as a part of the SpanFS file system. Note that given its main purpose as a secondary storage unit, it doesn’t have (or need) an all-flash offering, though they may move into the primary storage use case, at least indirectly, in the future.
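To make the redundancy trade-off a little more concrete, here is a minimal back-of-the-envelope sketch in Python. It is generic storage arithmetic, not Cohesity’s own sizing method: the 4+2 erasure coding layout and the 24TB of raw capacity per node are purely my assumptions for illustration, and the function names are hypothetical.

```python
def usable_fraction_replication(copies: int) -> float:
    """Fraction of raw capacity left when every block is stored `copies` times.

    RF2 means two copies, so half of the raw capacity is usable.
    """
    return 1.0 / copies


def usable_fraction_erasure(data_stripes: int, parity_stripes: int) -> float:
    """Fraction of raw capacity left with a k+m erasure coding layout."""
    return data_stripes / (data_stripes + parity_stripes)


def usable_capacity_tb(nodes: int, raw_tb_per_node: float, fraction: float) -> float:
    """Usable capacity of a cluster before dedupe and compression savings."""
    return nodes * raw_tb_per_node * fraction


if __name__ == "__main__":
    # Hypothetical 3 node starter cluster with 24TB raw per node (assumed figures).
    rf2 = usable_capacity_tb(3, 24.0, usable_fraction_replication(2))
    # Hypothetical 4+2 layout; the actual scheme a vendor uses may differ.
    ec = usable_capacity_tb(3, 24.0, usable_fraction_erasure(4, 2))
    print(f"RF2 usable: {rf2:.1f} TB, EC 4+2 usable: {ec:.1f} TB")
```

The point is simply that erasure coding usually leaves more usable capacity than keeping two full copies, at the cost of more work during writes and rebuilds.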

The Cohesity storage solution can be deployed to remote and branch office locations as well as to cloud platforms, using virtual Cohesity appliances that work hand in hand with the on-premises cluster. Customers can then enable cross-cluster data replication and various other integration / interaction activities, in a similar way to how NetApp Data Fabric works for primary data, for example. Note however that Cohesity does not yet permit the configuration of a single cluster across platforms (where you can deploy nodes from the same cluster on premises as well as in the cloud, with erasure coding performing the data replication, in the way the Hedvig storage solution permits, for example), but it was hinted to us that this is in the works for a future release.

Cohesity also has some analytics capabilities built into the platform, which can be handy. The analytics engine uses MapReduce natively to avoid the need to build external analytics-focused compute clusters (such as Hadoop clusters) and to move (duplicate) data sets to them for analysis. The Analytics Workbench on the Cohesity platform currently permits external custom code to be injected into the platform. This can be used to search the contents of files held on the Cohesity platform, including pattern matching that lets customers search for social security or credit card numbers, which would be quite handy for enforcing regulatory compliance. During the SFD15 presentation, it was explained to us that the capabilities of this platform are being rapidly enhanced to support additional regulatory compliance policy enforcement, such as that required by GDPR. Additional information on Cohesity Analytics capabilities can be found here. An additional video explaining how this works can also be found here.
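To illustrate the kind of pattern matching described above, here is a minimal map / reduce style sketch in plain Python. To be clear, this is not the Analytics Workbench API (which I haven’t reproduced here); the function names, the regular expressions and the in-memory "files" dictionary are all my own assumptions, meant only to show the general idea of a map step that flags matches per file and a reduce step that totals them.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Assumed patterns: US style SSNs and 13 to 16 digit card numbers
# (digits optionally separated by spaces or dashes).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")

Key = Tuple[str, str]  # (file name, kind of match)


def luhn_valid(number: str) -> bool:
    """Luhn checksum, used to cut down false positives on card numbers."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def map_file(name: str, text: str) -> Iterator[Tuple[Key, int]]:
    """Map step: emit ((file, kind), 1) for every sensitive-looking match."""
    for _ in SSN_RE.finditer(text):
        yield (name, "ssn"), 1
    for match in CARD_RE.finditer(text):
        if luhn_valid(match.group()):
            yield (name, "card"), 1


def reduce_counts(pairs: Iterable[Tuple[Key, int]]) -> Dict[Key, int]:
    """Reduce step: sum the emitted counts per (file, kind) key."""
    totals: Dict[Key, int] = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def scan(files: Dict[str, str]) -> Dict[Key, int]:
    """Run the map step over every file, then reduce the results."""
    pairs = (pair for name, text in files.items() for pair in map_file(name, text))
    return reduce_counts(pairs)
```

Calling scan() with a dictionary of file names to text contents returns a count of suspected SSNs and card numbers per file. The appeal of running something like this inside the storage platform is that the map step runs where the data already lives, rather than against a copy shipped off to a separate cluster.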

Outside of these, given the whole Cohesity solution is backed by a software defined, distributed file system, they naturally have all the software defined goodness expected from any SDS solution, such as global deduplication, compression, replication, file indexing, snapshots, multi-protocol access, multi-tenancy and QoS within their platform.
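For anyone less familiar with what global deduplication means in practice, here is a tiny, generic sketch in Python. It is not how SpanFS implements it (I don’t know those internals); the fixed-size chunking, the SHA-256 fingerprints and the DedupStore class are all assumptions for illustration. The idea is that identical chunks are stored once, no matter how many files reference them.

```python
import hashlib
from typing import Dict, List


class DedupStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size: int = 4096) -> None:
        self.chunk_size = chunk_size
        self.chunks: Dict[str, bytes] = {}     # fingerprint -> chunk data
        self.files: Dict[str, List[str]] = {}  # file name -> ordered fingerprints

    def write(self, name: str, data: bytes) -> None:
        """Split data into fixed-size chunks and store only unseen chunks."""
        fingerprints = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fingerprint, chunk)
            fingerprints.append(fingerprint)
        self.files[name] = fingerprints

    def read(self, name: str) -> bytes:
        """Rebuild a file from its list of chunk fingerprints."""
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def stored_bytes(self) -> int:
        """Physical bytes held after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

Writing two files that share a chunk stores that chunk once, which is exactly why secondary data (lots of near-identical copies, backups and clones) tends to dedupe so well.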

My thoughts

I like Cohesity’s current solution and where they are potentially heading. However, the key to their success, in my view, will ultimately be their price point, which I have yet to see and so cannot yet judge where they belong amongst the competition.

From a technology and strategy standpoint, Cohesity’s key use cases are very valid and the way they aim to address them is pretty damn good. When you think about the secondary storage use case and the cost of serving out less performance hungry tier 2 data (often large and clunky in size) through an expensive tier 1 storage array (where you have to include larger SAN & NAS storage controllers + additional storage), I cannot help but think that Cohesity’s secondary storage play is quite relevant for many customers. Tier 1 storage solutions, whether classic SAN / NAS solutions or HCI solutions such as VMware vSAN or Nutanix, are typically priced to reflect their tier 1 use case. So a cheaper, more appropriate secondary storage solution such as Cohesity could help save lots of unnecessary SAN / NAS / HCI costs for many customers by letting them downsize their primary storage requirements. This may even enable more and more customers to embrace HCI solutions for their tier 1 workloads too, resulting in even less of a need for expensive, hardware centric SAN / NAS solutions except where they are genuinely necessary. After all, we are all being taught the importance of rightsizing everything (thanks to the utility computing model introduced by the public clouds), so perhaps it’s about time that we all look to break down tier 1 and tier 2 data into appropriately sized tier 1 and tier 2 storage solutions, so that the customer benefits from the reduced TCO? It’s important to note, though, that this rightsizing is only likely to appeal to customers with heavy storage use cases, such as typical enterprises and large corporate customers, rather than the average small to medium customer who requires a typical multipurpose storage solution to host some VMs + some file data. This is evident in the customer stats provided to us during SFD15, where 70% of their customers are enterprise customers.

Both of their key use cases, tier 2 data storage and backup storage, now look to incorporate cloud capabilities and allow customers to do more than just store tier 2 data and backups. This is good and very timely indeed. They seem to take a very data centric approach to their use cases, and the secret sauce behind most of the capabilities, the proprietary file system called SpanFS, looks and feels very much like NetApp’s cDOT architecture with some enhancements in parts. They are also partnering with various primary storage vendors such as Pure to enable replication of backup snapshots from Pure to Cohesity. Meanwhile, additional features such as built-in data protection for NAS data from NetApp, EMC and Pure, direct integration with VMware vCF for data protection, and direct integration with Nutanix for AHV protection move them closer to Rubrik’s territory, which is interesting and ultimately gives customers a choice, which is a good thing.

From a hardware & OEM standpoint, Cohesity has already partnered with both HPE and Cisco and has also made itself available on the HPE pricebook, so that customers can order the Cohesity solution using an HPE SKU, which is convenient. That said, I’d personally urge customers to order directly from Cohesity (using your trusted solutions provider) where possible, rather than through an OEM vendor where the pricing may be fixed or engineered to position OEM HW when it’s not always required.

Given their mixed capabilities of tier 2 data storage, backup storage, and ever-increasing data management capabilities across platforms, they are coopeting, if not competing, with a number of others, such as NetApp, which has a similar data management strategy in its “Data pipeline” vision (and which also removes the need for multiple storage silos in the DC for tier 2 data through features such as Clustered Data ONTAP & FlexClones), Veeam, or even Pure Storage. Given their direct integration with various SW & HCI platforms, which removes the need for 3rd party backup vendors, they are likely to be competing directly with Rubrik more and more in the future. Cohesity’s strategy is primarily focused on tier 2 data management, with a secondary focus on data backups and the management of that data, whereas Rubrik’s strategy appears to be the same but with the opposite order of priorities (backup 1st, data management 2nd). Personally, I like both vendors and their solution positioning, as I can see the strategic value in both offerings for customers. But most importantly for Cohesity, there doesn’t appear to be any other storage vendor specifically focused on the secondary storage market in the way they are, so I can see a great future for them, as long as their price point remains relevant and the great innovation continues.

You can watch all the videos from #SFD15, recorded at the Cohesity HQ in Santa Clara, here.

If you are an existing Cohesity user, I’d be very keen to get your thoughts and feedback in the comments section below.

A separate post will follow, looking at Cohesity’s SpanFS file system and their key use cases!

Chan

Technologist, lucky enough to be working for a very technical company. Views are my own and not those of my employer..!

22 Comments

  1. We looked at both Rubrik and Cohesity when making the decision to switch our backup solution. Both products had excellent demos. In the end, we went with Cohesity because the price was less. Support has been excellent the few times we’ve needed it.

  2. Cohesity is a breeze to install, manage and use. Cost justification for us was simple. By leveraging Cohesity for both file services and data protection, we were able to show substantial savings over our existing NAS hardware solution and backup software solution. There are no additional backup agent or front-end TB fees when using Cohesity.

  3. I think that this approach is going to pay off in spades.
    I look forward to where they are going.

  4. Interesting that your view on the key to success is price point, in reference to enterprise orgs that are looking to ‘rightsize’ secondary storage!

    • Rightsizing was for the tier 1 data set, such that you can then use an appropriate tier 1 solution for that and potentially put the rest of the data in something a little cheaper, like a Cohesity platform, if the price point is right. Otherwise, there’s not much point in introducing a different storage solution. For example, if you have a NetApp/EMC/3PAR as your T1, it would make no sense not to put the tier 2 dataset on the same platform, albeit on a different controller / cluster if you want to.

  5. We have been Cohesity users since November and couldn’t be happier. The SQL clone attach is one of our favorite features and allowed us to fix a table that was mistakenly written over in under 6 minutes. Our director was extremely excited.
    Having just updated to 5.0.1 the BMR functionality with Cristie is something we are eager to put into production.

  6. We have been a Cohesity customer now for about 6 months, and I will say I definitely see much more clearly the path and marketplace Cohesity has now compared to when we first approached them.

    Your explanation/review is pretty dead on that ultimately the price point is what is going to help them come out of the pack as more than just a backup solution. We are more of a mid-sized company and when reviewing backup solutions, Cohesity fit the bill for user interface and price point (albeit there was a little negotiation here), and then all of the additional secondary storage features were just icing on the cake for us. We so far have only utilized backups, but the idea of using Cohesity for other secondary storage uses, such as file shares, is extremely intriguing and I see us eventually expanding into that use when current file servers edge towards retirement.

    Cohesity itself is a great company, with great staff all around, I’ve never had a bad experience with anyone there. Our sales team (Jerry and Brad) from the PNW were great at helping us out with everything from sizing to pricing and being extremely patient as we waded through the sea of backup solutions. No annoying hounding, phone calls, etc. like many other sales teams (the dreaded experience of having to field the flood of calls daily after you express the slightest interest).

    Their support is also phenomenal. You call or put in an e-mail and get a response from a real technician near instantaneously and that same technician helps you through the entire problem. Having the technician work with their other resources rather than play the never ending passing game helps make them stand out from the perpetually declining support of other tech players.

    • Thanks for the comments Jon. Really great to see feedback echoing my initial thoughts from an actual customer with first hand experience..

  7. To echo the comments of Jon, the price-point, functionality, and UI are the reasons we went with Cohesity. Product growth and stellar support are the reasons we’re happy to stick with them.

  8. We tested both Rubrik and Cohesity and chose Cohesity. The interface is more polished and support is fantastic! Cohesity has also been very receptive to our feature requests.

  9. We have had Cohesity in production for 14 months now and it has completely changed our backup/recovery processes for the good, while saving us >50% compared with our previous solutions.

    Great tech and company!

  10. Cohesity is the worst backup for SQL Server databases I’ve ever worked with. Don’t know about the other systems, but it clearly doesn’t work for backing up databases. We have faced many issues here and their customer service is a joke. Just imagine that we haven’t migrated our core system databases to Cohesity as it’s not reliable at all.

    • Thanks for sharing Andres

      Would you mind sharing a little more detail on why it wouldn’t work with databases pls? Were they SQL Server or Oracle? And what exactly were the issues you were seeing with them pls, if you wouldn’t mind being specific?

      Also what type of customer support issues were you having pls? Was it relevant to your database backup issues?

      Many thanks upfront for taking the time out to share details.

      Chan

      • We are having issues at least twice per week taking backups and trying to restore them for SQL Server and Oracle databases. Sometimes we have to run a restore 3 or 4 times to make it work (it just shows a random error and the workaround is to execute it again…)

        About the customer support, there are many delays related to response time and also to the handover of information between Cohesity support engineers. They ask twice for the same information or just don’t know the current status of the issue.

        • For databases, Nutanix Era seems to be the real deal. Public-cloud-like database management and lifecycle management with backup and copy data management functionality, but with native database services fully supported by both Oracle and Microsoft.
