Hi – it’s Chris again, and this is the second in my blog series discussing the requirements for hosting primary, unstructured enterprise data in the Cloud. Recall this list of requirements from my initial post:
• Accessed like traditional storage
• Easy-to-use enterprise features
• Comprehensive data integrity/protection
• Data security/privacy
• Continuous availability
• Non-blocking performance
• Administrative transparency and control
• Provides good investment value
In this post, I am advocating the position that in order for a cloud storage service to be a viable option to host primary enterprise data sets, it must be accessed like traditional storage.
Why? Simply put, the design of the cloud storage service must be one that imposes no limits on how the storage can be adopted into the enterprise environment. As such, the solution should appear just like any other traditional NAS filer on your network. If the cloud storage service walks and talks exactly like the NAS filers you are already using (and you probably have at least several on your network today – possibly numbering into the hundreds or thousands), then you can instantly extend your existing environment into the cloud without any modifications to your existing enterprise application infrastructure.
So what do I mean when I say “walks and talks exactly like the NAS filers you are already using”? This list gives some tangible examples; in order to be viable as a repository for enterprise unstructured data, the cloud storage service should be:
• Mounted like any existing network share: mountable as a Unix or Windows network share, exactly as if it were on your network inside your firewall.
• Accessed via a full-featured, distributed file system: reached through the paths, directories, permissions, and commands you already use, integrating seamlessly with external systems (e.g. LDAP), and delivering all the capabilities of a traditional NAS session (e.g. ACLs and strong consistency). Only a complete, full-featured file system can provide this.
• Served up over traditional protocols across all operating systems: accessed over the protocols that your applications and operating systems use today (and have used for many years). Mount the storage as a Windows share over CIFS or as a Unix share over NFS, access it as an FTP server for large file transfers, access it in an HTTP-optimized way over WebDAV, etc.
• POSIX-compatible: this is perhaps a bit more esoteric than the first three, but it is no less important. The cloud storage service should be compatible with the POSIX file interfaces and commands. POSIX has been around and in use in enterprises for 20+ years, and virtually all enterprise applications are written with the expectation that those interfaces will be available. One key piece of this is strong consistency: POSIX compatibility ensures that any read from the data set yields the data from the most recent write. Without it, applications must be modified to handle cases where a read returns out-of-date data.
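To make the strong-consistency point concrete, here is a minimal Python sketch of the read-after-write behavior that applications take for granted on a POSIX file system. It assumes the cloud share is already mounted at a local path; a temporary directory stands in for that mount point, and the function and file names are purely illustrative.

```python
import os
import tempfile


def write_then_read(share_root: str, name: str, payload: bytes) -> bytes:
    """Write a file with ordinary file calls, then read it straight back."""
    path = os.path.join(share_root, name)
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the storage
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    # Assumption: a temporary directory plays the role of the mounted share.
    with tempfile.TemporaryDirectory() as share_root:
        write_then_read(share_root, "report.txt", b"version 1")
        print(write_then_read(share_root, "report.txt", b"version 2"))  # b'version 2'
```

Nothing in this code knows or cares where the storage lives. That is the point: if the share behaves like a traditional file system, the application reads back exactly what it just wrote, with no extra logic.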
There is more to this concept, but this should give you the idea of what I mean. It also follows that since Zetta was designed and built for the express purpose of hosting primary, unstructured, enterprise data in the cloud, a Zetta storage solution fulfills these requirements.
Contrast this with the current generation of HTTP-centric object stores[i]:
• HTTP object stores are not “mounted” like an existing network share with a robust set of commands available. Instead, they are accessed using GET/PUT/POST/DELETE Web APIs, and your enterprise applications would need an overhaul to support that change.
• HTTP object stores lack the file system semantics your applications expect, instead using a simplified “bucket”/“object” storage paradigm.
• HTTP object stores leverage (often proprietary) Web APIs (either REST or SOAP), not the NFS/CIFS/FTP protocols your operating systems are made to use.
• HTTP object stores leverage an “eventual consistency” model that violates POSIX and does not ensure that your applications will read back the most recent write. Again, your applications would require costly modification to adapt to this deficiency.
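To illustrate that last point, here is a toy Python sketch of the kind of workaround an application might need against an eventually consistent store. The `EventuallyConsistentStore` class is a hypothetical in-memory model, not any real vendor's API: it makes a PUT visible only after a fixed number of later GETs, so the stale-read window is reproducible. The `read_until_fresh` helper is likewise an assumption about how an application could cope.

```python
import time


class EventuallyConsistentStore:
    """Toy model: a PUT becomes visible only after `lag_reads` further GETs."""

    def __init__(self, lag_reads: int = 2):
        self._visible = {}   # what GET currently returns
        self._pending = {}   # key -> (value, stale reads remaining)
        self._lag_reads = lag_reads

    def put(self, key, value):
        self._pending[key] = (value, self._lag_reads)

    def get(self, key):
        if key in self._pending:
            value, remaining = self._pending[key]
            if remaining <= 0:
                # The write has finally propagated.
                self._visible[key] = value
                del self._pending[key]
            else:
                self._pending[key] = (value, remaining - 1)
        return self._visible.get(key)


def read_until_fresh(store, key, expected, attempts=5, delay=0.0):
    """Poll until the store returns the value we just wrote, or give up."""
    for attempt in range(1, attempts + 1):
        if store.get(key) == expected:
            return expected, attempt
        time.sleep(delay)
    raise TimeoutError(f"{key!r} still stale after {attempts} attempts")


if __name__ == "__main__":
    store = EventuallyConsistentStore(lag_reads=2)
    store.put("report.txt", "v1")
    read_until_fresh(store, "report.txt", "v1")
    store.put("report.txt", "v2")
    print(store.get("report.txt"))                      # v1 (stale read)
    print(read_until_fresh(store, "report.txt", "v2"))  # ('v2', 2)
```

Every application that depends on reading its own writes would need polling logic along these lines, plus a decision about what to do when the retries run out. None of that is needed when the storage is strongly consistent.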
More to come next week, when I tackle the requirement that an enterprise-class cloud service should provide easy-to-use enterprise features.
[i] I want to reiterate that I am not bad-mouthing the HTTP object store solutions – those solutions serve the needs they were designed to serve very well (HTTP-centric use cases; e.g. Web applications), but were not built to serve the enterprise and therefore lack multiple key features that you (an enterprise IT professional) need when you are looking to extend your existing infrastructure to the cloud without any rewrites to your existing application footprint.