Zetta Scalabytes Blog

In this blog, hear from Zetta’s founders and leaders about cloud computing, storage and data management best practices and Zetta Enterprise Cloud Storage technology.

Posts Tagged ‘file system’

Chris Schin

December 10, 2009

Hosting Primary, Unstructured Enterprise Data in the Cloud – Part 3: Easy to use, enterprise features

Chris Schin, VP Products, is responsible for coordinating all Zetta product-related initiatives including product strategy, direction, and marketing, as well as business model and go-to-market process definition. Prior to joining Zetta, Chris was acting GM and Senior Director for Symantec Protection Network, Symantec's Software as a Service platform.

 

Hi – back again to discuss the requirements for hosting primary, unstructured enterprise data in the Cloud. For your convenience, I have reprinted the list of requirements from my initial post – note that I will continue to do this and will link from this list to previous posts as well:

 

 

In this post I’m going to discuss the position that in order for a cloud storage service to be a viable option to host primary enterprise data sets, it must provide easy-to-use, enterprise features.

 

On the face of things, this seems obvious, almost tautological: of course a solution for storing enterprise data should provide enterprise features; however, existing HTTP-centric cloud storage solutions lack key features that should be required for enterprise adoption.

 

So what are some examples of these “enterprise features?” Here is a list of things our enterprise users told us to include in the Zetta Cloud Storage service — things that they expect to find in a cloud storage solution hosting their primary data:

 

  • Parity to Traditional Arrays — in essence, the cloud storage solution should come with the features you’ve come to expect from any robust NAS array you have running as a bump on your network today – things like snapshots, mount-and-write, integration with external systems (e.g. LDAP), support for existing ACLs, etc. Without these, something will need to be written to replace these features, and enterprises don’t have the resources on hand to simply replicate features they typically get from their storage solutions.

     

  • File-based geo-replication — replication is something that IT administrators have come to expect to be facilitated by their storage technologies. And I’m not talking about the type of replication common among the HTTP cloud object stores – those services typically rely on replication as their sole form of data protection, and employ a solution that is opaque to the user. What our enterprise customers asked us for was a form of replication that results in a mountable, readable volume in another identified data center, with all of the visibility and transparency they would get if they constructed their own solution.

     

  • Capacity management & visibility — An enterprise solution should provide real-time presentation of exactly what is happening on your volumes – usage trends, system performance, and real-time access to events (including bad ones!). The fact that the volumes are resident at a service provider shouldn’t change the fact that you want transparent visibility into what is happening with YOUR data!

     

  • Instant provisioning – in this particular case, you should actually expect your cloud service provider to provide much better performance than you would find with a traditional array. With a traditional array, you need time to secure space and power, negotiate with the array vendor or VAR on upfront capital cost, and install, configure and test the array. This can take weeks or months. With a cloud service provider like Zetta, you can be up and running within minutes or hours.

     

  • Native support for file-based apps — this is kind of a short restatement of my last post – an enterprise service should provide a full-featured file system that walks and talks like any other filer on your network, making it plug-and-play with existing enterprise architectures and file-based applications.
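
To make the capacity visibility and file-based access points concrete, here is a minimal Python sketch. It assumes only that the cloud volume is mounted as an ordinary file system path, so standard tooling can report on it. The function name `volume_report` and the example mount point are hypothetical and are not part of any Zetta interface; the demo uses a local temporary directory as a stand-in.

```python
import shutil
import tempfile


def volume_report(mount_point: str) -> dict:
    """Summarize capacity for any mounted volume using the standard library."""
    usage = shutil.disk_usage(mount_point)  # named tuple: total, used, free (bytes)
    percent_used = round(100 * usage.used / usage.total, 1) if usage.total else 0.0
    return {
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "free_bytes": usage.free,
        "percent_used": percent_used,
    }


if __name__ == "__main__":
    # Stand-in for a real mount point such as /mnt/cloud_volume (hypothetical path).
    with tempfile.TemporaryDirectory() as mount_point:
        print(volume_report(mount_point))
```

The point of the sketch is that nothing in it is storage-specific: if the volume behaves like any other filer, existing monitoring scripts should work against it unchanged.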

 

There is more that I could touch on here, but this should give you an idea of some of the things Zetta provides to our enterprise customers.

 

Once again, contrast this set of features with your typical HTTP-centric object stores – those solutions do not provide any of the enterprise features I’ve discussed above, since they were designed to meet a simpler set of requirements, targeted at a non-enterprise customer.

 

I’ll be back soon to discuss some of the core strengths of the Zetta solution, beginning with a discussion of the Zetta data integrity solution.

Chris Schin

November 23, 2009

Hosting Primary, Unstructured Enterprise Data in the Cloud – Part 2: Accessed like traditional storage


 

Hi – it’s Chris again, and this is the second in my blog series discussing the requirements for hosting primary, unstructured enterprise data in the Cloud. Recall this list of requirements from my initial post:

 

 

In this post, I am advocating the position that in order for a cloud storage service to be a viable option to host primary enterprise data sets, it must be accessed like traditional storage.

 

Why? Simply put, the design of the cloud storage service must be one that imposes no limits on how the storage can be adopted into the enterprise environment. As such, the solution should appear just like any other traditional NAS filer on your network. If the cloud storage service walks and talks exactly like the NAS filers you are already using (and you probably have at least several on your network today – possibly numbering into the hundreds or thousands), then you can instantly extend your existing environment into the cloud without any modifications to your existing enterprise application infrastructure.

 

So what do I mean when I say “walks and talks exactly like the NAS filers you are already using?” This list gives some tangible examples; in order to be viable as a repository for enterprise unstructured data, the cloud storage service should be:

 

  • Mounted like any existing network share — mountable as a Unix or Windows network share exactly as if it were on your network inside your firewall.

     

  • Accessed via a full-featured, distributed file system — accessed over existing paths, directories, permissions, and commands, and seamlessly leveraging external system integrations (e.g. LDAP), while delivering all the capabilities of a traditional session (e.g. ACLs and strong consistency). In order to provide this, the cloud storage service must be accessed via a complete, full-featured file system.

     

  • Served up over traditional protocols across all operating systems — accessed over the protocols that your applications and operating systems use today (and have for many years) – mount the storage as a Windows share over CIFS, as a Unix share over NFS, access it as an FTP server for large file transfers, access it in an HTTP-optimized way over WebDAV, etc.

     

  • POSIX compatible — this is perhaps a bit more esoteric than the first three, but it is no less important; in order to be viable as a repository for enterprise unstructured data, the cloud storage service should be compatible with the POSIX command set. POSIX has been around and in use in enterprises for 20+ years, and virtually all enterprise applications are written with the expectation that the POSIX command set will be available. One key piece of this is strong consistency – POSIX compatibility ensures that any read from the data set will yield the data from the most recent write. Without this, the applications must be modified to manage cases where a read is yielding out-of-date data.
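
As an illustration of what these requirements mean for an application, the following Python sketch uses only ordinary path-based file calls and then checks the read-after-write behavior described in the POSIX bullet. It assumes a single client and a volume mounted at a normal directory path; the helper names and the file name `report.dat` are made up for the example, and a temporary local directory stands in for the mounted share.

```python
import os
import tempfile


def write_then_read(directory: str, name: str, payload: bytes) -> bytes:
    """Write a file with ordinary file calls, then read it straight back."""
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        f.write(payload)  # leaving the with-block flushes and closes the file
    with open(path, "rb") as f:
        return f.read()


def overwrite_check(directory: str) -> tuple:
    """Overwrite the same file twice; each read should see the latest write."""
    first = write_then_read(directory, "report.dat", b"version 1")
    second = write_then_read(directory, "report.dat", b"version 2")
    return first, second


if __name__ == "__main__":
    # A local temporary directory stands in for the mounted share.
    with tempfile.TemporaryDirectory() as share:
        first, second = overwrite_check(share)
        assert first == b"version 1"
        assert second == b"version 2"
        print("each read returned the most recent write")
```

An application written this way needs no changes to run against any storage that honors these semantics, which is the argument for presenting cloud storage as a file system.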

 

There is more to this concept, but this should give you the idea of what I mean. It also follows that since Zetta was designed and built for the express purpose of hosting primary, unstructured, enterprise data in the cloud, a Zetta storage solution fulfills these requirements.

 

Contrast this with the current generation of HTTP-centric object stores[i]:

 

  • HTTP object stores are not “mounted” like an existing network share, with a robust set of commands available. Instead, they are accessed using GET/PUT/POST/DELETE Web APIs, and your enterprise applications would need an overhaul to support that change.

     

  • HTTP object stores lack the file system semantics your applications expect, instead using a simplified “bucket”/“object” storage paradigm.

     

  • HTTP object stores leverage (often proprietary) Web APIs (either REST or SOAP), not the NFS/CIFS/FTP protocols your operating systems are made to use.

     

  • HTTP object stores leverage an “eventual consistency” model that violates POSIX and does not ensure that your applications will read back the most recent write. Again, your applications would require costly modification to adapt to this deficiency.
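
The consistency difference in the last bullet can be sketched with a toy in-memory model. This is purely illustrative and does not describe how any real object store or the Zetta service is implemented: the class names, the `sync` method, and the single lagging replica are all assumptions chosen to show the effect an application would observe.

```python
class StronglyConsistentStore:
    """Toy model: every read sees the most recent write."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class EventuallyConsistentStore:
    """Toy model: writes land on a primary, reads come from a lagging replica."""

    def __init__(self):
        self._primary = {}
        self._replica = {}

    def put(self, key, value):
        self._primary[key] = value  # the replica is not updated yet

    def get(self, key):
        return self._replica.get(key)  # may be stale or missing

    def sync(self):
        """Simulate background replication catching up."""
        self._replica = dict(self._primary)


def read_after_write(store, key, value):
    """Write a value and immediately read it back."""
    store.put(key, value)
    return store.get(key)


if __name__ == "__main__":
    strong = StronglyConsistentStore()
    assert read_after_write(strong, "doc", "v1") == "v1"

    eventual = EventuallyConsistentStore()
    assert read_after_write(eventual, "doc", "v1") is None  # write not visible yet
    eventual.sync()
    assert eventual.get("doc") == "v1"  # visible only after replication
```

An application that assumes the first behavior but runs against the second has to add retries or version checks around every read, which is the kind of modification described above.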

     

More to come next week, when I tackle the requirement that an enterprise-class cloud service should provide easy-to-use enterprise features.

 


[i] I want to reiterate that I am not bad-mouthing the HTTP object store solutions – those solutions serve the needs they were designed to serve very well (HTTP-centric use cases; e.g. Web applications), but were not built to serve the enterprise and therefore lack multiple key features that you (an enterprise IT professional) need when you are looking to extend your existing infrastructure to the cloud without any rewrites to your existing application footprint.

 
