Zetta Scalabytes Blog

In this blog, hear from Zetta’s founders and leaders about cloud computing, storage and data management best practices and Zetta Enterprise Cloud Storage technology.

Posts Tagged ‘Cloud Requirements’

Chris Schin

March 31, 2010

Hosting Primary, Unstructured Enterprise Data in the Cloud – Part 8: Administrative Transparency and Control

Chris Schin, VP Products, is responsible for coordinating all Zetta product-related initiatives including product strategy, direction, and marketing, as well as business model and go-to-market process definition. Prior to joining Zetta, Chris was acting GM and Senior Director for Symantec Protection Network, Symantec's Software as a Service platform.

Hi. This blog series lays out the concepts we used to design the Zetta storage solution, based on feedback from enterprise IT professionals about their needs.

 

Here is an outline of this series and hyperlinks to previous posts:

 

This post discusses how a service provider can engender trust from customers through transparent access to administration tools and system information.

 

A good software user interface gives quick, easy access both to information about how the system is functioning (monitor) and to the features available to the user (manage). For an IT storage professional, such a UI should provide:

 


  • An intuitive interface; one that behaves like existing filer controls and enables rapid navigation to trending information and features

     

  • A robust control framework — designed for IT professionals — one that enables access management, access logging, and controls for things like snapshots and replication

     

  • Transparent visibility into storage solution behavior: both good and bad events should be surfaced, so that users can be confident they see every event relevant to their data set

     

  • Instant access to support and knowledge, in the form of online ticketing and a maintained knowledgebase

     


  • Actionable alerts to respond to, together with automated self-healing capabilities; in effect, a notification framework with some auto-corrective actions

     

  • The ability to delegate administration based on granular roles and permissions, leveraging existing LDAP permissions

     

  • Access from anywhere (i.e. Web-based)

     

This may not seem like a long or onerous list, but if you have any experience with the UIs of either enterprise NAS filers or cloud storage providers, you’ll have noticed that many of these seemingly simple requirements were not fulfilled.

Chris Schin

March 03, 2010

Hosting Primary, Unstructured Enterprise Data in the Cloud – Part 7: Non-blocking Performance

Hello again and welcome back to my blog series outlining what our customers told us they wanted to see in a cloud storage solution before they would put primary copies of their enterprise data in the cloud. Again, it is important to note that these requirements drove the design and development of the solution we have in market today.

 

This is the outline of the series and hyperlinks to previous posts:

 

This post discusses how a service provider must create a storage solution architecture that can ensure “non-blocking” performance, enabling it to adapt to multiple customer access patterns simultaneously.

 

There is no question that innovations have allowed today’s traditional arrays to scale to huge capacity: hundreds of terabytes per array. But the core array architecture has changed little over time, and this architecture can limit how much additional capacity can be added, and can even prevent existing capacity from being fully utilized. A massive-scale, multi-tenant architecture requires a fundamentally different design, one that borrows heavily from distributed systems design principles.

 

There are effectively three components to any storage solution: the network, the controller, and the disk. In a traditional array, purchase-time decisions determine the ratios of each of these to the others, and those decisions are very difficult to alter once the array has been deployed. Unfortunately, circumstances change, and one of these three components almost always becomes the bottleneck, preventing full utilization of the other components. For example, if the workload winds up being more controller-intensive than expected, the disks won’t ever be filled.
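
To make the bottleneck effect concrete, here is a minimal sketch in Python. The capacity and demand figures are invented for illustration (arbitrary units), and the function name `max_sustainable_load` is hypothetical; the only point is that the scarcest component caps the whole array and leaves the others underused.

```python
def max_sustainable_load(capacity, demand_per_unit):
    """Return (max workload units, bottleneck name, utilization per component).

    capacity:        what each component can deliver (arbitrary units)
    demand_per_unit: what one unit of workload consumes from each component
    """
    # Each component alone could serve capacity / demand units of workload.
    limits = {name: capacity[name] / demand_per_unit[name] for name in capacity}
    # The component with the smallest limit caps the whole array.
    bottleneck = min(limits, key=limits.get)
    load = limits[bottleneck]
    utilization = {
        name: load * demand_per_unit[name] / capacity[name] for name in capacity
    }
    return load, bottleneck, utilization


# Hypothetical array, with ratios fixed at purchase time.
capacity = {"network": 1000.0, "controller": 500.0, "disk": 2000.0}
# Hypothetical workload that turned out to be controller-intensive.
demand = {"network": 1.0, "controller": 2.0, "disk": 1.0}

load, bottleneck, utilization = max_sustainable_load(capacity, demand)
for name, used in utilization.items():
    print(f"{name}: {used:.1%} utilized")
print(f"bottleneck: {bottleneck}")
```

With these made-up numbers the controller saturates first, so most of the disk capacity that was paid for can never be used.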

 

A service provider who tries to construct a storage service using a series of high-priced, traditional arrays will fall prey to this dynamic in a very acute way: installing multiple arrays doesn’t eliminate the issue, it multiplies it. This is compounded by the fact that there is no way to plan in advance for customer behavior when the customer isn’t even identified before the array is purchased, as is the case for a cloud storage service provider.

 

A cloud service provider shouldn’t attempt to use traditional vendor-produced arrays to create a storage service — the costs don’t add up, any single customer’s access pattern could negatively impact others, and the fundamental array architecture is in conflict with the notion of a storage service.

 

Instead, a storage service must be architected using Internet-centric distributed computing principles. Each of the tiers of the architecture (throughput, IOPS, and density) should be able to scale independently of any other tier, allowing the service provider to adapt to customer behavior, singly and in aggregate, as necessary to ensure adequate performance for every customer and adequate utilization of system resources overall.

 

One additional best practice to mention: unlike computer processors, disks are mechanical devices that spin at a certain maximum rate. As a result, if enough I/O requests hit a disk at the same time, the disk can become snarled and disk throughput can fall off a cliff. Since both IOPS and density are determined by the disk, a storage service should provide a QoS engine, similar to a computer’s scheduler, to ensure that disks never reach a point of no return under load, where IOPS begin to degrade sharply.
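
As a rough illustration of the idea, and not a description of Zetta’s actual QoS engine, the sketch below caps the number of in-flight requests on a single disk and queues the rest in arrival order. The class name `DiskQos` and the limit of two in-flight requests are assumptions chosen for the example.

```python
from collections import deque


class DiskQos:
    """Caps in-flight I/O on one disk; excess requests wait in a FIFO queue."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.in_flight = 0
        self.waiting = deque()

    def submit(self, request_id):
        """Return True if the request is dispatched now, False if it was queued."""
        if self.in_flight < self.max_in_flight:
            self.in_flight += 1
            return True
        self.waiting.append(request_id)
        return False

    def complete(self):
        """Mark one request finished and dispatch the next queued one, if any.

        Returns the id of the newly dispatched request, or None.
        """
        if self.in_flight == 0:
            raise ValueError("no request is in flight")
        self.in_flight -= 1
        if self.waiting:
            self.in_flight += 1
            return self.waiting.popleft()
        return None


qos = DiskQos(max_in_flight=2)  # hypothetical per-disk limit
results = [qos.submit(r) for r in ("read-a", "read-b", "write-c")]
next_up = qos.complete()        # frees a slot, so the queued request is dispatched
```

The disk never sees more than the configured number of concurrent requests, so a burst from one customer lengthens the queue instead of pushing the disk past the point where throughput collapses.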

Chris Schin

January 19, 2010

Hosting Primary, Unstructured Enterprise Data in the Cloud – Part 6: Continuous Availability

For those of you just joining here, I’m using this blog series to document what enterprise IT professionals have told us about the baseline requirements that would need to be met by a cloud storage service before they would consider storing their enterprise primary data in the cloud. This list outlines the high-level requirements and hyperlinks to previous posts:

 

 

This post lists a few questions you should ask your cloud storage vendor about their architecture for delivering availability before considering placing a primary copy of your data in their cloud:

 

  • “Does your solution have redundant network links from different top-tier networking providers?” It must; networks go down every day, no matter how expensive they are or what brand is behind them. Redundancy in networks is a baseline requirement for placing primary data in the cloud.

     

  • “Does your solution reside in a data center that has redundant power and cooling?” It must; if the environs of the systems holding your data are not adequately protected, failure of the solutions is inevitable, resulting in availability outages.

     

  • “Does your solution offer triple-layer redundancy at the storage controller tier at no additional cost?” It must; the controller tier holds the brains of the storage solution, and cannot afford downtime or corruption — this is not only key to system availability, but extends to data integrity as well.

     

  • “Does your solution leverage an advanced RAID algorithm to ensure that the data is available?” It must; holding single copies of data in multiple locations is not nearly as available and protected as holding RAID-6-protected copies of data in multiple locations.
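
A quick back-of-the-envelope calculation shows why each of these questions is about redundancy. The sketch below assumes component failures are independent (real-world failures often are not) and uses made-up availability figures; the function names are illustrative.

```python
def parallel_availability(single, copies):
    """Availability of `copies` redundant components when any one suffices.

    Assumes failures are independent, which is an idealization.
    """
    return 1.0 - (1.0 - single) ** copies


def serial_availability(*tiers):
    """Availability of a chain in which every tier must be up."""
    result = 1.0
    for availability in tiers:
        result *= availability
    return result


# Hypothetical figures: one network link that is up 99% of the time.
one_link = 0.99
two_links = parallel_availability(one_link, 2)         # about 0.9999
three_controllers = parallel_availability(0.99, 3)     # about 0.999999

# The service is only as available as the whole chain of tiers.
overall = serial_availability(two_links, three_controllers)
print(f"two links: {two_links:.6f}, overall: {overall:.6f}")
```

Because the tiers multiply together, a single non-redundant tier drags the whole service down to roughly its own availability, no matter how well protected the other tiers are.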

 

Before you even consider putting a primary copy of your data into a cloud storage provider’s infrastructure, you should certainly ask these questions and receive detailed, satisfactory answers. If you are using a cloud solution today and don’t know the answers to these questions (or even whom to ask these questions), then you should be concerned about the availability and protection of your data.

 

Zetta’s CTO, Jeff Whitehead, is fond of using a nuclear submarine analogy when discussing system availability, as in “imagine you are on a nuclear submarine right now — would you be satisfied knowing that submarine was highly available, or would you demand that it be continuously available?” An enterprise solution must be built to the stringent demands of an enterprise IT professional, and when it comes to data, an enterprise IT professional demands continuous availability.

Chris Schin

January 12, 2010

Hosting Primary, Unstructured Enterprise Data in the Cloud – Part 5: Data Security/Privacy

Happy New Year! I’m back with part 5 of a nine-part blog series that describes the requirements for hosting primary unstructured enterprise data in the cloud.

 

This entire blog series includes an introduction and the following set of requirements:

 

 

When talking to enterprise IT professionals (our customers), the second most frequently cited concern (second only to “don’t lose or corrupt my data,” which was covered in my last post) is “don’t let anyone else see or steal my data.”

 

As this first post of the year comes right after Network World has named Zetta as one of the ‘10 Storage Startups to Watch,’ I would like to say that it is certainly rewarding to see editors such as Jon Brodkin recognize that while “many companies are concerned about the safety of trusting their information to a third party, to help ease those concerns, Zetta has built a system that encrypts data at rest, and can withstand multiple hardware and network failures without losing data.” There are certain baseline security/privacy criteria that must be met prior to trusting a cloud storage solution with primary copies of enterprise data.

 

  • Wireline encryption: Using a storage service (as opposed to an inside-the-firewall solution) clearly implies a need to secure the data in transit from the enterprise to the service. Fortunately, this is increasingly facilitated by the protocols themselves. Most file transfer protocols and Web-optimized storage protocols have encrypted versions readily available today, including SFTP, FTPS, and WebDAV over HTTPS. Even traditional storage access protocols, such as NFSv4, are building in wireline encryption in recognition of our increasingly Internet-driven existence.

     

    While we encourage customers to use these encrypted protocols, there are clearly use cases that require the use of unencrypted protocols. The solutions here are also tried and true — either encrypt prior to sending the data, contract for a dedicated network link, or work with the service provider to put in place a secure tunnel, such as a VPN.

     

  • Logical partitioning within multi-tenancy: By some definitions (certainly mine), a service must be multi-tenant before it can be considered a “cloud” service[i]. For enterprise IT professionals to have confidence using a cloud storage service for enterprise data, they must know that their data cannot be accessed by anyone else while resident in the service infrastructure. The first step is to ensure logical separation between customers at the “front door” of the service infrastructure, the initial customer access point to the service. Virtualization makes this easy: simply house every customer’s mountpoint as a unique URI within a distinct virtual machine instance. This way, you know that your access point is completely unique to you, and is not a shared resource commingled with other users.

     

  • At-Rest Encryption: By far the most significant feature for ensuring data security is default encryption at rest, supplied by your service provider at no additional cost. Ideally this should be facilitated by a full Public Key Infrastructure (PKI) backed by FIPS 140-2 compliant key repositories, with strong encryption, a robust key rotation scheme, and per-customer or per-volume keys. Strong encryption at rest is really table stakes for any enterprise-class data storage service.

 

To reiterate a common theme across these posts: it is important to remember that these are the baseline requirements that your cloud storage provider should take into consideration from the development phase onward. These types of customer requirements drove the design of the Zetta storage solution, which was built specifically to house enterprise primary data in the cloud.

 

I’ll be back in a few days to touch on the next requirement, continuous availability architecture.

 


[i] Note that this statement is not unique to storage services; it applies to any kind of service.

 

Chris Schin

December 16, 2009

Hosting Primary, Unstructured Enterprise Data in the Cloud – Part 4: Comprehensive data integrity/protection

Hi, for those of you just tuning in, this is part four of a nine-part blog series in which I describe the Zetta solution and how it is built from the ground up to host primary, unstructured enterprise data in the cloud. Here again is the list of requirements; the first two have already been addressed, along with an introduction to the series:

 

 

In this post I’m going to discuss the position that in order for a cloud storage service to be a viable option to host primary enterprise data sets, it must provide comprehensive data integrity/protection.

 

At Zetta, our most important design consideration was data integrity (or data “protection” – whatever term is used, this is the idea that we won’t allow data to be lost or corrupted), since ultimately a data storage solution is worthless if it allows the loss or corruption of data.

 

A dirty little secret in the storage community is that data corruption happens all the time – though the relative rate of corruption seems low on its face, the increasing scale of data stored guarantees that corruption events are always occurring. For more on this topic at a deeper technical level, along with a calculator to help you gauge your own data integrity risk, please see JW’s post on Calculating Mean Time To Data Loss (and probability of silent data corruption).
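
The effect of scale can be seen with simple arithmetic. The sketch below is not the calculator referenced above, and the per-block corruption rate is a made-up figure for illustration only; it simply evaluates 1 - (1 - p)^n for independent events.

```python
import math


def prob_at_least_one_event(per_unit_rate, units):
    """Probability that at least one of `units` independent items is affected.

    Computes 1 - (1 - p)^n using log1p/expm1 so that very small rates
    do not get lost to floating-point rounding.
    """
    return -math.expm1(units * math.log1p(-per_unit_rate))


# Hypothetical rate: one silent corruption per trillion blocks stored.
rate = 1e-12

small_site = prob_at_least_one_event(rate, 1e9)    # a billion blocks
large_site = prob_at_least_one_event(rate, 1e13)   # ten trillion blocks
print(f"small: {small_site:.4%}, large: {large_site:.4%}")
```

The per-block rate is identical in both cases; only the amount of data changes, and that alone moves corruption from a rare event to a near certainty.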

 

So any solution for storing primary enterprise data MUST assume data corruption will happen, and must be designed to adapt to that reality and repair corrupted data, thereby guaranteeing data integrity.

 

Here are some of the unique ways the Zetta solution has been designed to automatically detect data corruption and repair it; taken collectively, these tools truly give Zetta an unparalleled data integrity profile.

 

Zetta Comprehensive Data Integrity/Protection Requirements

  • Write Receipts — Zetta creates a strong SHA-1 hash of every file that enters a Zetta customer virtual volume, and we do two things with that hash (one of which is optionally available at the customer’s request, one mandatory though transparent to the customer).
    • First, at a customer’s request, we can place these hashes on the customer’s volume, allowing a customer to ensure that what we have stored at Zetta is what was sent by the customer.
    • Second, we store each hash in perpetuity. This allows us to compare the hash of a file being read with the hash recorded when the file was originally received; if there is any difference, we repair the file before completing the read, guaranteeing that what is read is identical to what was written.

       

  • RAIN6 N+3: Zetta employs a best-in-class RAID algorithm. It is based on RAID 6 (which in turn is based on Reed-Solomon encoding) and adds an additional parity node: RAID 6 traditionally has two parity nodes, while the Zetta solution has three (this is laid out in great detail by JW in his post on Data Integrity in the Cloud). We also refer to it as “RAIN” because we stripe data not just across independent disks, but across independent nodes (i.e., storage servers). This level of redundancy is not available even in traditional storage hardware from top vendors, and it preserves the integrity (and availability) of data in the event of up to three independent computer failures.

     

  • Proactive Error Correction — In addition to creating a SHA hash of every complete file that enters the Zetta storage cloud, the Zetta solution also creates a SHA hash of every “chunk” of data encoded and striped across the disks in our lower-level storage servers. Then, using any spare system processing cycles, a background process on the system traverses all hard drives and compares those stored hashes to the current chunk on disk, proactively detecting and repairing any data corruptions on disk using our triple redundancy RAIN6 encoding.

     

  • Snapshots: Zetta cloud storage comes with a full-featured file system (a distributed, clustered, highly parallelized file system that we’ll be discussing in a future post). As with most file systems, the Zetta file system provides full snapshotting capabilities, with either scheduled or ad hoc snapshots. Zetta snapshots are also free from the capacity and performance limitations of single devices and fixed-size clusters. This provides a customer-controlled protection mechanism: once a snapshot is created, the file system is preserved in that state until the snapshot is deleted, allowing a user to go back and restore files and directories from the “.snapshot” directory, as with any on-premises filer.

     

  • Geo-Replication — All customer data stored at Zetta is replicated to another data center. In 2010 we expect to begin to offer full asynchronous replication to our customers who want a fully-mountable volume resident in another Zetta data center, either for performance or for data integrity.
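
The write-receipt mechanism from the first item in this list can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Zetta’s implementation: the `ReceiptStore` class, its in-memory dictionaries, and the `repair` callback (a stand-in for whatever redundancy rebuilds the file) are all hypothetical.

```python
import hashlib


class ReceiptStore:
    """Toy in-memory volume that keeps a SHA-1 write receipt per file."""

    def __init__(self):
        self._files = {}      # path -> bytes as currently stored
        self._receipts = {}   # path -> hex SHA-1 recorded at write time

    def write(self, path, data):
        """Store the file and return its write receipt (hex SHA-1)."""
        receipt = hashlib.sha1(data).hexdigest()
        self._files[path] = data
        self._receipts[path] = receipt
        return receipt

    def read(self, path, repair):
        """Return the file, repairing it first if it no longer matches its receipt.

        `repair` is a hypothetical callback that rebuilds the file from
        redundant data; it is an assumption of this sketch.
        """
        data = self._files[path]
        if hashlib.sha1(data).hexdigest() != self._receipts[path]:
            data = repair(path)
            if hashlib.sha1(data).hexdigest() != self._receipts[path]:
                raise IOError("repair did not restore the original content")
            self._files[path] = data
        return data


store = ReceiptStore()
receipt = store.write("/vol/report.txt", b"quarterly numbers")

# Simulate silent corruption of the stored bytes.
store._files["/vol/report.txt"] = b"quarterly numbexs"

# The read detects the mismatch and repairs before returning.
restored = store.read("/vol/report.txt", lambda path: b"quarterly numbers")
```

A customer who keeps the returned receipt can also recompute the hash locally and confirm that what is stored matches what was sent.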

 

Again, the Zetta solution was designed with the core premise that preventing data loss was our primary charter, and these are some of the unique features we’ve put into the solution to live up to that charter.
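
The proactive error correction described in the list above amounts to a background scrub. The sketch below shows the general shape only: the choice of SHA-1 for chunk hashes, the `scrub` function, and the `rebuild` callback (standing in for reconstruction from parity) are assumptions made for the example, not details from Zetta’s system.

```python
import hashlib


def scrub(chunks, expected_hashes, rebuild):
    """Walk every stored chunk and repair any that no longer match their hash.

    chunks:          dict of chunk id -> bytes currently on disk
    expected_hashes: dict of chunk id -> hex SHA-1 recorded at write time
    rebuild:         hypothetical callback that reconstructs a chunk
                     from redundant data
    Returns the list of chunk ids that were repaired.
    """
    repaired = []
    for chunk_id, data in list(chunks.items()):
        if hashlib.sha1(data).hexdigest() != expected_hashes[chunk_id]:
            chunks[chunk_id] = rebuild(chunk_id)
            repaired.append(chunk_id)
    return repaired


# Two chunks written correctly, with hashes recorded at write time.
chunks = {"c1": b"alpha", "c2": b"beta"}
hashes = {cid: hashlib.sha1(data).hexdigest() for cid, data in chunks.items()}
pristine = dict(chunks)

# Simulate a silent corruption on disk, then run one scrub pass.
chunks["c2"] = b"bXta"
fixed = scrub(chunks, hashes, lambda cid: pristine[cid])
```

Running such a pass in spare cycles means corruption is found and repaired before any customer tries to read the affected data.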

 

Compare this with what is available today from the HTTP cloud storage vendors. I reiterate that this is not a knock on those solutions – they do an excellent job for their customer target, but their target is not the enterprise, and they don’t provide the requisite features to host primary enterprise data. These solutions do not provide write receipts, have no RAID implementations, lack proactive error correction, and offer no file systems with snapshots.

 

I’ll be back soon to discuss Zetta’s approach to data security & privacy.
