Validating CDMI features – Server Side Encryption

One of the features of many storage systems and even disk drives is the ability to encrypt the data at rest. This protects against a specific threat – the disk drive going out the back door for replacement or repair. So it was only a matter of time before we would see this important feature start to be offered for Cloud Storage as well. Well, today Amazon announced their Server Side Encryption capability for their S3 cloud offering. This feature was anticipated by the CDMI standard interface when it was finalized as a standard back in April 2010.

Standard Server Side Encryption

So, how does CDMI standardize this feature? Well, as usual, it starts with finding out if the cloud actually supports the feature and what choices are available. In CDMI, this is done through the capabilities resource – a kind of catalog or discovery mechanism. By fetching the capabilities resource for objects, containers, domain or queues, you can tell whether server side encryption of data at rest if available from the cloud offering (yes this is granular for a reason). The actual capability name is: cdmi_encryption (see section 12.1.3). This indicates that the cloud can do encryption for the data at rest, but also indicates what algorithms are available to do this encryption. The algorithms are expressed in the form of: ALGORITHM_MODE_KEYLENGTH, where:

“ALGORITHM” is the encryption algorithm (e.g., “AES” or “3DES”).

“MODE” is the mode of operation (e.g.,”XTS”, “CBC”, or “CTR”).

“KEYLENGTH” is the key size (e.g.,”128″,”192″, “256″).

So the cloud can offer the user several different algorithms of different strengths and types, or if it only offers a single algorithm (such as the Amazon offering), the cloud storage client can at least understand what that algorithm is.

So how does the user tell the cloud that she wants her data encrypted? Amazon does this with a proprietary header of course, but CDMI does it with standard Data System Metadata that can be placed on any object, container of objects, queue or domain. This metadata is called cdmi_encryption (see section 16.4), and contains merely a string with a value chosen from the list of available algorithms in the corresponding capability. There is also a cdmi_encryption_provided metadata value to tell the client whether their data is being encrypted or not by the cloud.

Lastly, there is a system-wide capability called cdmi_security_encryption (section 12.1.1) that tells the user whether the cloud does server side encryption at all.

Server side encryption is an important capability for cloud storage offerings to provide, which is why CDMI standardized this in advance of having cloud offerings available. We expect more clouds to offer this in the future, and customers to soon realize that – without CDMI implementations, these offerings are locking them in and causing a high cost of exiting that vendor.

New Webpage about SSD Form-Factors

There’s a new page on the SSSI website which describes the wide range of SSD form-factors (physical formats) on the market today.   SSSI defines three major categories – Solid State Drive, Solid State Card, and Solid State Module – and the new page provides descriptions and examples of each.

Take a look.

Share

What’s Old is New Again: Storage Tiering

Storage tiering is nothing new but then again is all new. Traditionally, tiering meant that you’d buy fast (Tier One) storage arrays, based on 15K Fibre Channel drives, for your really important applications. Next you’d buy some slower (Tier Two) storage arrays, based on SATA drives, for your not-so-important applications. Finally you’d buy a (Tier Three) tape library or VTL to house your backups. This is how most people have accomplished storage tiering for the past couple of decades, with slight variations. For instance I’ve talked to some companies that had as many as six tiers when they added their remote offices and disaster recovery sites – these were very large users with very large storage requirements who could justify breaking the main three tiers into sub-tiers.

Whether you categorized your storage into three or six tiers, the basic definition of a tier has historically been a collection of storage silos with particular cost and performance attributes that made them appropriate for certain workloads. Recent developments, however, have changed this age-old paradigm:

1) The post-recession economy has driven IT organizations to look for ways to cut costs by improving storage utilization
2) The introduction of the SSD offers intriguing performance but a higher cost than most can afford
3) Evolving storage array intelligence now automates the placement of “hot” data without human intervention

These three events lead to a rebirth of sorts in tiering, in the form of Automated Storage Tiering. This style of tiering allows the use of new components like SSD without breaking the bank. Assuming that for any given workload, a small percentage of data is accessed very frequently, Automated tiering allows the use of high performance components for that data only, while the less-frequently accessed data can be automatically stored on more economical media.

As with any new technology, or in this case a new technique, vendors are approaching automated tiering from different angles. This is good for consumers in the long run (the best implementations will eventually win out) but in the short run creates some confusion when determining which vendor you should align you and your data with.

As a result, automated storage tiering is getting quite a bit of press from vendors and industry analysts alike. For example, here are two pieces that appeared recently:

Information Week Storage Virtualization Tour – All About Automated Tiering
Business Week – Auto Tiering Crucial to Storage Efficiency

SNIA is also interested in helping clear any confusion around automated storage tiering. This week the DPCO committee will host a live webcast panel of tiering vendors to discuss the pros and cons of tiering within the scope of their products, you can register for it here: Sign up

Join this session and learn more about similarities and differences in various tiering implementations. We hope to see some “lively” interaction, so join the tiering discussion and get your questions answered.

See you there!

Larry

PS – If you can’t make this week’s Webcast, we’ll also be recording it and you’ll be able to view it from the DPCO website

What’s Old is New Again: Storage Tiering

Storage tiering is nothing new but then again is all new. Traditionally, tiering meant that you’d buy fast (Tier One) storage arrays, based on 15K Fibre Channel drives, for your really important applications. Next you’d buy some slower (Tier Two) storage arrays, based on SATA drives, for your not-so-important applications. Finally you’d buy a (Tier Three) tape library or VTL to house your backups. This is how most people have accomplished storage tiering for the past couple of decades, with slight variations. For instance I’ve talked to some companies that had as many as six tiers when they added their remote offices and disaster recovery sites – these were very large users with very large storage requirements who could justify breaking the main three tiers into sub-tiers.

Whether you categorized your storage into three or six tiers, the basic definition of a tier has historically been a collection of storage silos with particular cost and performance attributes that made them appropriate for certain workloads. Recent developments, however, have changed this age-old paradigm:

1) The post-recession economy has driven IT organizations to look for ways to cut costs by improving storage utilization
2) The introduction of the SSD offers intriguing performance but a higher cost than most can afford
3) Evolving storage array intelligence now automates the placement of “hot” data without human intervention

These three events lead to a rebirth of sorts in tiering, in the form of Automated Storage Tiering. This style of tiering allows the use of new components like SSD without breaking the bank. Assuming that for any given workload, a small percentage of data is accessed very frequently, Automated tiering allows the use of high performance components for that data only, while the less-frequently accessed data can be automatically stored on more economical media.

As with any new technology, or in this case a new technique, vendors are approaching automated tiering from different angles. This is good for consumers in the long run (the best implementations will eventually win out) but in the short run creates some confusion when determining which vendor you should align you and your data with.

As a result, automated storage tiering is getting quite a bit of press from vendors and industry analysts alike. For example, here are two pieces that appeared recently:

Information Week Storage Virtualization Tour – All About Automated Tiering
Business Week – Auto Tiering Crucial to Storage Efficiency

SNIA is also interested in helping clear any confusion around automated storage tiering. This week the DPCO committee will host a live webcast panel of tiering vendors to discuss the pros and cons of tiering within the scope of their products, you can register for it here: Sign up

Join this session and learn more about similarities and differences in various tiering implementations. We hope to see some “lively” interaction, so join the tiering discussion and get your questions answered.

See you there!

Larry

PS – If you can’t make this week’s Webcast, we’ll also be recording it and you’ll be able to view it from the DPCO website

Plan to Attend Cloud Burst and SDC

Cloud Storage Developers will be Converging on Santa Clara in September for the Storage Developer Conference and the Cloud Burst Event

Cloud Burst Event

There are a multitude of events dedicated to cloud computing, but where can you go to find out specifically about cloud storage? The 2011 SNIA Cloud Burst Summit educates and offers insight into this fast-growing market segment. Come hear from industry luminaries, see live demonstrations, and talk to technology vendors about how to get started with cloud storage.

The audience for the SNIA Cloud Burst Summit is IT storage professionals and related colleagues who are looking to cloud storage as a solution for their IT environments. The day’s agenda will be packed with presentations from cloud industry luminaries, the latest cloud development panel discussions, a focus on cloud backup, and a cocktail networking opportunity in the evening.

Check out the Agenda and Register Today…

 

Storage Developer Conference

The SNIA Storage Developer Conference is the premier event for developers of cloud storage, filesystems and storage technologies. The year there is a full cloud track on the Agenda, as well as some great speakers. Some examples include:

Programming the Cloud

CDMI for Cloud IPC

David Slik
Technical Director,
Object Storage
NetApp

Open Source Droplet Library with CDMI Support

Giorgio Regni
CTO,
Scality

CDMI Federations, Year 2

David Slik
Technical Director,
Object Storage,
NetApp

CDMI Retention Improvements

Priya Nc
Principal Software Engineer,
EMC Data Storage Systems

CDMI Conformance and Performance Testing

David Slik
Technical Director,
Object Storage,
NetApp

Use of Storage Security in the Cloud

David Dodgson
Software Engineer,
Unisys

Authenticating Cloud Storage with Distributed Keys

Jason Resch
Senior Software Engineer,
Cleversafe

Resilience at Scale in the Distributed Storage Cloud

Alma Riska
Consultant Software Engineer,
EMC

Changing Requirements for Distributed File Systems in Cloud Storage

Wesley Leggette
Cleversafe, Inc

Best Practices in Designing Cloud Storage Based Archival Solution

Sreenidhi Iyangar
Senior Technical Lead,
EMC

Tape’s Role in the Cloud

Chris Marsh
Market Development Manager,
Spectra Logic

Two Storage Trails on the 10GbE Convergence Path

As the migration to 10Gb Ethernet moves forward, many data centers are looking to converge network and storage I/O to fully utilize a ten-fold increase in bandwidth.  Industry discussions continue regarding the merits of 10GbE iSCSI and FCoE.  Some of the key benefits of both protocols were presented in an iSCSI SIG webcast that included Maziar Tamadon and Jason Blosil on July 19th: Two Storage Trails on the 10Gb Convergence Path

It’s a win-win solution as both technologies offer significant performance improvements and cost savings.  The discussion is sure to continue.

Since there wasn’t enough time to respond to all of the questions during the webcast, we have consolidated answers to all of them in this blog post from the presentation team.  Feel free to comment and provide your input.

Question: How is multipathing changed or affected with FCoE?

One of the benefits of FCoE is that it uses Fibre Channel in the upper layers of the software stack where multipathing is implemented.  As a result, multipathing is the same for Fibre Channel and FCoE.

Question: Are the use of CNAs with FCoE offload getting any traction?  Are these economically viable?

The adoption of FCoE has been slower than expected, but is gaining momentum.  Fibre Channel is typically used for mission-critical applications so data centers have been cautious about moving to new technologies.   FCoE and network convergence provide significant cost savings, so FCoE is economically viable.

Question: If you run the software FCoE solution would this not prevent boot from SAN?

Boot from SAN is not currently supported when using FCoE with a software initiator and NIC.  Today, boot from SAN is only supported using FCoE with a hardware converged networked adapter (CNA).

Question:  How do you assign priority for FCoE vs. other network traffic.  Doesn’t it still make sense to have a dedicated network for data intensive network use?

Data Center Bridging (DCB) standards that enable FCoE allow priority and bandwidth to be assigned to each priority queue or link.   Each link may support one or more data traffic types. Support for this functionality is required between two end points in the fabric, such as between an initiator at the host with the first network connection at the top of rack switch, as an example. The DCBx Standard facilitates negotiation between devices to enable supported DCB capabilities at each end of the wire.

Question:  Category 6A uses more power that twin-ax or OM3 cable infrastructures, which in large build-outs is significant.

Category 6A does use more power than twin-ax or OM3 cables.  That is one of the trade-offs data centers should consider when evaluating 10GbE network options.

Question: Don’t most enterprise storage arrays support both iSCSI and FC/FCoE ports?  That seems to make the “either/or” approach to measuring uptake moot.

Many storage arrays today support either the iSCSI or FC storage network protocol. Some arrays support both at the same time. Very few support FCoE. And some others support a mixture of file and block storage protocols, often called Unified Storage. But, concurrent support for FC/FCoE and iSCSI on the same array is not universal.

Regardless, storage administrators will typically favor a specific storage protocol based upon their acquired skill sets and application requirements. This is especially true with block storage protocols since the underlying hardware is unique (FC, Ethernet, or even Infiniband). With the introduction of data center bridging and FCoE, storage administrators can deploy a single physical infrastructure to support the variety of application requirements of their organization. Protocol attach rates will likely prove less interesting as more vendors begin to offer solutions supporting full network convergence.

Question: I am wondering what is the sample size of your poll results, how many people voted?

We had over 60 live viewers of the webcast and over 50% of them participated in the online questions. So, the sample size was about 30+ individuals.

Question: Tape? Isn’t tape dead?

Tape as a backup methodology is definitely on the downward slope of its life than it was 5 or 10 years ago, but it still has a pulse. Expectations are that disk based backup, DR, and archive solutions will be common practice in the near future. But, many companies still use tape for archival storage. Like most declining technologies, tape will likely have a long tail as companies continue to modify their IT infrastructure and business practices to take advantage of newer methods of data retention.

Question: Do you not think 10 Gbps will fall off after 2015 as the adoption of 40 Gbps to blade enclosures will start to take off in 2012?

10GbE was expected to ramp much faster than what we have witnessed. Early applications of 10GbE in storage were introduced as early as 2006. Yet, we are only now beginning to see more broad adoption of 10GbE. The use of LOM and 10GBaseT will accelerate the use of 10GbE.

Early server adoption of 40GbE will likely be with blades. However, recognize that rack servers still outsell blades by a pretty large margin. As a result, 10GbE will continue to grow in adoption through 2015 and perhaps 2016. 40GbE will become very useful to reduce port count, especially at bandwidth aggregation points, such as inter-switch links. 40Gb ports may also be used to save on port count with the use of fanout cables (4x10Gb). However, server performance must continue to increase in order to be able to drive 40Gb pipes.

Question: Will you be making these slides available for download?

These slides are available for download at www.snia.org/?

Question: What is your impression of how convergence will change data center expertise?  That is, who manages the converged network?  Your storage experts, your network experts, someone new?

Network Convergence will indeed bring multiple teams together across the IT organization: server team, network team, and storage team to name a few. There is no preset answer, and the outcome will be on a case by case basis, but ultimately IT organizations will need to figure out how a common, shared resource (the network/fabric) ought to be managed and where the new ownership boundaries would need to be drawn.

Question: Will there be or is there currently a NDMP equivalent for iSCSI or 10GbE?

There is no equivalent to NDMP for iSCSI. NDMP is a management protocol used to backup server data to network storage devices using NFS or CIFS. SNIA oversees the development of this protocol today.

Question: How does the presenter justify the statement of “no need for specialized” knowledge or tools?  Given how iSCSI uses new protocols and concepts not found in traditional LAN, how could he say that?

While it’s true that iSCSI comes with its own concepts and subtleties, the point being made centered around how pervasive and widespread the underlying Ethernet know-how and expertise is.

Question: FC vs IP storage. What does IDC count if the array has both FC and IP storage which group does it go in. If a customer buys an array but does not use one of the two protocols will that show up in IDC numbers? This info conflicts SNIA’s numbers.

We can’t speak to the exact methods used to generate the analyst data. Each analyst firm has their own method for collecting and analyzing industry data. The reason for including the data was to discuss the overall industry trends.

Question: I noticed in the high-level overview that FCoE appeared not to be a ‘mesh’ network. How will this deal w/multipathing and/or failover?

The diagrams only showed a single path for FCoE to simplify the discussion on network convergence.  In a real-world, best-practices deployment there would be multiple paths with failover.   FCoE uses the same multipathing and failover capabilities that are available for Fibre Channel.

Question: Why are you including FCoE in IP-based storage?

The graph should indeed have read Ethernet storage rather than IP storage. This was fixed after the webinar and before the presentation got posted on SNIA’s website.

Trends in Data Protection

Data protection hasn’t changed much in a long time.  Sure, there are slews of product announcements and incessant declarations of the “next big thing”, but really, how much have market shares really changed over the past decade?  You’ve got to wonder if new technology is fundamentally improving how data is protected or is simply turning the crank to the next model year.  Are customers locked into the incremental changes proffered by traditional backup vendors or is there a better way?

Not going to take it anymore

The major force driving change in the industry has little to do with technology.  People have started to challenge the notion that they, not the computing system, should be responsible for ensuring the integrity of their data.  If they want a prior version of their data, why can’t the system simply provide it?   In essence, customers want to rely on a computing system that just works.  The Howard Beale anchorman in the movie Network personifies the anxiety that burdens customers explicitly managing backups, recoveries, and disaster recovery.  Now don’t get me wrong; it is critical to minimize risk and manage expectations.   But the focus should be on delivering data protection solutions that can simply be ignored.

Are you just happy to see me?

The personal computer user is prone to ask “how hard can it be to copy data?”  Ignoring the fact that many such users lose data on a regular basis because they have failed to protect their data at all, the IT professional is well aware of the intricacies of application consistency, the constraints of backup windows, the demands of service levels and scale, and the efficiencies demanded by affordability.    You can be sure that application users that have recovered lost or corrupted data are relieved.  Mae West, posing as a backup administrator, might have said “Is that a LUN in your pocket or are you just happy to see me?”

In the beginning

Knowing where the industry has been is a good step in knowing where the industry is going.  When the mainframe was young, application developers carried paper tape or punch cards.  Magnetic tape was used to store application data as well as a media to copy it to. Over time, as magnetic disk became affordable for primary data, the economics of magnetic tape remained compelling as a backup media.  Data protection was incorporated into the operating system through backup/recovery facilities, as well as through 3rd party products.

As microprocessors led computing mainstream, non-mainframe computing systems gained prominence and tape became relegated to secondary storage.  Native, open source, and commercial backup and recovery utilities stored backup and archive copies on tape media and leveraged its portability to implement disaster recovery plans.  Data compression increased the effective capacity of tape media and complemented its power consumption efficiency.

All quiet on the western front

Backup to tape became the dominant methodology for protecting application data due to its affordability and portability.  Tape was used as the backup media for application and server utilities, storage system tools, and backup applications.

B2T

Backup Server copies data from primary disk storage to tape media

Customers like the certainty of knowing where their backup copies are and physical tapes are comforting in this respect.  However, the sequential access nature of the media and indirect visibility into what’s on each tape led to difficulties satisfying recovery time objectives.  Like the soldier who fights battles that seem to have little overall significance, the backup administrator slogs through a routine, hoping the company’s valuable data is really protected.

B2D phase 1

Backup Server copies data to a Virtual Tape Library

Uncomfortable with problematic recovery from tape, customers have been evolving their practices to a backup to disk model.  Backup to disk and then to tape was one model designed to offset the higher cost of disk media but can increase the uncertainty of what’s on tape.  Another was to use virtual tape libraries to gain the direct access benefits of disk while minimizing changes in their current tape-based backup practices.  Both of these techniques helped improve recovery time but still required the backup administrator to acquire, use, and maintain a separate backup server to copy the data to the backup media.

Snap out of it!

Space-efficient snapshots offered an alternative data protection solution for some file servers. Rather than use separate media to store copies of data, the primary storage system itself would be used to maintain multiple versions of the data by only saving changes to the data.  As long as the storage system was intact, restoration of prior versions was rapid and easy.  Versions could also be replicated between two storage systems to protect the data should one of the file servers become inaccessible.

snapshot

Point in Time copies on disk storage are replicated to other disks

This procedure works, is fast, and is space efficient for data on these file servers but has challenges in terms of management and scale.  Snapshot based approaches manage versions of snapshots; they lack the ability to manage data protection at the file level.  This limitation arises because the customer’s data protection policies may not match the storage system policies.  Snapshot based approaches are also constrained by the scope of each storage system so scaling to protect all the data in a company (e.g., laptops) in a uniform and centralized (labor-efficient) manner is problematic at best.

CDP

Writes are captured and replicated for protection

Continuous Data Protection (both “near CDP” solutions which take frequent snapshots and “true CDP” solutions which continuously capture writes) is also being used to eliminate the backup window thereby ensuring large volumes of data can be protected.  However, the expense and maturity of CDP needs to be balanced with the value of “keeping everything”.

 

 

An offer he can’t refuse

Data deduplication fundamentally changed the affordability of using disk as a backup media.  The effective cost of storing data declined because duplicate data need only be stored once. Coupled with the ability to rapidly access individual objects, the advantages of backing up data to deduplicated storage are overwhelmingly compelling.  Originally, the choice of whether to deduplicate data at the source or target was a decision point but more recent offerings offer both approaches so customers need not compromise on technology.  However, simply using deduplicated storage as a backup target does not remove the complexity of configuring and supporting a data protection solution that spans independent software and hardware products.  Is it really necessary that additional backup servers be installed to support business growth?  Is it too much to ask for a turnkey solution that can address the needs of a large enterprise?

The stuff that dreams are made of

 

PBBA

Transformation from a Backup Appliance to a Recovery Platform

Protection storage offers an end-to-end solution, integrating full-function data protection capabilities with deduplicated storage.  The simplicity and efficiency of application-centric data protection combined with the scale and performance of capacity-optimized storage systems stands to fundamentally alter the traditional backup market.  Changed data is copied directly between the source and the target, without intervening backup servers.  Cloud storage may also be used as a cost-effective target.  Leveraging integrated software and hardware for what each does best allows vendors to offer innovations to customers in a manner that lowers their total cost of ownership.  Innovations like automatic configuration, dynamic optimization, and using preferred management interfaces (e.g., virtualization consoles, pod managers) build on the proven practices of the past to integrate data protection into the customer’s information infrastructure.

No one wants to be locked into products because they are too painful to switch out; it’s time that products are “sticky” because they offer compelling solutions.  IDC projects that the worldwide purpose-built backup appliance (PBBA) market will grow 16.6% from $1.7 billion in 2010 to $3.6 billion by 2015.  The industry is rapidly adopting PBBAs to overcome the data protection challenges associated with data growth.  Looking forward, storage systems will be expected to incorporate a recovery platform, supporting security and compliance obligations, and data protection solutions will become information brokers for what is stored on disk.

Trends in Data Protection

Data protection hasn’t changed much in a long time.  Sure, there are slews of product announcements and incessant declarations of the “next big thing”, but really, how much have market shares really changed over the past decade?  You’ve got to wonder if new technology is fundamentally improving how data is protected or is simply turning the crank to the next model year.  Are customers locked into the incremental changes proffered by traditional backup vendors or is there a better way?

Not going to take it anymore

The major force driving change in the industry has little to do with technology.  People have started to challenge the notion that they, not the computing system, should be responsible for ensuring the integrity of their data.  If they want a prior version of their data, why can’t the system simply provide it?   In essence, customers want to rely on a computing system that just works.  The Howard Beale anchorman in the movie Network personifies the anxiety that burdens customers explicitly managing backups, recoveries, and disaster recovery.  Now don’t get me wrong; it is critical to minimize risk and manage expectations.   But the focus should be on delivering data protection solutions that can simply be ignored.

Are you just happy to see me?

The personal computer user is prone to ask “how hard can it be to copy data?”  Ignoring the fact that many such users lose data on a regular basis because they have failed to protect their data at all, the IT professional is well aware of the intricacies of application consistency, the constraints of backup windows, the demands of service levels and scale, and the efficiencies demanded by affordability.    You can be sure that application users that have recovered lost or corrupted data are relieved.  Mae West, posing as a backup administrator, might have said “Is that a LUN in your pocket or are you just happy to see me?”

In the beginning

Knowing where the industry has been is a good step in knowing where the industry is going.  When the mainframe was young, application developers carried paper tape or punch cards.  Magnetic tape was used to store application data as well as a media to copy it to. Over time, as magnetic disk became affordable for primary data, the economics of magnetic tape remained compelling as a backup media.  Data protection was incorporated into the operating system through backup/recovery facilities, as well as through 3rd party products.

As microprocessors led computing mainstream, non-mainframe computing systems gained prominence and tape became relegated to secondary storage.  Native, open source, and commercial backup and recovery utilities stored backup and archive copies on tape media and leveraged its portability to implement disaster recovery plans.  Data compression increased the effective capacity of tape media and complemented its power consumption efficiency.

All quiet on the western front

Backup to tape became the dominant methodology for protecting application data due to its affordability and portability.  Tape was used as the backup media for application and server utilities, storage system tools, and backup applications.

B2T

Backup Server copies data from primary disk storage to tape media

Customers like the certainty of knowing where their backup copies are and physical tapes are comforting in this respect.  However, the sequential access nature of the media and indirect visibility into what’s on each tape led to difficulties satisfying recovery time objectives.  Like the soldier who fights battles that seem to have little overall significance, the backup administrator slogs through a routine, hoping the company’s valuable data is really protected.

B2D phase 1

Backup Server copies data to a Virtual Tape Library

Uncomfortable with problematic recovery from tape, customers have been evolving their practices to a backup to disk model.  Backup to disk and then to tape was one model designed to offset the higher cost of disk media but can increase the uncertainty of what’s on tape.  Another was to use virtual tape libraries to gain the direct access benefits of disk while minimizing changes in their current tape-based backup practices.  Both of these techniques helped improve recovery time but still required the backup administrator to acquire, use, and maintain a separate backup server to copy the data to the backup media.

Snap out of it!

Space-efficient snapshots offered an alternative data protection solution for some file servers. Rather than use separate media to store copies of data, the primary storage system itself would be used to maintain multiple versions of the data by only saving changes to the data.  As long as the storage system was intact, restoration of prior versions was rapid and easy.  Versions could also be replicated between two storage systems to protect the data should one of the file servers become inaccessible.

snapshot

Point in Time copies on disk storage are replicated to other disks

This procedure works, is fast, and is space efficient for data on these file servers but has challenges in terms of management and scale.  Snapshot based approaches manage versions of snapshots; they lack the ability to manage data protection at the file level.  This limitation arises because the customer’s data protection policies may not match the storage system policies.  Snapshot based approaches are also constrained by the scope of each storage system so scaling to protect all the data in a company (e.g., laptops) in a uniform and centralized (labor-efficient) manner is problematic at best.

CDP

Writes are captured and replicated for protection

Continuous Data Protection (both “near CDP” solutions which take frequent snapshots and “true CDP” solutions which continuously capture writes) is also being used to eliminate the backup window thereby ensuring large volumes of data can be protected.  However, the expense and maturity of CDP needs to be balanced with the value of “keeping everything”.

 

 

An offer he can’t refuse

Data deduplication fundamentally changed the affordability of using disk as a backup media.  The effective cost of storing data declined because duplicate data need only be stored once. Coupled with the ability to rapidly access individual objects, the advantages of backing up data to deduplicated storage are overwhelmingly compelling.  Originally, the choice of whether to deduplicate data at the source or target was a decision point but more recent offerings offer both approaches so customers need not compromise on technology.  However, simply using deduplicated storage as a backup target does not remove the complexity of configuring and supporting a data protection solution that spans independent software and hardware products.  Is it really necessary that additional backup servers be installed to support business growth?  Is it too much to ask for a turnkey solution that can address the needs of a large enterprise?

The stuff that dreams are made of

 

PBBA

Transformation from a Backup Appliance to a Recovery Platform

Protection storage offers an end-to-end solution, integrating full-function data protection capabilities with deduplicated storage.  The simplicity and efficiency of application-centric data protection combined with the scale and performance of capacity-optimized storage systems stands to fundamentally alter the traditional backup market.  Changed data is copied directly between the source and the target, without intervening backup servers.  Cloud storage may also be used as a cost-effective target.  Leveraging integrated software and hardware for what each does best allows vendors to offer innovations to customers in a manner that lowers their total cost of ownership.  Innovations like automatic configuration, dynamic optimization, and using preferred management interfaces (e.g., virtualization consoles, pod managers) build on the proven practices of the past to integrate data protection into the customer’s information infrastructure.

No one wants to be locked into products because they are too painful to switch out; it’s time that products are “sticky” because they offer compelling solutions.  IDC projects that the worldwide purpose-built backup appliance (PBBA) market will grow 16.6% from $1.7 billion in 2010 to $3.6 billion by 2015.  The industry is rapidly adopting PBBAs to overcome the data protection challenges associated with data growth.  Looking forward, storage systems will be expected to incorporate a recovery platform, supporting security and compliance obligations, and data protection solutions will become information brokers for what is stored on disk.

CSI Quarterly Update Q3 2011

A Message from
SNIA Links:

Follow SNIA:
Linkedin
Twitter
Facebook

SNIA Blogs:

Cloud Storage Initiative

Upcoming Activities

Get Involved Now!

A limited number of these activities are open to all, or Join SNIA and the CSI to participate in any of these activities

July Cloud Plugfest

The purpose of the Cloud Plugfest is for vendors to bring their implementations of CDMI and OCCI to test, identify, and fix bugs in a collaborative setting with the goal of providing a forum in which companies can develop interoperable products.

The Cloud Plugfest starts on Tuesday July 12 and runs thru Thursday July 14, 2011 at the SNIA Technology Center in Colorado Springs, CO.  The SNIA Cloud Storage Initiative (CSI) is underwriting the costs of the event, therefore there is no participation fee.

More Information

SNIA Cloud Burst Event

There are a multitude of events dedicated to cloud computing, but where can you go to find out specifically about cloud storage? The 2011 SNIA Cloud Burst Summit educates and offers insight into this fast–growing market segment. Come hear from industry luminaries, see live demonstrations, and talk to technology vendors about how to get started with cloud storage.

More information

Cloud Lab Plugfest at SDC

Plugfests have always been an important part of the Storage Developers Conference and this year will be the first Cloud Lab Plugfest event held over multiple days to test the interoperability of CDMI, OVF and OCCI implementations.

To get involved, please contact: arnold@snia.org

Cloud Pavilion at SNW

Every SNW, one of highlights is the Cloud Pavilion where attendees can see public and private cloud offerings and discuss solutions. Space is limited, so get involved early to ensure your spot.

To get involved, please contact: lisa.mercurio@snia.org