What’s Old is New Again: Storage Tiering

Storage tiering is nothing new, but then again it is all new. Traditionally, tiering meant that you’d buy fast (Tier One) storage arrays, based on 15K Fibre Channel drives, for your really important applications. Next you’d buy some slower (Tier Two) storage arrays, based on SATA drives, for your not-so-important applications. Finally you’d buy a (Tier Three) tape library or VTL to house your backups. This is how most people have accomplished storage tiering for the past couple of decades, with slight variations. For instance, I’ve talked to some companies that had as many as six tiers once they added their remote offices and disaster recovery sites – these were very large users with very large storage requirements who could justify breaking the main three tiers into sub-tiers.

Whether you categorized your storage into three or six tiers, the basic definition of a tier has historically been a collection of storage silos with particular cost and performance attributes that made them appropriate for certain workloads. Recent developments, however, have changed this age-old paradigm:

1) The post-recession economy has driven IT organizations to look for ways to cut costs by improving storage utilization
2) The introduction of the SSD offers intriguing performance but a higher cost than most can afford
3) Evolving storage array intelligence now automates the placement of “hot” data without human intervention

These three developments have led to a rebirth of sorts in tiering, in the form of Automated Storage Tiering. This style of tiering allows the use of new components like SSD without breaking the bank. Assuming that, for any given workload, a small percentage of data is accessed very frequently, automated tiering reserves high-performance components for that data only, while the less frequently accessed data is automatically stored on more economical media.
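To make the idea concrete, here is a minimal sketch of the placement decision, assuming the array tracks accesses per extent and simply promotes the most frequently accessed extents until the SSD tier is full. The function name `place_extents`, the extent labels, and the capacity figure are hypothetical; real arrays use their own heuristics and time windows.

```python
from collections import Counter

def place_extents(access_log, ssd_capacity):
    """Split extents into (ssd, hdd) sets; the hottest extents go to SSD.

    access_log   -- iterable of extent ids, one entry per access (assumed input)
    ssd_capacity -- number of extents the SSD tier can hold (assumed unit)
    """
    counts = Counter(access_log)
    # Rank by descending access count; break ties by extent id for determinism.
    ranked = sorted(counts, key=lambda extent: (-counts[extent], extent))
    ssd = set(ranked[:ssd_capacity])
    hdd = set(ranked[ssd_capacity:])
    return ssd, hdd

# Extent "a" is accessed three times and "b" twice, so they win the two SSD slots.
log = ["a", "b", "a", "c", "a", "b", "d"]
ssd, hdd = place_extents(log, ssd_capacity=2)
```

Rerunning the placement periodically on a fresh access log is what lets data migrate between tiers without human intervention as it heats up or cools down.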

As with any new technology, or in this case a new technique, vendors are approaching automated tiering from different angles. This is good for consumers in the long run (the best implementations will eventually win out) but in the short run creates some confusion when deciding which vendor to trust with your data.

As a result, automated storage tiering is getting quite a bit of press from vendors and industry analysts alike. For example, here are two pieces that appeared recently:

Information Week Storage Virtualization Tour – All About Automated Tiering
Business Week – Auto Tiering Crucial to Storage Efficiency

SNIA is also interested in helping clear up any confusion around automated storage tiering. This week the DPCO committee will host a live webcast panel of tiering vendors to discuss the pros and cons of tiering within the scope of their products. You can register for it here: Sign up

Join this session and learn more about similarities and differences in various tiering implementations. We hope to see some “lively” interaction, so join the tiering discussion and get your questions answered.

See you there!

Larry

PS – If you can’t make this week’s Webcast, we’ll also be recording it and you’ll be able to view it from the DPCO website

Plan to Attend Cloud Burst and SDC

Cloud Storage Developers will be Converging on Santa Clara in September for the Storage Developer Conference and the Cloud Burst Event

Cloud Burst Event

There are a multitude of events dedicated to cloud computing, but where can you go to find out specifically about cloud storage? The 2011 SNIA Cloud Burst Summit educates and offers insight into this fast-growing market segment. Come hear from industry luminaries, see live demonstrations, and talk to technology vendors about how to get started with cloud storage.

The audience for the SNIA Cloud Burst Summit is IT storage professionals and related colleagues who are looking to cloud storage as a solution for their IT environments. The day’s agenda will be packed with presentations from cloud industry luminaries, the latest cloud development panel discussions, a focus on cloud backup, and a cocktail networking opportunity in the evening.

Check out the Agenda and Register Today…

 

Storage Developer Conference

The SNIA Storage Developer Conference is the premier event for developers of cloud storage, filesystems and storage technologies. This year there is a full cloud track on the agenda, as well as some great speakers. Some examples include:

Programming the Cloud

CDMI for Cloud IPC

David Slik
Technical Director,
Object Storage
NetApp

Open Source Droplet Library with CDMI Support

Giorgio Regni
CTO,
Scality

CDMI Federations, Year 2

David Slik
Technical Director,
Object Storage,
NetApp

CDMI Retention Improvements

Priya Nc
Principal Software Engineer,
EMC Data Storage Systems

CDMI Conformance and Performance Testing

David Slik
Technical Director,
Object Storage,
NetApp

Use of Storage Security in the Cloud

David Dodgson
Software Engineer,
Unisys

Authenticating Cloud Storage with Distributed Keys

Jason Resch
Senior Software Engineer,
Cleversafe

Resilience at Scale in the Distributed Storage Cloud

Alma Riska
Consultant Software Engineer,
EMC

Changing Requirements for Distributed File Systems in Cloud Storage

Wesley Leggette
Cleversafe, Inc

Best Practices in Designing Cloud Storage Based Archival Solution

Sreenidhi Iyangar
Senior Technical Lead,
EMC

Tape’s Role in the Cloud

Chris Marsh
Market Development Manager,
Spectra Logic

Two Storage Trails on the 10GbE Convergence Path

As the migration to 10Gb Ethernet moves forward, many data centers are looking to converge network and storage I/O to fully utilize a ten-fold increase in bandwidth.  Industry discussions continue regarding the merits of 10GbE iSCSI and FCoE.  Some of the key benefits of both protocols were presented in an iSCSI SIG webcast that included Maziar Tamadon and Jason Blosil on July 19th: Two Storage Trails on the 10Gb Convergence Path

It’s a win-win solution as both technologies offer significant performance improvements and cost savings.  The discussion is sure to continue.

Since there wasn’t enough time to respond to all of the questions during the webcast, we have consolidated answers to all of them in this blog post from the presentation team.  Feel free to comment and provide your input.

Question: How is multipathing changed or affected with FCoE?

One of the benefits of FCoE is that it uses Fibre Channel in the upper layers of the software stack where multipathing is implemented.  As a result, multipathing is the same for Fibre Channel and FCoE.

Question: Is the use of CNAs with FCoE offload getting any traction? Is it economically viable?

The adoption of FCoE has been slower than expected, but is gaining momentum.  Fibre Channel is typically used for mission-critical applications so data centers have been cautious about moving to new technologies.   FCoE and network convergence provide significant cost savings, so FCoE is economically viable.

Question: If you run the software FCoE solution would this not prevent boot from SAN?

Boot from SAN is not currently supported when using FCoE with a software initiator and NIC. Today, boot from SAN is only supported using FCoE with a hardware converged network adapter (CNA).

Question: How do you assign priority for FCoE vs. other network traffic? Doesn’t it still make sense to have a dedicated network for data-intensive network use?

The Data Center Bridging (DCB) standards that enable FCoE allow priority and bandwidth to be assigned to each priority queue or link. Each link may support one or more data traffic types. Support for this functionality is required at both end points of a link in the fabric, for example between an initiator at the host and its first network connection at the top-of-rack switch. The DCBx standard facilitates negotiation between devices to enable the supported DCB capabilities at each end of the wire.
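As a rough illustration of what assigning bandwidth per traffic class means, the sketch below splits a 10Gb link by percentage, in the spirit of DCB's Enhanced Transmission Selection (IEEE 802.1Qaz). The class names and percentages are hypothetical and were not part of the webcast; actual switch and adapter configuration is vendor specific.

```python
def guaranteed_bandwidth(link_gbps, shares):
    """Return the minimum bandwidth in Gbps guaranteed to each traffic class.

    shares -- dict mapping a traffic class name to its percentage of the link
              (hypothetical classes; the percentages must total 100)
    """
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100 percent")
    return {name: link_gbps * pct / 100 for name, pct in shares.items()}

# Hypothetical split of one converged 10GbE link.
alloc = guaranteed_bandwidth(10, {"fcoe": 50, "lan": 30, "iscsi": 20})
```

These shares are minimum guarantees under contention rather than hard caps: when one class is idle, the others can use its bandwidth, which is part of the argument against keeping a dedicated storage network.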

Question: Category 6A uses more power than twin-ax or OM3 cable infrastructures, which in large build-outs is significant.

Category 6A does use more power than twin-ax or OM3 cables.  That is one of the trade-offs data centers should consider when evaluating 10GbE network options.

Question: Don’t most enterprise storage arrays support both iSCSI and FC/FCoE ports?  That seems to make the “either/or” approach to measuring uptake moot.

Many storage arrays today support either the iSCSI or FC storage network protocol. Some arrays support both at the same time. Very few support FCoE. And some others support a mixture of file and block storage protocols, often called Unified Storage. But, concurrent support for FC/FCoE and iSCSI on the same array is not universal.

Regardless, storage administrators will typically favor a specific storage protocol based upon their acquired skill sets and application requirements. This is especially true with block storage protocols since the underlying hardware is unique (FC, Ethernet, or even Infiniband). With the introduction of data center bridging and FCoE, storage administrators can deploy a single physical infrastructure to support the variety of application requirements of their organization. Protocol attach rates will likely prove less interesting as more vendors begin to offer solutions supporting full network convergence.

Question: I am wondering what is the sample size of your poll results, how many people voted?

We had over 60 live viewers of the webcast and over 50% of them participated in the online questions. So the sample size was a little more than 30 individuals.

Question: Tape? Isn’t tape dead?

Tape as a backup methodology is definitely further down the slope of its life than it was 5 or 10 years ago, but it still has a pulse. Expectations are that disk-based backup, DR, and archive solutions will be common practice in the near future. But many companies still use tape for archival storage. Like most declining technologies, tape will likely have a long tail as companies continue to modify their IT infrastructure and business practices to take advantage of newer methods of data retention.

Question: Do you not think 10 Gbps will fall off after 2015 as the adoption of 40 Gbps to blade enclosures will start to take off in 2012?

10GbE was expected to ramp much faster than what we have witnessed. Early applications of 10GbE in storage were introduced as early as 2006. Yet, we are only now beginning to see more broad adoption of 10GbE. The use of LOM and 10GBaseT will accelerate the use of 10GbE.

Early server adoption of 40GbE will likely be with blades. However, recognize that rack servers still outsell blades by a pretty large margin. As a result, 10GbE will continue to grow in adoption through 2015 and perhaps 2016. 40GbE will become very useful to reduce port count, especially at bandwidth aggregation points, such as inter-switch links. 40Gb ports may also be used to save on port count with the use of fanout cables (4x10Gb). However, server performance must continue to increase in order to be able to drive 40Gb pipes.

Question: Will you be making these slides available for download?

These slides are available for download at www.snia.org/?

Question: What is your impression of how convergence will change data center expertise?  That is, who manages the converged network?  Your storage experts, your network experts, someone new?

Network Convergence will indeed bring multiple teams together across the IT organization: server team, network team, and storage team to name a few. There is no preset answer, and the outcome will be on a case by case basis, but ultimately IT organizations will need to figure out how a common, shared resource (the network/fabric) ought to be managed and where the new ownership boundaries would need to be drawn.

Question: Will there be or is there currently a NDMP equivalent for iSCSI or 10GbE?

There is no equivalent to NDMP for iSCSI. NDMP is a management protocol used to back up server data to network storage devices using NFS or CIFS. SNIA oversees the development of this protocol today.

Question: How does the presenter justify the statement of “no need for specialized” knowledge or tools?  Given how iSCSI uses new protocols and concepts not found in traditional LAN, how could he say that?

While it’s true that iSCSI comes with its own concepts and subtleties, the point being made centered around how pervasive and widespread the underlying Ethernet know-how and expertise is.

Question: FC vs. IP storage: what does IDC count if the array has both FC and IP storage? Which group does it go in? If a customer buys an array but does not use one of the two protocols, will that show up in IDC numbers? This info conflicts with SNIA’s numbers.

We can’t speak to the exact methods used to generate the analyst data. Each analyst firm has their own method for collecting and analyzing industry data. The reason for including the data was to discuss the overall industry trends.

Question: I noticed in the high-level overview that FCoE appeared not to be a ‘mesh’ network. How will this deal w/multipathing and/or failover?

The diagrams only showed a single path for FCoE to simplify the discussion on network convergence.  In a real-world, best-practices deployment there would be multiple paths with failover.   FCoE uses the same multipathing and failover capabilities that are available for Fibre Channel.

Question: Why are you including FCoE in IP-based storage?

The graph should indeed have read Ethernet storage rather than IP storage. This was fixed after the webinar and before the presentation got posted on SNIA’s website.

Trends in Data Protection

Data protection hasn’t changed much in a long time.  Sure, there are slews of product announcements and incessant declarations of the “next big thing”, but really, how much have market shares really changed over the past decade?  You’ve got to wonder if new technology is fundamentally improving how data is protected or is simply turning the crank to the next model year.  Are customers locked into the incremental changes proffered by traditional backup vendors or is there a better way?

Not going to take it anymore

The major force driving change in the industry has little to do with technology.  People have started to challenge the notion that they, not the computing system, should be responsible for ensuring the integrity of their data.  If they want a prior version of their data, why can’t the system simply provide it?   In essence, customers want to rely on a computing system that just works.  The Howard Beale anchorman in the movie Network personifies the anxiety that burdens customers explicitly managing backups, recoveries, and disaster recovery.  Now don’t get me wrong; it is critical to minimize risk and manage expectations.   But the focus should be on delivering data protection solutions that can simply be ignored.

Are you just happy to see me?

The personal computer user is prone to ask “how hard can it be to copy data?”  Ignoring the fact that many such users lose data on a regular basis because they have failed to protect their data at all, the IT professional is well aware of the intricacies of application consistency, the constraints of backup windows, the demands of service levels and scale, and the efficiencies demanded by affordability.    You can be sure that application users that have recovered lost or corrupted data are relieved.  Mae West, posing as a backup administrator, might have said “Is that a LUN in your pocket or are you just happy to see me?”

In the beginning

Knowing where the industry has been is a good step in knowing where the industry is going. When the mainframe was young, application developers carried paper tape or punch cards. Magnetic tape was used both to store application data and as a medium to copy it to. Over time, as magnetic disk became affordable for primary data, the economics of magnetic tape remained compelling as a backup medium. Data protection was incorporated into the operating system through backup/recovery facilities, as well as through 3rd party products.

As microprocessors led computing mainstream, non-mainframe computing systems gained prominence and tape became relegated to secondary storage.  Native, open source, and commercial backup and recovery utilities stored backup and archive copies on tape media and leveraged its portability to implement disaster recovery plans.  Data compression increased the effective capacity of tape media and complemented its power consumption efficiency.

All quiet on the western front

Backup to tape became the dominant methodology for protecting application data due to its affordability and portability.  Tape was used as the backup media for application and server utilities, storage system tools, and backup applications.

B2T

Backup Server copies data from primary disk storage to tape media

Customers like the certainty of knowing where their backup copies are and physical tapes are comforting in this respect.  However, the sequential access nature of the media and indirect visibility into what’s on each tape led to difficulties satisfying recovery time objectives.  Like the soldier who fights battles that seem to have little overall significance, the backup administrator slogs through a routine, hoping the company’s valuable data is really protected.

B2D phase 1

Backup Server copies data to a Virtual Tape Library

Uncomfortable with problematic recovery from tape, customers have been evolving their practices to a backup to disk model.  Backup to disk and then to tape was one model designed to offset the higher cost of disk media but can increase the uncertainty of what’s on tape.  Another was to use virtual tape libraries to gain the direct access benefits of disk while minimizing changes in their current tape-based backup practices.  Both of these techniques helped improve recovery time but still required the backup administrator to acquire, use, and maintain a separate backup server to copy the data to the backup media.

Snap out of it!

Space-efficient snapshots offered an alternative data protection solution for some file servers. Rather than use separate media to store copies of data, the primary storage system itself would be used to maintain multiple versions of the data by only saving changes to the data.  As long as the storage system was intact, restoration of prior versions was rapid and easy.  Versions could also be replicated between two storage systems to protect the data should one of the file servers become inaccessible.
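The "only saving changes" idea can be sketched with a toy copy-on-write volume. This is a minimal illustration under simplifying assumptions (an in-memory dictionary of blocks, a hypothetical `Volume` class), not how any particular storage system implements snapshots.

```python
class Volume:
    """Toy volume with copy-on-write snapshots (illustrative only)."""

    def __init__(self):
        self.blocks = {}     # live data: block number -> bytes
        self.snapshots = []  # per snapshot: block number -> contents before first overwrite

    def snapshot(self):
        # Taking a snapshot copies nothing; it only opens an empty change record.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, block, data):
        if self.snapshots:
            saved = self.snapshots[-1]
            if block not in saved:
                # Preserve the old contents once (None if the block did not exist yet).
                saved[block] = self.blocks.get(block)
        self.blocks[block] = data

    def read_snapshot(self, snap_id, block):
        # The first record at or after snap_id that preserved this block holds
        # its contents as of snap_id; otherwise the block is unchanged since then.
        for saved in self.snapshots[snap_id:]:
            if block in saved:
                return saved[block]
        return self.blocks.get(block)

vol = Volume()
vol.write(0, b"v1")
s0 = vol.snapshot()
vol.write(0, b"v2")
s1 = vol.snapshot()
vol.write(0, b"v3")
old = vol.read_snapshot(s0, 0)    # contents of block 0 when s0 was taken
newer = vol.read_snapshot(s1, 0)  # contents of block 0 when s1 was taken
```

Each snapshot consumes space only for blocks overwritten after it was taken, which is why restoring a prior version is quick as long as the primary storage system itself survives.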

snapshot

Point in Time copies on disk storage are replicated to other disks

This procedure works, is fast, and is space efficient for data on these file servers but has challenges in terms of management and scale.  Snapshot based approaches manage versions of snapshots; they lack the ability to manage data protection at the file level.  This limitation arises because the customer’s data protection policies may not match the storage system policies.  Snapshot based approaches are also constrained by the scope of each storage system so scaling to protect all the data in a company (e.g., laptops) in a uniform and centralized (labor-efficient) manner is problematic at best.

CDP

Writes are captured and replicated for protection

Continuous Data Protection (both “near CDP” solutions, which take frequent snapshots, and “true CDP” solutions, which continuously capture writes) is also being used to eliminate the backup window, thereby ensuring large volumes of data can be protected. However, the expense and maturity of CDP need to be balanced against the value of “keeping everything”.
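A “true CDP” approach can be pictured as a write journal that is replayed up to whatever moment you want to recover. The sketch below assumes integer timestamps and an in-memory journal; the `WriteJournal` class and its methods are hypothetical names used only for illustration.

```python
class WriteJournal:
    """Toy continuous-data-protection journal (illustrative only)."""

    def __init__(self):
        self.entries = []  # (timestamp, block, data), appended in time order

    def record(self, timestamp, block, data):
        # Every write is captured as it happens, so there is no backup window.
        self.entries.append((timestamp, block, data))

    def restore(self, as_of):
        # Replay captured writes up to and including the requested point in time.
        image = {}
        for timestamp, block, data in self.entries:
            if timestamp > as_of:
                break
            image[block] = data
        return image

journal = WriteJournal()
journal.record(1, 0, b"a")
journal.record(2, 1, b"b")
journal.record(3, 0, b"c")
before_last_write = journal.restore(as_of=2)
latest = journal.restore(as_of=3)
```

The journal grows with every write, which is the cost side of “keeping everything”: retention has to be weighed against how far back anyone realistically needs to recover.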

 

 

An offer he can’t refuse

Data deduplication fundamentally changed the affordability of using disk as a backup medium. The effective cost of storing data declined because duplicate data need only be stored once. Coupled with the ability to rapidly access individual objects, the advantages of backing up data to deduplicated storage are overwhelmingly compelling. Originally, the choice of whether to deduplicate data at the source or the target was a decision point, but more recent products support both approaches so customers need not compromise on technology. However, simply using deduplicated storage as a backup target does not remove the complexity of configuring and supporting a data protection solution that spans independent software and hardware products. Is it really necessary that additional backup servers be installed to support business growth? Is it too much to ask for a turnkey solution that can address the needs of a large enterprise?
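Why duplicate data “need only be stored once” is easiest to see in a small sketch. The example below assumes fixed-size chunks identified by their SHA-256 digest; the `DedupStore` class and the tiny 4-byte chunk size are illustrative choices, and commercial products typically use much larger, often variable-size, chunks.

```python
import hashlib

class DedupStore:
    """Toy deduplicating store using fixed-size chunks (illustrative only)."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes; each unique chunk is kept once

    def put(self, data):
        """Store data and return its recipe: the ordered list of chunk digests."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original data from its recipe."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())

store = DedupStore(chunk_size=4)
first = store.put(b"AAAABBBBAAAA")  # 12 bytes in, but only 2 unique chunks
second = store.put(b"BBBBCCCC")     # only the CCCC chunk is new
```

After both backups the store holds 12 bytes for 20 bytes of input. Running the same chunk-and-hash step on the client before sending data is, roughly, what source-side deduplication means; running it on the backup target is target-side.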

The stuff that dreams are made of

 

PBBA

Transformation from a Backup Appliance to a Recovery Platform

Protection storage offers an end-to-end solution, integrating full-function data protection capabilities with deduplicated storage. The simplicity and efficiency of application-centric data protection, combined with the scale and performance of capacity-optimized storage systems, stand to fundamentally alter the traditional backup market. Changed data is copied directly between the source and the target, without intervening backup servers. Cloud storage may also be used as a cost-effective target. Leveraging integrated software and hardware for what each does best allows vendors to offer innovations to customers in a manner that lowers their total cost of ownership. Innovations like automatic configuration, dynamic optimization, and the use of preferred management interfaces (e.g., virtualization consoles, pod managers) build on the proven practices of the past to integrate data protection into the customer’s information infrastructure.

No one wants to be locked into products because they are too painful to switch out; it’s time that products are “sticky” because they offer compelling solutions.  IDC projects that the worldwide purpose-built backup appliance (PBBA) market will grow 16.6% from $1.7 billion in 2010 to $3.6 billion by 2015.  The industry is rapidly adopting PBBAs to overcome the data protection challenges associated with data growth.  Looking forward, storage systems will be expected to incorporate a recovery platform, supporting security and compliance obligations, and data protection solutions will become information brokers for what is stored on disk.
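The 16.6% figure in the IDC projection is most naturally read as a compound annual growth rate; that reading is an assumption here, since the sentence does not say so explicitly. A quick sanity check of the arithmetic under that assumption:

```python
def compound_growth(start, rate, years):
    """Value after growing `start` at `rate` per year for `years` years."""
    return start * (1 + rate) ** years

def cagr(start, end, years):
    """Compound annual growth rate implied by a start value and an end value."""
    return (end / start) ** (1 / years) - 1

# Figures in billions of dollars, 2010 to 2015.
projected_2015 = compound_growth(1.7, 0.166, 5)
implied_rate = cagr(1.7, 3.6, 5)
print(f"projected 2015 market: ${projected_2015:.2f}B")
print(f"implied annual growth: {implied_rate:.1%}")
```

Compounding $1.7 billion at 16.6% for five years lands a little above $3.6 billion, and the rate implied by the rounded endpoints is slightly above 16%, so the quoted numbers are consistent with an annual rate once rounding is taken into account.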

Trends in Data Protection

Data protection hasn’t changed much in a long time.  Sure, there are slews of product announcements and incessant declarations of the “next big thing”, but really, how much have market shares really changed over the past decade?  You’ve got to wonder if new technology is fundamentally improving how data is protected or is simply turning the crank to the next model year.  Are customers locked into the incremental changes proffered by traditional backup vendors or is there a better way?

Not going to take it anymore

The major force driving change in the industry has little to do with technology.  People have started to challenge the notion that they, not the computing system, should be responsible for ensuring the integrity of their data.  If they want a prior version of their data, why can’t the system simply provide it?   In essence, customers want to rely on a computing system that just works.  The Howard Beale anchorman in the movie Network personifies the anxiety that burdens customers explicitly managing backups, recoveries, and disaster recovery.  Now don’t get me wrong; it is critical to minimize risk and manage expectations.   But the focus should be on delivering data protection solutions that can simply be ignored.

Are you just happy to see me?

The personal computer user is prone to ask “how hard can it be to copy data?”  Ignoring the fact that many such users lose data on a regular basis because they have failed to protect their data at all, the IT professional is well aware of the intricacies of application consistency, the constraints of backup windows, the demands of service levels and scale, and the efficiencies demanded by affordability.    You can be sure that application users that have recovered lost or corrupted data are relieved.  Mae West, posing as a backup administrator, might have said “Is that a LUN in your pocket or are you just happy to see me?”

In the beginning

Knowing where the industry has been is a good step in knowing where the industry is going.  When the mainframe was young, application developers carried paper tape or punch cards.  Magnetic tape was used to store application data as well as a media to copy it to. Over time, as magnetic disk became affordable for primary data, the economics of magnetic tape remained compelling as a backup media.  Data protection was incorporated into the operating system through backup/recovery facilities, as well as through 3rd party products.

As microprocessors led computing mainstream, non-mainframe computing systems gained prominence and tape became relegated to secondary storage.  Native, open source, and commercial backup and recovery utilities stored backup and archive copies on tape media and leveraged its portability to implement disaster recovery plans.  Data compression increased the effective capacity of tape media and complemented its power consumption efficiency.

All quiet on the western front

Backup to tape became the dominant methodology for protecting application data due to its affordability and portability.  Tape was used as the backup media for application and server utilities, storage system tools, and backup applications.

B2T

Backup Server copies data from primary disk storage to tape media

Customers like the certainty of knowing where their backup copies are and physical tapes are comforting in this respect.  However, the sequential access nature of the media and indirect visibility into what’s on each tape led to difficulties satisfying recovery time objectives.  Like the soldier who fights battles that seem to have little overall significance, the backup administrator slogs through a routine, hoping the company’s valuable data is really protected.

B2D phase 1

Backup Server copies data to a Virtual Tape Library

Uncomfortable with problematic recovery from tape, customers have been evolving their practices to a backup to disk model.  Backup to disk and then to tape was one model designed to offset the higher cost of disk media but can increase the uncertainty of what’s on tape.  Another was to use virtual tape libraries to gain the direct access benefits of disk while minimizing changes in their current tape-based backup practices.  Both of these techniques helped improve recovery time but still required the backup administrator to acquire, use, and maintain a separate backup server to copy the data to the backup media.

Snap out of it!

Space-efficient snapshots offered an alternative data protection solution for some file servers. Rather than use separate media to store copies of data, the primary storage system itself would be used to maintain multiple versions of the data by only saving changes to the data.  As long as the storage system was intact, restoration of prior versions was rapid and easy.  Versions could also be replicated between two storage systems to protect the data should one of the file servers become inaccessible.

snapshot

Point in Time copies on disk storage are replicated to other disks

This procedure works, is fast, and is space efficient for data on these file servers but has challenges in terms of management and scale.  Snapshot based approaches manage versions of snapshots; they lack the ability to manage data protection at the file level.  This limitation arises because the customer’s data protection policies may not match the storage system policies.  Snapshot based approaches are also constrained by the scope of each storage system so scaling to protect all the data in a company (e.g., laptops) in a uniform and centralized (labor-efficient) manner is problematic at best.

CDP

Writes are captured and replicated for protection

Continuous Data Protection (both “near CDP” solutions which take frequent snapshots and “true CDP” solutions which continuously capture writes) is also being used to eliminate the backup window thereby ensuring large volumes of data can be protected.  However, the expense and maturity of CDP needs to be balanced with the value of “keeping everything”.

 

 

An offer he can’t refuse

Data deduplication fundamentally changed the affordability of using disk as a backup media.  The effective cost of storing data declined because duplicate data need only be stored once. Coupled with the ability to rapidly access individual objects, the advantages of backing up data to deduplicated storage are overwhelmingly compelling.  Originally, the choice of whether to deduplicate data at the source or target was a decision point but more recent offerings offer both approaches so customers need not compromise on technology.  However, simply using deduplicated storage as a backup target does not remove the complexity of configuring and supporting a data protection solution that spans independent software and hardware products.  Is it really necessary that additional backup servers be installed to support business growth?  Is it too much to ask for a turnkey solution that can address the needs of a large enterprise?

The stuff that dreams are made of

 

PBBA

Transformation from a Backup Appliance to a Recovery Platform

Protection storage offers an end-to-end solution, integrating full-function data protection capabilities with deduplicated storage.  The simplicity and efficiency of application-centric data protection, combined with the scale and performance of capacity-optimized storage systems, stand to fundamentally alter the traditional backup market.  Changed data is copied directly between the source and the target, without intervening backup servers.  Cloud storage may also be used as a cost-effective target.  Using integrated software and hardware for what each does best allows vendors to offer innovations to customers in a manner that lowers their total cost of ownership.  Innovations like automatic configuration, dynamic optimization, and the use of preferred management interfaces (e.g., virtualization consoles, pod managers) build on the proven practices of the past to integrate data protection into the customer’s information infrastructure.

No one wants to be locked into products because they are too painful to switch out; it’s time that products are “sticky” because they offer compelling solutions.  IDC projects that the worldwide purpose-built backup appliance (PBBA) market will grow at a compound annual rate of 16.6%, from $1.7 billion in 2010 to $3.6 billion by 2015.  The industry is rapidly adopting PBBAs to overcome the data protection challenges associated with data growth.  Looking forward, storage systems will be expected to incorporate a recovery platform that supports security and compliance obligations, and data protection solutions will become information brokers for what is stored on disk.
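
The IDC figures are consistent with reading 16.6% as a compound annual growth rate over the five years from 2010 to 2015. That reading is an assumption here, checked with a few lines of arithmetic; the function name compound_growth is illustrative.

```python
def compound_growth(start: float, annual_rate: float, years: int) -> float:
    """Value after growing `start` by `annual_rate` once per year."""
    return start * (1.0 + annual_rate) ** years


# $1.7 billion in 2010, growing 16.6% per year for five years.
projected_2015 = compound_growth(1.7, 0.166, 5)
print(f"Projected 2015 market: ${projected_2015:.2f} billion")
```

The result is roughly $3.66 billion, in line with the $3.6 billion projection.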

CSI Quarterly Update Q3 2011

Upcoming Activities

Get Involved Now!

A limited number of these activities are open to all; join SNIA and the CSI to participate in any of them.

July Cloud Plugfest

The purpose of the Cloud Plugfest is for vendors to bring their implementations of CDMI and OCCI to test, identify, and fix bugs in a collaborative setting with the goal of providing a forum in which companies can develop interoperable products.

The Cloud Plugfest starts on Tuesday, July 12 and runs through Thursday, July 14, 2011 at the SNIA Technology Center in Colorado Springs, CO.  The SNIA Cloud Storage Initiative (CSI) is underwriting the costs of the event, so there is no participation fee.

SNIA Cloud Burst Event

There are a multitude of events dedicated to cloud computing, but where can you go to find out specifically about cloud storage? The 2011 SNIA Cloud Burst Summit educates and offers insight into this fast-growing market segment. Come hear from industry luminaries, see live demonstrations, and talk to technology vendors about how to get started with cloud storage.

Cloud Lab Plugfest at SDC

Plugfests have always been an important part of the Storage Developer Conference (SDC), and this year will see the first Cloud Lab Plugfest, an event held over multiple days to test the interoperability of CDMI, OVF, and OCCI implementations.

To get involved, please contact: arnold@snia.org

Cloud Pavilion at SNW

At every SNW, one of the highlights is the Cloud Pavilion, where attendees can see public and private cloud offerings and discuss solutions. Space is limited, so get involved early to ensure your spot.

To get involved, please contact: lisa.mercurio@snia.org

Beyond Potatoes – Migrating from NFSv3

“It is a mistake to think you can solve any major problems just with potatoes.”
Douglas Adams (1952-2001, English humorist, writer and dramatist)

While there have been many advances and improvements to NFS over the last decade, some IT organizations have elected to continue with NFSv3 – like potatoes, it’s the staple filesystem protocol that just about any UNIX administrator understands.

Although NFSv3 is adequate for many purposes and is a familiar, well-understood protocol, continuing to deploy it has become increasingly difficult to justify in a modern datacenter. For example, NFSv3 makes promiscuous use of ports, which for a variety of security reasons makes it unsuitable for use over a wide area network (WAN). In addition, increased server and client bandwidth demands and the improved functionality of Network Attached Storage (NAS) arrays have outstripped NFSv3’s ability to deliver high throughput.

NFSv4 and the minor versions that follow it are designed to address many of the issues that NFSv3 poses. NFSv4 also includes features intended to enable its use in global WANs and to improve the performance and resilience of NAS:

  • Firewall-friendly single port operations
  • Internationalization support
  • Replication and migration facilities
  • Mandatory use of strong RPC security flavors that depend on cryptography, with support of access control that is compatible with both UNIX® and Windows®
  • Use of character strings instead of integers to represent user and group identifiers
  • Advanced and aggressive cache management features with delegations
  • Trunking (with NFSv4.1, which also introduces pNFS, or parallel NFS)

In April 2003, the Network File System (NFS) version 4 protocol was ratified as an Internet standard, described in RFC 3530, which superseded NFS version 3 (NFSv3, specified in RFC 1813). Since the ratification of NFSv4, further advances have been made to the standard, notably NFSv4.1 (described in RFC 5661 and ratified in January 2010), which added several new features such as parallel NFS (pNFS). Further work on NFSv4.2 is currently underway in the IETF.

Delegations with NFSv4

In NFSv3, clients have to function as if there is contention for the files they have opened, even though this is often not the case. As a result of this conservative approach to file locking, the client frequently sends unneeded requests to the server to find out whether an open file has been modified by some other client. Even worse, all write I/O in this scenario is required to be synchronous, further impacting client-side performance.

NFSv4 differs by allowing the server to delegate specific actions on a file to the client; this enables more aggressive client caching of both the data and the locks. A server temporarily cedes control of file updates and the locking state to a client via a delegation, and promises to notify the client if other clients access the file. Once the client holds a delegation, it can perform operations on files whose data has been cached locally, thereby avoiding network latency and optimizing its use of I/O.
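
The delegation idea can be sketched as a toy model. This is not the NFSv4 wire protocol: a real server grants delegations at its own discretion, distinguishes read and write delegations, and recalls them through a callback RPC. The Server and Client classes and their methods below are hypothetical and exist only to show the pattern of caching writes locally until the server recalls the delegation.

```python
from typing import Dict, Set


class Server:
    """Toy server that hands out at most one delegation per file."""

    def __init__(self) -> None:
        self.data: Dict[str, bytes] = {}
        self.delegations: Dict[str, "Client"] = {}  # path -> current holder

    def open(self, client: "Client", path: str) -> bool:
        """Open a file; return True if the client is granted a delegation."""
        holder = self.delegations.get(path)
        if holder is not None and holder is not client:
            # Another client wants the file: recall the delegation first.
            holder.recall(path)
            del self.delegations[path]
            return False
        self.delegations[path] = client
        return True

    def write(self, path: str, data: bytes) -> None:
        self.data[path] = data


class Client:
    def __init__(self, name: str, server: Server) -> None:
        self.name = name
        self.server = server
        self.cache: Dict[str, bytes] = {}  # writes not yet sent to the server
        self.delegated: Set[str] = set()

    def open(self, path: str) -> None:
        if self.server.open(self, path):
            self.delegated.add(path)

    def write(self, path: str, data: bytes) -> None:
        if path in self.delegated:
            self.cache[path] = data        # cached locally, no round trip
        else:
            self.server.write(path, data)  # no delegation: write through

    def recall(self, path: str) -> None:
        """Server callback: flush cached data and give up the delegation."""
        if path in self.cache:
            self.server.write(path, self.cache.pop(path))
        self.delegated.discard(path)


server = Server()
first, second = Client("first", server), Client("second", server)
first.open("/report")
first.write("/report", b"draft")   # stays in the first client's cache
second.open("/report")             # triggers recall; the draft is flushed
print(server.data["/report"])
```

While only one client has the file open, its writes never touch the network; the flush happens only when a second client shows up.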

Trunking with pNFS

Many additional enhancements to NFSv4 are available with NFSv4.1, of which pNFS is a part. NFSv4.1 adds the capability to perform trunking at the NFS level by adding a session layer. The client establishes a session with an NFSv4.1 server and can then create multiple TCP connections to that server, each potentially going over a different network interface on the client and arriving on a different interface on the server. Different requests sent under the same session identifier can then travel over different network paths, which can increase bandwidth and reduce latency.

Although client and server implementations of NFSv4.1 are available, they are in the early stages of implementation and adoption. To take advantage of them in the future, however, it is important to plan now for the move to NFSv4 and beyond, and many servers and clients that support NFSv4 are available today. NFSv4 is a mature and stable protocol with many advantages in its own right over its predecessors, NFSv3 and NFSv2.
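
The trunking arrangement described above, one session spread over several connections, can be sketched as follows. The round-robin policy is an assumption made for illustration (a real client chooses its own scheduling), and the Session class and the interface names are invented.

```python
from itertools import cycle
from typing import Iterator, List, Tuple

Path = Tuple[str, str]  # (client interface, server interface)


class Session:
    """Toy session whose requests are spread over several network paths."""

    def __init__(self, session_id: str, paths: List[Path]) -> None:
        self.session_id = session_id
        self._next_path: Iterator[Path] = cycle(paths)

    def send(self, request: str) -> Tuple[str, str, str, str]:
        """Assign the request to the next connection in round-robin order."""
        client_if, server_if = next(self._next_path)
        return (self.session_id, client_if, server_if, request)


session = Session("session-1", [("eth0", "nas-a"), ("eth1", "nas-b")])
for operation in ("READ", "WRITE", "GETATTR"):
    print(session.send(operation))
```

Every request carries the same session identifier, but successive requests leave through different interfaces, which is what lets the aggregate bandwidth exceed that of a single link.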

Potatoes and Beyond

Now is the time to make the switchover; there really is no justification for not making NFSv4 the first-choice NFS protocol version. Although migrating from earlier versions of NFS requires some planning, as there are significant differences between the protocols, the benefits are impressive. To ensure a smooth migration to NFSv4 and beyond, the SNIA Ethernet Storage Forum NFS Special Interest Group has recently published an overview white paper, “Migrating to NFSv4”. It covers internationalization support, automatic mounting of NFSv4 filesystems on demand, and TCP protocol support, among other considerations.

NFSv4 and NFSv4.1 have been developed for a reason, and NFSv4.2 is on the horizon. Like the potato, NFSv3 is a staple of the network filesystem world. But as Douglas Adams said, “It is a mistake to think you can solve any major problems just with potatoes.” NFSv4 fixes many of NFSv3’s deficiencies and represents a major advance that brings improved availability, performance, and security: all the checklist items beyond potatoes that today’s users of network attached storage demand.
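
A practical first step in planning a migration is to find out which clients still mount filesystems over NFSv3. The sketch below assumes a Linux-style mount table (the layout of /proc/mounts, where the third field is the filesystem type and the fourth holds options such as vers=3); the sample text, host names, and function names are invented for illustration.

```python
from typing import List, NamedTuple


class NfsMount(NamedTuple):
    source: str
    mount_point: str
    version: str


def find_nfs_mounts(mount_table: str) -> List[NfsMount]:
    """Extract NFS mounts and their protocol versions from mount-table text."""
    mounts: List[NfsMount] = []
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        source, mount_point, fs_type, options = fields[:4]
        # Fallback when no version option is present: infer from the type.
        version = "4" if fs_type == "nfs4" else "unknown"
        for option in options.split(","):
            if option.startswith(("vers=", "nfsvers=")):
                version = option.split("=", 1)[1]
        mounts.append(NfsMount(source, mount_point, version))
    return mounts


def needs_migration(mounts: List[NfsMount]) -> List[NfsMount]:
    """Return the mounts still using a protocol version earlier than 4."""
    return [m for m in mounts if m.version in ("2", "3")]


sample = """\
filer1:/export/home /home nfs rw,vers=3,proto=tcp 0 0
filer2:/export/data /data nfs4 rw,vers=4.0,proto=tcp 0 0
/dev/sda1 / ext4 rw 0 0
"""

for mount in needs_migration(find_nfs_mounts(sample)):
    print(f"{mount.mount_point} still uses NFSv{mount.version}")
```

On a real client the same functions could be fed the contents of the mount table instead of the sample string.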

Get your hands on a Storage Cloud

Building your own standards-based private storage cloud.

Tuesday, May 24th, 1-5pm

Omni Interlocken Hotel,

Broomfield, CO

This year at Gluecon, SNIA will be conducting a hands-on lab workshop for developers.

This session will take you deeper into cloud storage than you likely have ever been. First we will explore the standard cloud storage interface called CDMI (Cloud Data Management Interface), including some of the rationale and design tradeoffs in its creation.

Learn how to use the RESTful interface to move data into and out of a storage cloud using a common interface. Learn how CDMI enables data portability between clouds. Dig deep into features such as Data System Metadata (how you order services from the cloud), cloud-side operations, queues, query and more.
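
To give a flavor of the RESTful interface, here is a minimal sketch that builds, but does not send, a CDMI request to create a data object. The endpoint cloud.example.com, the container and object names, and the function name build_cdmi_put are placeholders, and the specification version in the header is an assumption; use whichever version your cloud supports.

```python
import json
import urllib.request

CDMI_VERSION = "1.0.1"  # assumption: match the version your cloud supports


def build_cdmi_put(base_url: str, path: str, value: str) -> urllib.request.Request:
    """Build (but do not send) a CDMI request that creates a data object."""
    body = json.dumps({"mimetype": "text/plain", "metadata": {}, "value": value})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/" + path.lstrip("/"),
        data=body.encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/cdmi-object",
            "Accept": "application/cdmi-object",
            "X-CDMI-Specification-Version": CDMI_VERSION,
        },
    )


request = build_cdmi_put(
    "http://cloud.example.com", "/MyContainer/hello.txt", "Hello, cloud"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it to a real CDMI server. Because the object travels as JSON with standard HTTP verbs, the same request shape applies to any cloud that implements the interface.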

Then stick around as we load an open source Java implementation of CDMI onto your laptop to create your own private cloud. Explore the workings of the JAX-RS standard used in this implementation and the storage code working behind the scenes. Advanced users can even implement their own cloud storage features and expose them through the standard interface.