Cloud Storage Development Challenges – An SDC Preview

This year’s Storage Developer Conference (SDC) is expected to draw over 400 storage developers and professionals. On August 4th, you can get a sneak preview of key cloud topics that will be covered at SDC in this live Webcast, where David Slik and Mark Carlson, Co-Chairs of the SNIA Cloud Technical Work Group, together with Yong Chen, Assistant Professor at Texas Tech University, will discuss:

  • Mobile and Secure – Cloud Encrypted Objects using CDMI
  • Object Drives: A new Architectural Partitioning
  • Unistore: A Unified Storage Architecture for Cloud Computing
  • Using CDMI to Manage Swift, S3, and Ceph Object Repositories

You’ll learn how encrypted objects can be stored, retrieved, and transferred between clouds; how Object Drives allow storage to scale up and down in single-drive increments; and end-user and vendor use cases of the Cloud Data Management Interface (CDMI). We’ll also introduce Unistore, an innovative unified storage architecture that efficiently integrates heterogeneous HDD and SCM devices for Cloud storage systems.

I’ll be moderating the discussion among this expert panel. It should be an enlightening and lively hour. I hope you’ll register now to join us.

New Webcast: Storage Performance Benchmarking 101

There’s an art to making sense of storage performance benchmarks. That’s why ESF is hosting a live Webcast on this important topic. Please join us on July 30th for “Storage Performance Benchmarking: Introduction and Fundamentals.” At this Webcast, you’ll gain an understanding of the complexities of benchmarking modern storage arrays and learn the terminology foundations necessary for the rest of the series as Mark Rogov, Advisory Systems Engineer at EMC; Ken Cantrell, Performance Engineering Manager at NetApp; and I discuss:

  • The different kinds of performance benchmarking engagements
  • An introduction to the variety of relevant metrics
  • How to determine the “right” metrics for your business
  • Terminology basics: IOPS, op/s, throughput, bandwidth, latency/response time (see the quick worked example below)
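
To make those terms concrete, here is a quick back-of-the-napkin illustration in Python. The numbers are invented purely for illustration; the relationships (throughput = IOPS x I/O size, and Little’s Law tying average latency to concurrency and completion rate) are the standard ones.

    # Hypothetical numbers, purely to illustrate how the terms relate.
    io_size_bytes = 8 * 1024        # 8 KiB per I/O operation
    iops = 50000                    # I/O operations completed per second
    outstanding_ios = 16            # average number of I/Os in flight (queue depth)

    # Throughput is simply the operation rate times the operation size.
    throughput_mib_s = iops * io_size_bytes / (1024.0 * 1024.0)

    # Little's Law: average latency = concurrency / completion rate.
    avg_latency_ms = outstanding_ios / float(iops) * 1000.0

    print("Throughput: %.1f MiB/s" % throughput_mib_s)   # ~390.6 MiB/s
    print("Average latency: %.2f ms" % avg_latency_ms)   # 0.32 ms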

If you’re untrained in the storage performance arts, this Webcast will bring you up to speed on the basics. My colleagues and I hope it will be an informative and interactive hour. Please register today and bring your questions. I hope to see you there.

Ethernet Roadmap for Networked Storage Q&A

Almost 200 people attended our joint Webcast with the Ethernet Alliance: “The 2015 Ethernet Roadmap for Networked Storage.” We had a lot of great questions during the live event, but we did not have time to answer them all. As promised, we’ve compiled answers to all of the questions that came in. If you think of additional questions, please feel free to comment on this blog.

Q. What did you mean by parity of flash with HDD?

A. We were referring to the O’Reilly article in “Network Computing.”  O’Reilly is predicting parity in BOTH capacity and price in 2016.

Q. When do we expect IEEE standards ratification for 25G speed?

A. 2016.  You can see the exact schedule here.

Q. Do you envision the Enterprise, Cloud Providers, HPC, Financials getting rid of their 10/40GbE infrastructure and replacing that with 25/100GbE infrastructure in 2017? Will these customers deploy 100GbE/25GbE switch in the leaf layer in 2017?

A. Deployment will occur over a multi-year time span, if only because switch infrastructure is expensive to upgrade, as reflected in the Crehan Research forecast. New deployments will likely move to 25/100GbE as new switches with 100GbE downstream ports become available in 2016. Because the Cloud Service Providers are currently the most aggressive in driving new infrastructure purchases, they will represent the largest early volumes for 25/100GbE. Enterprise is still in the midst of the transition from 1GbE to 10GbE.

Q. What are some of the developments on spanning-tree derivatives vs. Dijkstra-based derivatives such as OSPF and FSPF for switches?

A. Beyond the scope of this presentation on Ethernet. Ethernet is defined by the IEEE for L1 and L2 in the ISO model. Your questions are at L3 and L4, which are handled by organizations like the IETF.

Q. With all the speeds possible who is working on flow control?

A. Flow control at the 802.1 level is supported in the Layer 1/2 PHY and MAC by setting upper bounds on the delay through each layer, which allows higher layers to account for the delays and response times of pause frames. Each new speed and PHY in 802.3 is accompanied by delay-constraint specifications to support this.

Q.  Do you have an overlay graphic that shows the Ethernet RDMA roadmap?  If so, is Ethernet storage the primary driver for that technology?

A. Beyond the scope of this presentation on Ethernet. Ethernet is defined by the IEEE for L1 and L2 in the ISO model. Your questions are at L3 and L4, which are handled by organizations like the IETF and the InfiniBand Trade Association.

Q. The adoption of faster and new Ethernet always has to do with the costs of acquiring new technology. How long do you think it will take to adopt/acquire faster Ethernet in datacenters now that the development is happening much faster than the last 20 years?

A. Please see the chart on slide 7 where Crehan Research predicts how fast the technology will diffuse into deployments.

Q. What do you expect as cost comparison between Ethernet and InfiniBand going forward?
Also, what work is being done to reduce latency?

A. Beyond the scope of this presentation.  Latency is primarily a consequence of design methodologies and semiconductor process technology, and thus under the control of the silicon device manufacturers.  Some vendors prioritize latency more than others.

Q. What’s the technical limitation as speeds go higher and higher?

A. A number of factors limit speeds going faster and faster, but the main problem is that materials attenuate signals as they travel at higher frequencies.

Q. Will 1GbE used for manageability purposes disappear from public cloud? If so, what is the expected time frame?

A. This is a choice for end users.  Most equipment is managed on a separate network for security concerns, but users can eliminate these management networks at any time.

Q. What are the relative market size predictions for the expanding number of standards (25G, 50G, 100G, 200G, etc.)?

A. See the Crehan Research forecast in the presentation.

Q. What is the major difference between SMF & MMF for the not so initiated?

A. SMF has a 9um core while MMF has a 50um core. Different lasers are used for each fiber type; at 10GbE and above, MMF typically reaches 100 meters, while SMF reaches from 500m to 10km.

Q. Will 25G be available through both copper and fibre connectivity?

A. Yes. IEEE 802.3 work is currently underway to specify 25Gb/s on twinax (“direct attach copper”) to 5 meters, printed circuit backplane up to ~1m, twisted-pair copper to 30m, and multimode fiber to 100m. There is no technology barrier to 25G on SMF; it is just that a standards project to specify it has not started yet.

Q. This is interesting from a hardware viewpoint, but has nothing to do with storage yet.  Are we going to get to how this relates to storage other than saying flash drives are fast and only Ethernet can keep up?

A. Beyond the scope of this presentation on Ethernet.  Ethernet is defined by the IEEE for L1 and L2 in the ISO model.  Your questions are directed at the higher layers.  The key point of this webcast is that storage networking engineers need to pay much more attention to the Ethernet roadmap than they have historically, primarily because of NVM.

Q. How does “SFP28” fit in this mix? Is it required for 25G?

A. SFP28 connectors and modules are required for 25GbE because they give better performance than SFP+, which only supports speeds up to 10GbE.

Q. Can you provide the quick difference between copper & optical on speed & costs?

A. Copper and optical Ethernet links are usually standardized at the same speed. 400GbE is not defining a copper link, but an active Direct Attach Cable (DAC) will probably support 400GbE. Cost depends on volume and many other factors and is beyond the scope of this presentation; copper is usually a fraction of the cost of optical links.

Q. Do you think people will try to use multiple CAT 5e to get more aggregate bandwidth to the access points to avoid having to run Fibre to them?

A. IEEE is defining 2.5GBASE-T and 5GBASE-T to enable Cat5e to support faster wireless access points.

Q. When are higher speeds and PoE going to reach the point when copper based Ethernet will become a viable heat source for buildings thus helping the environment?

A. :)  IEEE is defining 4 wire PoE to deliver at least 60W to end devices.  You can find out more here.

Q. What are the use cases for 2.5Gb and 5.0Gb Base-T?

A. The leading use case for 2.5G/5GBASE-T is to provide the uplink for wireless LAN access points that support 802.11ac and future wireless technology.  Wireless LAN technology has advanced to the point where >1Gb/s BW is needed upstream from the AP, and 2.5G/5G provide a higher speed uplink while preserving the user’s investment in Cat5e/Cat6 cabling.

Q. Why not have only CFP2 sockets right away with things disabled for lower speeds for all the intervening years leading to full-fledged CFP2?

A. CFP2 is defined for 100GbE, and 8 ports can be used on a 1U switch. 100GbE switches are shifting to QSFP28 so that 32 ports of 100GbE are supported in a 1U switch at low cost. The CFP2 is much more expensive than QSFP28 and will not be used for lower speeds because of the high cost.

Data Recovery and Selective Erasure of Solid State Storage a New Focus at SNIA

The rise of solid state storage has been incredibly beneficial to users in a variety of industries. Solid state technology presents a more reliable and efficient alternative to traditional storage devices. However, these benefits have not come without unforeseen drawbacks in other areas. For those in the data recovery and data erase industries, for example, solid state storage has presented challenges. The obstacles to data recovery and selective erasure capabilities are not only a problem for those in these industries, but they can also make end users more hesitant to adopt solid state storage technology.

Recently a new Data Recovery and Erase Special Interest Group (SIG) has been formed within the Solid State Storage Initiative (SSSI) within the Storage Networking Industry Association (SNIA). SNIA’s mission is to “lead the storage industry worldwide in developing and promoting standards, technologies and educational services to empower organizations in the management of information.” This fantastic organization has given the Data Recovery and Erase SIG a solid platform on which to build the initiative.

The new group has held a number of introductory open meetings for SNIA members and non-members to promote the group and develop the group’s charter. For its initial meetings, the group sought to recruit both SNIA members and non-members that were key stakeholders in fields related to the SIG. This includes data recovery providers, erase solution providers and solid state storage device manufacturers. Aside from these groups, members of leading standards bodies and major solid state storage device consumers were also included in the group’s initial formation.

The group’s main purpose is to be an open forum of discussion among all key stakeholders. In the past, there have been few opportunities for representatives from different industries to work together, and collaboration had often been on an individual basis rather than as a group. With the formation of this group, members intend to cooperate between industries on a collective basis in order to foster a more constructive dialogue incorporating the opinions and feedback of multiple parties.

During the initial meetings of the Data Recovery and Erase SIG, members agreed on a charter to outline the group’s purpose and goals. The main objective is to foster collaboration among all parties to ensure consumer demands for data recovery and erase services on solid state storage technology can be met in a cost-effective, timely and fully successful manner.

In order to achieve this goal, the group has laid out six necessary steps involving all relevant stakeholders:

  1. Build the business case to support the need for effective data recovery and erase capabilities on solid state technology by using use cases and real examples from end users with these needs.
  2. Create a feedback loop allowing data recovery providers to provide failure information to manufacturers in order to improve product design.
  3. Foster cooperation between solid state manufacturers and data recovery and erase providers to determine what information is necessary to improve capabilities.
  4. Protect sensitive intellectual property shared between data recovery and erase providers and solid state storage manufacturers.
  5. Work with standards bodies to ensure future revisions of their specifications account for capabilities necessary to enable data recovery and erase functionality on solid state storage.
  6. Collaborate with solid state storage manufacturers to incorporate capabilities needed to perform data recovery and erase in product design for future device models.

The success of this special interest group depends not only on the hard work of the current members, but also on a diverse membership base of representatives from different industries. We will be at Flash Memory Summit in booth 820 to meet you in person! Or you can visit our website at www.snia.org/forums/sssi for more information on this new initiative and all solid state storage happenings at SNIA. If you’re a SNIA member and you’d like to learn more about the Data Recovery/Erase SIG or you think you’d be a good fit for membership, we’d love to speak with you. Not a SNIA member yet? Email marty.foltyn@snia.org for details on joining.

Save the Date: SNIA Emerald Training Webinar July 20-21

The SNIA Green TWG (Technical Work Group) and Green Storage Initiative (GSI) will deliver 6 hours of training through a webinar format on July 20th and July 21st. Each day will feature a 3 hour segment from 1PM–4PM Pacific time; July 20th will cover the technical details of the SNIA Emerald v2.1 Specification and supporting test tools; and July 21st will cover the program details for SNIA Emerald Programs, USA EPA EnergyStar Program, and supplementary sessions by testing services and certifying body services.

Highlights of the changes in the V2.1 Specification include:

  • Corrections and clarity to the V2.0.2 specification
  • Required use of Vdbench and a SNIA provided Vdbench test script
  • Improvements in COM testing
  • Changes in pre–fill criteria percentages
  • Change in metric stability criterion
  • Online and near–online categories are combined

To RSVP for the July 20–21 webinar training, please send an email no later than July 16th with your name, title, email address, company name to emerald-training@snia.org. There is no fee to register and attend the training. After the training sessions, training materials will be posted to www.sniaemerald.com. A detailed training agenda along with the webinar details will be emailed to those who have registered.

The Life of a Storage Packet

Keeping storage as close to the application as possible and reasonable is important, but different types of storage can make a big difference for performance, as can different types of workloads. Starting with the basics and working up to more complexity, find out how storage really works in this first “Packet Walk” installment of the “Napkin Dialogues” series. Warning: You’re on your own when tipping the pizza delivery person!

Download (PDF, 1.71MB)

Block Storage in OpenStack Q&A

The team at SNIA-ESF and I were very pleased with how many people attended our live Webcast, “Block Storage in the Open Source Cloud called OpenStack.” If you missed it, please check it out on demand. We had several great questions during the live event. As promised, here are answers to all of them. If you have additional questions, please feel free to comment on this blog.

Q. What support is available for OpenStack if we hit a roadblock or need new features?

A. The OpenStack community has many avenues for contacting developers for support. The official place to report issues, file bugs or ask for new features is Launchpad: https://launchpad.net/openstack. It is the central place for all of the many OpenStack projects to file bugs or feature requests. This is also the location where every OpenStack project tracks its current release cycle and all of its features, called blueprints. The public mailing lists are another good source of information; a good place to start is here: https://wiki.openstack.org/wiki/Mailing_Lists. Finally, developers are also on the public Internet Relay Chat channels associated with their projects. The developers are live and interactive on each of the channels. You can find information about the IRC system that OpenStack developers use here: https://wiki.openstack.org/wiki/IRC.

Q. Why was Python chosen as the programming language? Which version of Python is used as there are incompatibilities between versions?

A. The short answer here is that Python is a mature language that is great for rapid development and deployment, with a wide variety of publicly available libraries for doing work. The current released version of OpenStack uses Python 2.7. The OpenStack community is making efforts to ensure that we can eventually migrate to Python 3.x, and new libraries that are being developed have to be Python 3.x compatible.
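
As a rough illustration of what “Python 3.x compatible” means in practice, here is a tiny, generic snippet (not actual OpenStack code) in the style commonly used at the time, relying on __future__ imports and the six compatibility library:

    # Generic illustration (not OpenStack code) of Python 2/3-compatible style.
    from __future__ import print_function, division

    import six  # compatibility helpers widely used by OpenStack projects

    def describe_volumes(volumes):
        # six.iteritems behaves the same on Python 2 and Python 3 dictionaries
        for name, size_gb in six.iteritems(volumes):
            print("volume %s: %d GiB" % (name, size_gb))

    describe_volumes({"vol-a": 10, "vol-b": 20})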

Q. Is it possible to replicate the backed up volumes at the OpenStack layer or do you defer to the back end array for data replication?

A. Currently, there is no built-in support for volume replication in Cinder. The Cinder community is actively working on how to implement volume replication in the next release, Liberty, which will ship in the fall of 2015. As with any major new feature in Cinder, the community has to design the core of the new feature such that it works consistently across the 40+ vendor arrays. As the array support grows, the amount of up-front design becomes more important and more difficult at the same time. We have a specification that we are currently working on that will get us closer to implementing replication.

Q. Who, or what, creates the FC zones?

A. In Cinder, the block storage project, the component that creates and manages Fibre Channel zones is called the Fibre Channel Zone Manager. A good document to read up on the zone manager is here: http://www.brocade.com/downloads/documents/at_a_glance/fc-zone-manager-ag.pdf. The official OpenStack documentation on the zone manager is here: http://docs.openstack.org/kilo/config-reference/content/section_fc-zoning.html. The zone manager is automatically called after a Cinder Fibre Channel volume driver exports its volume from the array. The zone manager then adds the zones requested by the driver to make the volume available to the virtual machine.

Q. Does the Cinder and Nova attachment process work over VLANs?

A. Yes. It’s entirely dependent on how the OpenStack admin deploys the Nova and Cinder services. As long as the Nova hosts can see the Cinder services and arrays behind the Cinder volume drivers, then it should just work.

Q. Is the FCZM a native component of the Cinder project? Or is it an add-on?

A. As I mentioned earlier, the Fibre Channel Zone Manager is part of the Cinder project. There have been some discussions within the Cinder community about possibly breaking the zone manager out into its own Python library, in which case it would be available to any Python project. Currently, it’s built into Cinder itself.

Q. Does Cinder involve itself in the I/O path as well or is it only the control path responsible for allocating storage?

A. Cinder is almost entirely a control-plane provisioning mechanism. There are a few operations where the Cinder services actually do I/O. When a user wants to create an image from a volume, Cinder attaches the volume to itself and then copies the bytes from the volume into an image. Cinder also has a backup service that allows a user to back up a volume to an external service; in that case, the Cinder backup service copies the bytes into the configured backup storage. When Cinder attaches a volume to a Nova VM or a bare metal node, Cinder is not involved in any I/O. Cinder’s job is simply to ensure that the volume is exported from the back-end array and made visible to Nova. After that, it’s entirely up to the transport protocol (iSCSI, FC, NFS, etc.) to do the I/O for the volume.
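
To make the control-plane/data-plane split concrete, here is a toy, self-contained sketch of the attach flow. Every class and method below is a stand-in invented for illustration; it is not the real Cinder, Nova, or os-brick API.

    # Toy model of the attach flow; all names below are invented stand-ins.

    class FakeCinder(object):
        """Control plane only: exports volumes and does bookkeeping, never moves data."""

        def initialize_connection(self, volume_id, connector):
            # The real service asks the backend driver to export the volume and
            # returns transport details (e.g. an iSCSI target and LUN).
            return {"driver_volume_type": "iscsi",
                    "target_iqn": "iqn.2015-06.example:%s" % volume_id,
                    "target_lun": 0}

        def attach(self, volume_id, instance_uuid, mountpoint):
            print("cinder: recorded %s attached to %s at %s"
                  % (volume_id, instance_uuid, mountpoint))

    class FakeComputeHost(object):
        """Data plane: the compute host logs in to the target and does all the I/O."""

        def connect_volume(self, connection_info):
            print("compute: iSCSI login to %s" % connection_info["target_iqn"])
            return "/dev/sdb"  # the SCSI device that appears on the host

    def attach_volume(cinder, host, volume_id, instance_uuid):
        info = cinder.initialize_connection(volume_id, connector={"host": "compute1"})
        device = host.connect_volume(info)               # data path is set up here
        cinder.attach(volume_id, instance_uuid, device)  # control-plane bookkeeping only
        return device

    attach_volume(FakeCinder(), FakeComputeHost(), "vol-1234", "instance-abcd")

After that last step, reads and writes flow directly between the hypervisor and the array over the chosen transport; Cinder is not in the data path.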

Q. Is Nova aware of the LUN usage %?

A. Nova doesn’t track statistics against the volumes that it has attached to its virtual machines.

Q. Where do the vendor specific parts of Cinder fit in? Are there vendor specific “volume managers”?

A. The vendor-specific components of Cinder exist in what are called Cinder volume drivers. Those drivers are really nothing more than a Python module that conforms to a volume driver API defined by the Cinder volume manager. You can get an idea of the features that the drivers can support from the Cinder Support Matrix here:

https://wiki.openstack.org/wiki/CinderSupportMatrix
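
To give a feel for what such a driver looks like, here is a rough skeleton. The base class and method names reflect the general shape of Kilo-era Cinder drivers, but treat them as illustrative; exact signatures and required features vary by release, so consult the Cinder developer documentation for your target version. The _array_request helper is a made-up placeholder for vendor-specific communication.

    # Rough, illustrative skeleton of a Cinder volume driver (not a real driver).
    from cinder.volume import driver

    class ExampleISCSIDriver(driver.ISCSIDriver):
        """Each method translates a Cinder request into whatever the vendor's
        array management interface expects (REST, SSH, SMI-S, ...)."""

        def create_volume(self, volume):
            # Ask the array to carve out a LUN of volume['size'] GiB.
            self._array_request("create_lun", name=volume["name"], size=volume["size"])

        def delete_volume(self, volume):
            self._array_request("delete_lun", name=volume["name"])

        def initialize_connection(self, volume, connector):
            # Export the LUN to the requesting host and describe the transport.
            target = self._array_request("export_lun", name=volume["name"],
                                         initiator=connector["initiator"])
            return {"driver_volume_type": "iscsi", "data": target}

        def terminate_connection(self, volume, connector, **kwargs):
            self._array_request("unexport_lun", name=volume["name"],
                                initiator=connector["initiator"])

        def _array_request(self, operation, **params):
            # Placeholder: vendor-specific communication with the array goes here.
            raise NotImplementedError("talk to your array here")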

Q. If Cinder is only for control plane, which project in OpenStack is for data path?

A. There isn’t a project in OpenStack that manages the data path for volumes.

Q. Is there a volume detachment process as well and when does that come into play?

A. My presentation primarily focused on one aspect of the interaction between Nova and Cinder, which was volume attachment. I briefly discussed the volume detachment process; it is conducted in basically the same way. An end user asks Nova to detach the volume. Nova then removes the volume from the VM, removes the SCSI device from the compute host itself, and then tells Cinder to terminate the connection from the array to the compute host.

Q. If a virtual machine is moved to a different physical machine, how’s that handled in Cinder?

A. This process in OpenStack is called live migration. Nova does all of the work of moving the VM’s data from one host to another. One facet of that is migrating any Cinder volume that may be attached to the VM. Nova understands which volumes are attached to the VM and knows which of those volumes are Cinder volumes. When the VM is migrated, Nova coordinates with Cinder to ensure that all volumes are attached to the destination host and VM, as well as to ensure that the volumes are detached from the originating compute host.

Q. Why doesn’t Cinder use the SNIA SMI-S API to manage/consume SAN, NAS or switch fabric instead of each storage vendor building Cinder drivers? SMI-S already covers all of the Cinder scenarios for FC, iSCSI, SAS, etc.

A. Cinder itself doesn’t really manage the storage array communication. It’s entirely up to the individual vendor drivers to decide how best to communicate with their storage arrays. The HP 3PAR volume driver uses REST to communicate with the array, as do several other vendor drivers in Cinder. Other drivers use SSH. There are no strict rules on how a Cinder volume driver can choose to communicate with its back-end. This allows vendors to make the best use of their array interfaces as they see fit.
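
As an illustration of that freedom, here are two hypothetical helpers a driver might use to create a LUN, one over REST and one over SSH. The URLs, CLI commands and credentials are entirely made up; a real driver uses whatever management interface its array actually provides.

    # Hypothetical back-end helpers; endpoints and commands are invented.
    import requests   # REST-style management interface
    import paramiko   # SSH-style management interface

    def create_lun_via_rest(array_ip, name, size_gb, token):
        resp = requests.post("https://%s/api/luns" % array_ip,
                             headers={"X-Auth-Token": token},
                             json={"name": name, "size_gb": size_gb},
                             verify=False, timeout=30)
        resp.raise_for_status()
        return resp.json()

    def create_lun_via_ssh(array_ip, name, size_gb, username, password):
        ssh = paramiko.SSHClient()
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        ssh.connect(array_ip, username=username, password=password)
        try:
            _, stdout, _ = ssh.exec_command("lun create -name %s -size %dg" % (name, size_gb))
            return stdout.read()
        finally:
            ssh.close()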

Q. Are there Horizon extensions or extension points for showing what physical resources your storage is coming from? Or is that something a storage vendor would need to implement?

A. Horizon doesn’t really know much about where storage is coming from other than it’s a Cinder volume. Horizon uses the available Cinder APIs to talk to Cinder to do work and fetch information about Cinder’s resources. I know of a few vendors that are writing Horizon plugins that add extra capabilities to view more detailed information about their specific array. As of today though, there is no API in Cinder to describe the internals of a volume on the vendor’s array.

Upcoming Webcast: Hybrid Clouds Part 2

On June 10, 2015, SNIACloud will be hosting a live Webcast “Hybrid Clouds Part 2: A Case Study on Building a Bridge between Public and Private Clouds.” There are significant differences in how cloud services are delivered to various categories of users. The integration of these services with traditional IT operations will remain an important success factor but also a challenge for IT managers. The key to success is to build a bridge between private and public clouds. I’ll be back to expand upon our earlier SNIA Hybrid Clouds Webcast where we looked at the choices and strategies for picking a cloud provider for public and hybrid solutions. Please join me on June 10th to hear:

  • Best practices to work with multiple public cloud providers
  • The role of SDS in supporting a hybrid data fabric
  • Hybrid cloud decision criteria
  • Key implementation principles
  • Real-world hybrid cloud use case

Please Register now and bring your questions. This will be a live and interactive event. I hope to see you there.

SSD Data Retention Issue Debunked

For the past few weeks, there has been quite a commotion on tech sites, and it was all because of a five-year-old presentation. Someone discovered a set of slides created in 2010 by the then-chair of the JEDEC SSD committee, Alvin Cox of Seagate, and misinterpreted one of the slides to mean that unpowered SSDs would retain data for only a few days in a hot room. It was reported on one site and, of course, picked up by several others, spreading quickly across the spectrum of sites reporting on technical matters. Finally, Alvin and a colleague gave an interview to PCWorld (www.pcworld.com/article/2925173/debunked-your-ssd-wont-lose-data-if-left-unplugged-after-all.html) setting the record straight, stating that a scenario that might cause data loss is highly unlikely, especially for consumers. Read the article for more details.

NFS 4.2 Q&A

We received several great questions at our What’s New in NFS 4.2 Webcast. We did not have time to answer them all, so here is a complete Q&A from the live event. If you missed it, it’s now available on demand.

Q. Are there commercial Linux or Windows distributions available which have adopted pNFS?

A. Yes. Red Hat RHEL 6.2, SUSE SLES 11.3 and Ubuntu 14.10 all support a pNFS-capable client. There aren’t any pNFS servers on Linux so far, but commercial systems such as NetApp (file pNFS), EMC (block pNFS), Panasas (object pNFS) and perhaps others provide pNFS servers. Microsoft Windows has no client or server support for pNFS.

Q. Are we able to prevent it from going back to NFS v3 if we want to ensure file lock management?

A. An NFSv4 mount (mount -t nfs4) won’t fall back to an NFSv3 mount. See the mount man page for details.

Q. Can pNFS metadata servers forward clients to other metadata servers?

A. No, not currently.

Q. Can pNFS provide a way similar to synchronous writes, so data is instantly safe in at least 2 locations?

A. No; that kind of replication is a feature of the data servers. It’s not covered by the NFSv4.1 or pNFS specification.

Q. Does hole punching depend on underlying file system in server?

A. If the underlying server supports it, then hole punching will be supported. The client & server do this silently; a user of the mount isn’t aware that it’s happening.
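
For the curious, on a Linux client hole punching ultimately comes down to fallocate(2) with the punch-hole flags; whether space is actually deallocated depends on the server and its underlying file system. A minimal, Linux-only sketch (flag values taken from <linux/falloc.h>):

    import ctypes
    import ctypes.util
    import os

    # Flag values from <linux/falloc.h>
    FALLOC_FL_KEEP_SIZE = 0x01
    FALLOC_FL_PUNCH_HOLE = 0x02

    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_longlong, ctypes.c_longlong]

    def punch_hole(path, offset, length):
        """Deallocate 'length' bytes at 'offset'; the file size stays the same."""
        fd = os.open(path, os.O_RDWR)
        try:
            rc = libc.fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                                offset, length)
            if rc != 0:
                err = ctypes.get_errno()
                raise OSError(err, os.strerror(err))
        finally:
            os.close(fd)

    # Example: punch a 1 MiB hole at the start of a file on an NFSv4.2 mount.
    # punch_hole("/mnt/nfs/bigfile", 0, 1024 * 1024)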

Q. How are Ethernet Trunks formed? By the OS or by the NFS client or NFS Server or other?

A. Currently, they’re not! Although trunking is specified and is optional, there are no servers that support it.

Q. How do you think vVols could impact NFS and VMware’s use of NFS?

A. VMware has committed to supporting NFSv4.1 and there is currently support in vSphere 6. vVols adds another opportunity for clients to inform the server with IO hints; it is an area of active development.

Q. In pNFS, must the callback to the client come from the metadata server that was originally contacted?

A. Yes, the callback originates from the MDS.

Q. Is hole punched in block units?

A. That depends on the server.

Q. Is there any functionality like SMB continuous availability?

A. Since it’s a function of the server, and much of the server’s capabilities are unspecified in NFSv4, the answer is – it depends. It’s a question for the vendor of your server.

Q. NFS has historically not been used in large HPC cluster environments for cluster-wide storage, for performance reasons. Do you see these changes as potentially improving this situation?

A. Yes. There’s much work being done on the performance side, and the cluster parallelism that pNFS brings will have it outperform NFSv3 once clients employ more of its capabilities.

Q. Speaking of Amazon’s adoption of NFSv4.0, do you have any insight into or guesses about why Amazon did not select NFSv4.1, which has a lot more performance and scalability advantages over NFSv4.0?

A. No, none at all.