OpenStack File Services for HPC Q&A

We got some great questions during our Webcast on how OpenStack can consume and control file services appropriate for High Performance Computing (HPC) in a cloud and multi-tenanted environment. Here are answers to all of them. If you missed the Webcast, it’s now available on-demand. I encourage you to check it out and please feel free to leave any additional questions at this blog.

Q. Presumably we can use filesystems other than ZFS as the underlying filesystem in Lustre?

A. Yes, there are plenty of other filesystems that can be used instead of ZFS. ZFS was given as an example of a modern, scale-up filesystem that has recently been integrated, but essentially you can use most filesystem types, with some having more advantages than others. What you are looking for is a filesystem that addresses the weaknesses of Lustre in terms of self-healing and scale-up. So any filesystem that allows you to easily grow capacity whilst also being capable of protecting itself would be a reasonable choice. Remember, Lustre doesn’t do anything to protect the data itself. It simply places objects in a distributed fashion across the Object Storage Targets.

Q. Are there any other HPC filesystems besides Lustre?

A. Yes, there are, and depending on your exact requirements Lustre might not be appropriate. Gluster is an alternative that some have found slightly easier to manage, and it provides some additional functionality. IBM has GPFS, which has been implemented as an HPC filesystem, and other vendors have their scale-out filesystems too. An HPC filesystem is simply a scale-out filesystem capable of very good throughput with low latency. So under that definition a flash array could be considered a high-performance storage platform, as could a scale-out NAS appliance with some fast disks. It’s important to understand your workload’s characteristics and demands before making the choice, as each system has pros and cons.

Q. Does “embarrassingly parallel” require bandwidth or latency from the storage system?

A. Depending on the workload characteristics, it could require both. Bandwidth is usually the first demand, though, as data is shipped to the nodes for processing. Obviously, the lower the latency, the faster jobs can start and run, but latency is not critical here: it is communication between nodes that normally drives the low-latency demand, and embarrassingly parallel workloads involve very little of it.

Q. Would you suggest using Object Storage for NFV, i.e., Telco applications?

A. I would for some applications. The problem with NFV is that it actually covers a surprisingly broad range of applications, some of which have very limited data storage needs. For example, there is little need for storage in a packet-switching environment beyond the OS and binaries needed to stand up the VMs. In this case, object is a very good fit, as it can be easily and geographically distributed, ensuring the same networking function is delivered in the same manner. Other applications that require access to filtered data (so maybe billing-based applications or content distribution) would also be good candidates.

Q. I missed something in the middle; please clarify, your suggestion is to use ZFS (on Linux) for the local file system on OSTs?

A. Yes, this was one example, and it is where some work has recently been done in the Lustre community. This gives the OSSs the capability of scaling capacity upwards, as well as offering the RAID-like protection and self-healing that come with ZFS. Other filesystems can offer those same things, so I am not suggesting it is the only choice.

Q. Why would someone want/need scale-up, when they can scale-out?

A. This can often come down to funding. A lot of HPC environments exist in academic institutions that rely on grant funding and sponsorship to expand their infrastructure. Sometimes it simply isn’t feasible to buy extra servers in order to add capacity, particularly if there is already performance headroom. It might also be the case that rack space, power and cooling are factors, in which case adding drives to cope with bigger workloads might be the only option. You do need to consider whether the additional capacity would also provoke the need for better performance, so we can’t just assume that adding disk is enough, but it’s certainly a good option and a requirement I have seen a number of times.

What to Expect from OpenStack Manila Liberty

On October 7, 2015, the SNIA Ethernet Storage Forum is pleased to present its next live Webcast on OpenStack Manila. Manila is the OpenStack file share service that provides the management of file shares (for example, NFS and CIFS) as a core service to OpenStack. Intended to be an open-standards, highly-available and fault-tolerant component of OpenStack, Manila also aims to provide API-compatibility with popular systems like Amazon EC2.

I will be moderating this Webcast, presented by the OpenStack Manila Project Team Lead (PTL), Ben Swartzlander, who will dive into:

  • An overview of Manila
  • New features that are being delivered for OpenStack Liberty (due October 2015)
  • A preview of Mitaka

With Liberty availability due next month, this information is extremely timely; I encourage you to register now to block your calendar. This will be a live and interactive Webcast, so please bring your questions. I look forward to “seeing” you on October 7th.

A Deep Dive into the SNIA Storage Developer Conference – The File Systems Track

SNIA Storage Developer Conference (SDC) 2015 is two weeks away, and the SNIA Technical Council is finalizing a strong, comprehensive agenda of speakers and sessions. Wherever your interests lie, you’re sure to find experts and topics that will expand your knowledge and fuel your professional development!

For the next two weeks, SNIA on Storage will highlight exciting interest areas in the 2015 agenda. If you have not registered, you need to! Visit www.storagedeveloper.org to see the four day overview and sign up.

Year after year, the File Systems Track at SDC provides in-depth information on the latest technologies, and 2015 is no exception. The track kicks off with Vinod Eswaraprasad, Software Architect, Wipro, on Creating Plugin Modules for OpenStack Manila Services. He’ll discuss his work on integrating a multi-protocol NAS storage device with the OpenStack Manila service, looking at the architectural principles behind the scalability and modularity of Manila services, and the analysis of interface extensions required to integrate a typical NAS head.

Ankit Agrawal and Sachin Goswami of TCS will discuss How to Enable a Reliable and Economic Cloud Storage Solution by Integrating SSD with LTFS. They will share views on how to integrate SSD as a cache with an LTFS tape system to transparently deliver the best benefits for object-based storage, and talk about the potential challenges in their approach and best practices that can be adopted to overcome those challenges.

Jakob Homan, Distributed Systems Engineer, Microsoft, will present Apache HDFS: Latest Developments and Trends. He’ll discuss the new features of HDFS, which has rapidly developed to meet the needs of enterprise and cloud customers, and take a look at HDFS implementations and how they address previous shortcomings of HDFS.

James Cain, Principal Software Architect, Quantel Limited, will discuss a Pausable File System, using his own implementation of an SMB3 server (running in user mode on Windows) to demonstrate the effects of marking messages as asynchronously handled and then delaying responses in order to build up a complete understanding of the semantics offered by a pausable file system.

Ulrich Fuchs, Service Manager, CERN, will talk about Storage Solutions for Tomorrow’s Physics Projects, suggesting possible architectures for tomorrow’s storage implementations in this field, and showing results of the first performance tests done on various solutions (Lustre, NFS, block and object storage, GPFS, etc.) for typical application access patterns.

Neal Christiansen, Principal Development Lead, Microsoft, will present Support for New Storage Technologies by the Windows Operating System, describing the changes being made to the Windows OS, its file systems, and storage stack in response to new evolving storage technologies.

Richard Morris and Peter Cudhea of Oracle will discuss ZFS Async Replication Enhancements, exploring design decisions around enhancing the ZFS send and ZFS receive commands to transfer already compressed data more efficiently and to recover from failures without re-sending data that has already been received.

J.R. Tipton, Development Lead, Microsoft, will discuss ReFS v2: Cloning, Projecting, and Moving Data File Systems, presenting new abstractions that open up greater control for applications and virtualization, covering block projection and cloning as well as in-line data tiering.

Poornima Gurusiddaiah and Soumya Koduri of Red Hat will present Achieving Coherent and Aggressive Client Caching in Gluster, a Distributed System, discussing how to implement file system notifications and leases in a distributed system and how these can be leveraged to implement coherent and aggressive client-side caching.

Sriram Rao, Partner Scientist Manager, Microsoft, will present Petabyte-scale Distributed File Systems in Open Source Land: KFS Evolution, providing an overview of OSS systems (such as HDFS and KFS) in this space, and describing how these systems have evolved to take advantage of increasing network bandwidth in data center settings to improve application performance as well as storage efficiency.

Richard Levy, CEO and President, Peer Fusion, will discuss a High Resiliency Parallel NAS Cluster, including resiliency design considerations for large clusters, the efficient use of multicast for scalability, why large clusters must administer themselves, fault injection when failures are the nominal conditions, and the next step of 64K peers.

Join your peers as well – register now at www.storagedeveloper.org. And stay tuned for tomorrow’s blog on Cloud topics at SDC!

OpenStack File Services Options

How can OpenStack consume and control file services appropriate for High Performance Computing (HPC) in a cloud and multi-tenanted environment? Find out on September 22nd when SNIA Cloud hosts a live Webcast examining two approaches to integration.

One approach is to have OpenStack manage the storage infrastructure services using Cinder, Nova and Neutron to provide HPC Filesystem as a Service.

A second option is to use Manila file services for OpenStack to control the HPC filesystem deployment and manage the exports, etc. This part also looks at the Lustre Manila driver, which is currently in development.

I hope you’ll join Alex McDonald and me as we discuss the pros and cons of each approach. Register today.

Cloud Storage Development Challenges – An SDC Preview

This year’s Storage Developer Conference (SDC) is expected to draw over 400 storage developers and professionals. On August 4th, you can get a sneak preview of key cloud topics that will be covered at SDC in this live Webcast, where David Slik and Mark Carlson, Co-Chairs of the SNIA Cloud Technical Work Group, together with Yong Chen, Assistant Professor at Texas Tech University, will discuss:

  • Mobile and Secure – Cloud Encrypted Objects using CDMI
  • Object Drives: A new Architectural Partitioning
  • Unistore: A Unified Storage Architecture for Cloud Computing
  • Using CDMI to Manage Swift, S3, and Ceph Object Repositories

You’ll learn how encrypted objects can be stored, retrieved, and transferred between clouds, how Object Drives allow storage to scale up and down by single-drive increments, and end-user and vendor use cases of the Cloud Data Management Interface (CDMI). We’ll also introduce Unistore – an innovative unified storage architecture that efficiently integrates heterogeneous HDD and SCM devices for Cloud storage systems.

I’ll be moderating the discussion among this expert panel. It should be an enlightening and lively hour. I hope you’ll register now to join us.

Block Storage in OpenStack Q&A

The team at SNIA-ESF and I were very pleased with how many people attended our live Webcast, “Block Storage in the Open Source Cloud called OpenStack.” If you missed it, please check it out on demand. We had several great questions during the live event. As promised, here are answers to all of them. If you have additional questions, please feel free to comment on this blog.

Q. How is the support for OpenStack, if we hit a roadblock or need some features?

A. The OpenStack community has many avenues for contacting developers for support. The official place to report issues, file bugs or ask for new features is Launchpad: https://launchpad.net/openstack. It is the central place for all of the many OpenStack projects to file bugs or feature requests, and it is also where every OpenStack project tracks its current release cycle and all of its features, called blueprints. Another good source of information is the public mailing lists; a good place to start is https://wiki.openstack.org/wiki/Mailing_Lists. Finally, developers are also on the public Internet Relay Chat channels associated with their projects. The developers are live and interactive on each of the channels. You can find information about the IRC system that OpenStack developers use here: https://wiki.openstack.org/wiki/IRC.

Q. Why was Python chosen as the programming language? Which version of Python is used as there are incompatibilities between versions?

A. The short answer here is that Python is a great language for rapid development and deployment that is mature and has a wide variety of publicly available libraries for doing work. The current released version of OpenStack uses Python 2.7. The OpenStack community is making efforts to ensure that we can eventually migrate to Python 3.x. New libraries that are being developed have to be Python 3.x compatible.
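
As a small, hedged illustration of what that compatibility work looks like in practice: OpenStack projects of this era commonly use the six library to bridge the 2.7/3.x differences, along the lines of the sketch below (the URL-encoding example itself is ours, not from the Webcast).

```python
# Minimal sketch: code that runs unchanged on Python 2.7 and 3.x using the
# 'six' compatibility library, as OpenStack projects commonly do.
import six
from six.moves.urllib import parse  # urllib was reorganized in Python 3

params = parse.urlencode({'name': 'my-volume', 'size': 10})
text = six.text_type(params)  # 'unicode' on 2.7, 'str' on 3.x
print(text)
```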

Q. Is it possible to replicate the backed up volumes at the OpenStack layer or do you defer to the back end array for data replication?

A. Currently, there is no built-in support for volume replication in Cinder. The Cinder community is actively working on how to implement volume replication in the next release, Liberty, which will ship in the fall of 2015. As with any major new feature in Cinder, the community has to design the core of the new feature so that it works consistently across the 40+ vendor arrays. As the array support grows, the amount of up-front design becomes more and more important, and more difficult, at the same time. We have a specification that we are currently working on that will get us closer to implementing replication.

Q. Who, or what, creates the FC zones?

A. In Cinder, the block storage project, the component that creates and manages Fibre Channel zones is called the Fibre Channel Zone Manager. A good document to read up on the zone manager is here: http://www.brocade.com/downloads/documents/at_a_glance/fc-zone-manager-ag.pdf. The official OpenStack documentation on the zone manager is here: http://docs.openstack.org/kilo/config-reference/content/section_fc-zoning.html. The zone manager is automatically called after the Cinder Fibre Channel volume driver exports its volume from the array. The zone manager then adds the zones as requested by the driver to make the volume available to the virtual machine.
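
For concreteness, here is a sketch of what enabling the zone manager can look like in cinder.conf, based on the Kilo-era documentation linked above. Option names vary by release and fabric vendor (this example assumes the Brocade driver), and the fabric name, address and credentials are placeholders:

```ini
[DEFAULT]
# Tell Cinder that FC zoning is delegated to a fabric-aware zone manager.
zoning_mode = fabric

[fc-zone-manager]
zone_driver = cinder.zonemanager.drivers.brocade.brcd_fc_zone_driver.BrcdFCZoneDriver
fc_fabric_names = fabricA

[fabricA]
fc_fabric_address = 10.0.0.10      # placeholder switch address
fc_fabric_user = admin             # placeholder credentials
fc_fabric_password = password
zoning_policy = initiator-target
```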

Q. Does the Cinder and Nova attachment process work over VLANs?

A. Yes. It’s entirely dependent on how the OpenStack admin deploys the Nova and Cinder services. As long as the Nova hosts can see the Cinder services and arrays behind the Cinder volume drivers, then it should just work.

Q. Is the FCZM a native component of the Cinder project? Or is it an add-on?

A. As I mentioned earlier, the Fibre Channel Zone Manager is part of the Cinder project. There has been some discussion within the Cinder community about possibly breaking out the zone manager into its own Python library, in which case it would be available to any Python project. Currently, it’s built into Cinder itself.

Q. Does Cinder involve itself in the I/O path as well or is it only the control path responsible for allocating storage?

A. Cinder is almost entirely a control-plane provisioning mechanism. There are a few operations where the Cinder services actually do I/O. When a user wants to create an image from a volume, Cinder attaches the volume to itself and then copies the bytes from the volume into an image. Cinder also has a backup service that allows a user to back up a volume to an external service. In that case, the Cinder backup service directs copying the bytes into the configured backup storage. When Cinder attaches a volume to a Nova VM, or a bare metal node, Cinder is not involved in any I/O. Cinder’s job is simply to ensure that the volume is exported from the back-end array and made available for Nova to see. After that, it’s entirely up to the transport protocol (iSCSI, FC, NFS, etc.) to do the I/O for the volume.
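
To make that division of labor concrete, here is a hedged sketch of the flow from an API consumer’s point of view, using the python-cinderclient and python-novaclient libraries of that era. The auth URL, credentials and server ID are placeholders, and error handling is omitted:

```python
# Sketch: Cinder provisions the volume (control plane); Nova performs the
# attach, and the data path (iSCSI/FC/NFS) never passes through Cinder.
from keystoneauth1 import loading, session
from cinderclient import client as cinder_client
from novaclient import client as nova_client

loader = loading.get_plugin_loader('password')
auth = loader.load_from_options(
    auth_url='http://controller:5000/v2.0',   # placeholder endpoint
    username='demo', password='secret', project_name='demo')
sess = session.Session(auth=auth)

cinder = cinder_client.Client('2', session=sess)
nova = nova_client.Client('2', session=sess)

vol = cinder.volumes.create(size=10, name='demo-volume')    # array-side provisioning
nova.volumes.create_server_volume('my-server-id', vol.id)   # Nova attaches the LUN
```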

Q. Is Nova aware of the LUN usage %?

A. Nova doesn’t track statistics against the volumes that it has attached to its virtual machines.

Q. Where do the vendor specific parts of Cinder fit in? Are there vendor specific “volume managers”?

A. The vendor-specific components of Cinder exist in what are called Cinder volume drivers. Those drivers are really nothing more than a Python module that conforms to a volume driver API defined by the Cinder volume manager. You can get an idea of the features that the drivers can support from the Cinder Support Matrix here:

https://wiki.openstack.org/wiki/CinderSupportMatrix
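
To give a rough feel for that contract, below is a hypothetical skeleton of such a driver. The method names are the kind the Cinder volume manager invokes on drivers; the class itself and everything inside the methods are placeholders, not a real shipping driver:

```python
# Hypothetical Cinder volume driver skeleton; vendor logic is elided.
from cinder.volume import driver


class ExampleVendorISCSIDriver(driver.VolumeDriver):
    """Illustrative only; not a real driver."""

    def create_volume(self, volume):
        # Call the array's management interface (REST, ssh, ...) to
        # allocate volume['size'] GB of storage.
        pass

    def delete_volume(self, volume):
        # Reclaim the space on the array.
        pass

    def initialize_connection(self, volume, connector):
        # Export the volume to the host described by 'connector' and return
        # the connection info Nova needs to find the LUN.
        return {'driver_volume_type': 'iscsi', 'data': {}}

    def terminate_connection(self, volume, connector, **kwargs):
        # Remove the export for the given host.
        pass
```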

Q. If Cinder is only for control plane, which project in OpenStack is for data path?

A. There isn’t a project in OpenStack that manages the data path for volumes.

Q. Is there a volume detachment process as well and when does that come into play?

A. My presentation primarily focused on one aspect of the interaction between Nova and Cinder, which was volume attachment. I briefly discussed the volume detachment process, and it is conducted in basically the same way. An end user asks Nova to detach the volume. Nova then removes the volume from the VM, removes the SCSI device from the compute host itself, and then tells Cinder to terminate the connection from the array to the compute host.
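
In client terms (continuing the hedged sketch from the earlier control-plane answer, with the same placeholder IDs), the whole sequence is triggered by a single call to Nova:

```python
# Sketch: ask Nova to detach; Nova removes the device from the VM and the
# compute host, then has Cinder terminate the array-side export.
nova.volumes.delete_server_volume('my-server-id', vol.id)
```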

Q. If a virtual machine is moved to a different physical machine, how’s that handled in Cinder?

A. This process in OpenStack is called live migration. Nova does all of the work of moving the VM’s data from one host to another. One facet of that is migrating any Cinder volumes that may be attached to the VM. Nova understands which volumes are attached to the VM and knows which of those volumes are Cinder volumes. When the VM is migrated, Nova coordinates with Cinder to ensure that all volumes are attached to the destination host and VM, and that the volumes are detached from the originating compute host.

Q. Why doesn’t Cinder use the SNIA SMI-S API to manage/consume SAN, NAS or switch fabric instead of each storage vendor building Cinder drivers? SMI-S already covers the Cinder scenarios for FC, iSCSI, SAS, etc.

A. Cinder itself doesn’t really manage the storage array communication. It’s entirely up to the individual vendor drivers to decide how best to communicate with their storage arrays. The HP 3PAR volume driver uses REST to communicate with the array, as do several other vendor drivers in Cinder. Other drivers use ssh. There are no strict rules on how a Cinder volume driver chooses to communicate with its back end. This allows vendors to make the best use of their array interfaces as they see fit.

Q. Are there Horizon extensions or extension points for showing what physical resources your storage is coming from? Or is that something a storage vendor would need to implement?

A. Horizon doesn’t really know much about where storage is coming from, other than that it’s a Cinder volume. Horizon uses the available Cinder APIs to talk to Cinder to do work and fetch information about Cinder’s resources. I know of a few vendors that are writing Horizon plugins that add extra capabilities to view more detailed information about their specific arrays. As of today, though, there is no API in Cinder to describe the internals of a volume on the vendor’s array.

New Webcast: Block Storage in the Open Source Cloud called OpenStack

On June 3rd at 10:00 a.m., SNIA-ESF will present its next live Webcast, “Block Storage in the Open Source Cloud called OpenStack.” Storage is a major component of any cloud computing platform. OpenStack is one of the largest and most widely supported open source cloud computing platforms in the market today. The OpenStack block storage service (Cinder) provides persistent block storage resources that OpenStack Nova compute instances can consume.

I will be moderating this Webcast, presented by a core member of the OpenStack Cinder team, Walt Boring. Join us, as we’ll dive into:

  • Relevant components of OpenStack Cinder
  • How block storage is managed by OpenStack
  • What storage protocols are currently supported
  • How it all works together with compute instances

I encourage you to register now to block your calendar. This will be a live and interactive Webcast, so please bring your questions. I look forward to “seeing” you on June 3rd.

Swift, S3 or CDMI – Your Questions Answered

Last week’s live SNIA Cloud Webcast “Swift, S3 or CDMI – Why Choose?” is now available on demand. Thanks to all the folks who attended the live event. We had some great questions from attendees; in case you missed it, here is a complete Q&A.

Q. How do you tag the data? Is that a manual operation?

A. The data is tagged as part of the CDMI API by supplying key/value pairs in the JSON object. Since it is an API, you can put a user interface in front of it to manually tag the data, but you can also develop software to automatically tag the data. We envision an entire ecosystem of software that would use this interface to better manage data in the future.
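
As a hedged sketch of what that looks like on the wire, here is a CDMI object PUT that supplies user metadata as key/value pairs in the JSON body. The server name, path, credentials and metadata keys are placeholders; the headers and JSON fields follow the CDMI specification:

```python
# Sketch: create a CDMI object and tag it with user metadata.
import requests

resp = requests.put(
    'https://cdmi.example.com/mycontainer/myobject',  # placeholder server
    headers={
        'Content-Type': 'application/cdmi-object',
        'Accept': 'application/cdmi-object',
        'X-CDMI-Specification-Version': '1.1',
    },
    json={
        'mimetype': 'text/plain',
        'metadata': {'project': 'alpha', 'retention_class': 'gold'},
        'value': 'hello, CDMI',
    },
    auth=('user', 'password'),
)
resp.raise_for_status()
```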

Q. Which vendors support CDMI today?

A. We have a page that lists all the publicly announced CDMI implementations here. We also plan to start testing implementations with standardized tests to certify them as conformant. This will be a separate list.

Q. FC3 Common Services layer vs. SWIFT, S3, & CDMI – Will it fully integrate with encryption at rest vendors?

A. Amazon does offer encryption at rest, for example, but does not (yet) allow you to choose the algorithm. CDMI allows vendors to expose a list of algorithms and lets the user pick the one they want.

Q. You’d mentioned NFS, other interfaces for compatibility – but often “native” NFS deployments can be pretty high performance. Object storage doesn’t really focus on performance, does it? How is it addressed for customers moving to the object model?

A. CDMI implementations are responsible for the performance, not the standard itself, and there is nothing in an object interface that would make it inherently slower. But if the NFS interface implementation is faster, customers can use that interface for apps with those performance needs. The compatibility means they can use whatever interface makes sense for each application type.

Q. Is it possible to query the user-metadata on a container level for listing all the data objects that have that user-metadata set?

A. Yes. Metadata query is key and it can be scoped however you like. Data system metadata is also hierarchical and inherited – meaning that you can override the parent container settings.
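
For instance (a sketch against the same placeholder server as in the earlier example), a single object’s metadata can be read back on its own by scoping the GET to just that field:

```python
# Sketch: fetch only the metadata field of a CDMI object.
import requests

resp = requests.get(
    'https://cdmi.example.com/mycontainer/myobject?metadata',
    headers={'Accept': 'application/cdmi-object',
             'X-CDMI-Specification-Version': '1.1'},
    auth=('user', 'password'),
)
print(resp.json()['metadata'])
```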

Q. So would it be reasonable to say that any current object storage should be expected to implement one or more of these metadata models? What if the object store wasn’t necessarily meant to play in a cloud? Would it be at a disadvantage if its metadata model was proprietary?

A. Yes, but as an add-on that would not interfere with the existing API/access method. Eventually as CDMI becomes ubiquitous, products would be at a disadvantage if they did not add this type of interface.

OpenStack Cloud Storage Q&A

More than 300 people have seen our Webcast “OpenStack Cloud Storage.” If you missed it, it’s now available on demand. It was a great session with a lot of questions from attendees. We did not have time to address them all – so here is a complete Q&A. If you think of any others, please comment on this blog. Also, mark your calendar for January 29th when the SNIA Cloud Storage Initiative will continue its Developers Tutorial Series with a live Webcast on OpenStack Manila.

Q. Is it correct to say that one can use OpenStack on any vendor’s hardware?

A. Servers, yes, assuming the hardware is supported by Linux. Block storage requires a driver, and not all vendor systems have Cinder drivers.

Q. Is there any OpenStack investigation and/or development in the storage networking area?

A. Cinder includes support for FC and iSCSI. As of Icehouse, the FC support also includes auto-zoning. 

Q. Is there any monetization going on around OpenStack, like we see for distros of Linux?

A. Yes, there are already several commercial distributions available.

Q. Is erasure code needed to get a positive business case for Swift, when compared with traditional storage systems?

A. It is a way to reduce the cost of replication. Traditional storage systems typically already have erasure coding, in the form of RAID. Systems without erasure coding end up using more storage to achieve the same level of protection due to their use of 3-way replication. For example, 3-way replication consumes three bytes of raw capacity for every byte of user data (3x overhead), while a typical 10+4 erasure code consumes only 1.4 bytes (1.4x).

Q. Is erasure code currently implemented in the current Swift release?

A. No, it is a separate development stream, which has not been merged yet.

Q. Any limitation on the number of objects per container or total number of objects per Swift cluster?

A. Technically there are no limits. However, in practice, the fact that containers are implemented using SQLite limits them to a million, or maybe a few million, objects per container. However, due to the way that Swift partitions its metadata, each user can also have millions of containers, and there can be millions of users. So, practically speaking, the total system can support an unlimited number of objects.

Q. What are some of the technical reasons for an enterprise to select Swift vs. Amazon S3? In other words, are they pretty much direct alternatives, or does each have its own preferred use cases?

A. They are more or less direct alternatives. There are some minor differences, but they are made for the same purpose. That said, S3 is only available from Amazon. There are some S3 compatible systems, but most of those also support Swift. Swift, on the other hand, is available open source or from multiple vendors. So if you want to run it in your own data center, or in a public cloud other than Amazon, you probably want Swift.

Q. If I wanted to play around with Open Stack, Cinder, and Swift in a lab environment (or in my basement), what do I need and how do I get started?

A. openstack.org is the best place to start. The “devstack” distribution is also good for playing around.

Q. Will you be showing any features for Kilo?

A. The “Futures” I showed will likely be Kilo features, though the final decision of what will be in Kilo won’t happen until just before release.

Q. Are there any plans to implement data encryption in Cinder?

A. I believe some of the back ends can support encryption already. Cinder is really just a provisioning and orchestration layer. Encryption is a data path feature, so it would need to be implemented in the back end.

Q. Some time back I heard OpenStack Swift is going to come up with block storage as well, any timeline for that?

A. I haven’t heard this; Swift is object storage.

Q. The performance characteristics of Cinder block services can vary quite widely. Is there any standard measure proposed within OpenStack to inform Nova or the application about the underlying Cinder block performance characteristics?

A. Volume types were designed to enable clouds to provide different levels of service. The meaning of these types is up to the cloud administrator. That said, Cinder does expose QoS features like minimum/maximum IOPS.
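
As a hedged sketch of how an administrator might express such a service level with python-cinderclient (here 'cinder' is a Client object as in the earlier block storage sketches; the QoS spec keys are back-end dependent):

```python
# Sketch: define a 'gold' service level and provision a volume against it.
# 'cinder' is a cinderclient.client.Client('2', session=...) instance, and
# spec keys such as 'minIOPS'/'maxIOPS' depend on the back-end driver.
vtype = cinder.volume_types.create('gold')
qos = cinder.qos_specs.create('gold-iops', {'minIOPS': '1000', 'maxIOPS': '10000'})
cinder.qos_specs.associate(qos, vtype.id)

vol = cinder.volumes.create(size=10, name='fast-volume', volume_type='gold')
```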

Q. Is the hypervisor talking to a cinder volume or to (for example) a NetApp or EMC volume?

A. The hypervisor talks to a volume the same way it does outside of OpenStack. For example, the KVM hypervisor can talk to volumes through LVM, or can mount SAN volumes directly.

Q. Which of these projects are most production-ready?

A. This is a hard question, and depends on your definition of production ready. It’s hard to do much without Nova, Glance, and Horizon. Most people use Cinder too, and Swift has been in production at HP and Rackspace for years. Neutron has a lot of complexity, so some people still use Nova network, but that has many limitations. For toy clouds you can avoid using Keystone, but you need it for a “production” cluster. The best way to get a “production ready” OpenStack is to get a supported commercial distribution.

Q. Are there any Plugfests?

A. No, however, the Cinder team has a fairly extensive and continuous integration process that drivers need to pass through. Swift does not because it doesn’t officially “support” any plugins.

OpenStack Manila Webcast – Shared File Services for the Cloud

On January 29th, we continue our Cloud Developer’s series by hosting a live Webcast on OpenStack Manila – the OpenStack file share service. Manila provides the management of file shares (for example, NFS and CIFS) as a core service to OpenStack. Manila currently works with a variety of vendors’ storage products, including NetApp, Red Hat, EMC, IBM, and with the Linux NFS server.

In this Webcast we will:

  • Introduce the Manila file share service
  • Review key Manila concepts
  • Describe the logical architecture of Manila and its API structure
  • Explain what’s new in Juno, the latest release of OpenStack
  • Highlight the roadmap for Manila in the next release, OpenStack Kilo, and beyond

Register now for this live event that we expect will be informative and interactive. I hope you’ll join us.