Block Storage in OpenStack Q&A

The team at SNIA-ESF and I were very pleased with how many people attended our live “Block Storage in the Open Source Cloud called OpenStack.” If you missed it, please check it out on demand. We had several great questions during the live event. As promised here are answers to all of them. If you have additional questions, please feel free to comment on this blog.

Q. How is the support for OpenStack, if we hit a roadblock or need some features?

A. The OpenStack community has many avenues for contacting developers for support. The official place to report issues, file bugs or ask for new features is Launchpad. https://launchpad.net/openstack. It is the central place for all of the many OpenStack projects to file bugs or feature requests. This is also the location where every OpenStack project tracks its current release cycle and all of the features, called blueprints. Another good source of information are the public mailing lists. A good place to start for the mailing list is here, https://wiki.openstack.org/wiki/Mailing_Lists. Finally, developers are also on the public Internet Relay Chat channels associated with their projects. The developers are live and interactive, on each of the channels. You can find the information about the IRC system that OpenStack developers use here: https://wiki.openstack.org/wiki/IRC.

Q. Why was Python chosen as the programming language? Which version of Python is used as there are incompatibilities between versions?

A. The short answer here is that Python is a great language for rapid development and deployment that is mature and has a wide variety of publicly available libraries for doing work. The current released version of OpenStack uses Python 2.7. The OpenStack community is making efforts to ensure that we can eventually migrate to Python 3.x. New libraries that are being developed have to be Python 3.x compatible.

Q. Is it possible to replicate the backed up volumes at the OpenStack layer or do you defer to the back end array for data replication?

A. Currently, there is no built in support for volume replication in Cinder. The Cinder community is actively working on how to implement volume replication in the next release Liberty, which will ship in the fall of 2015. As with any major new feature in Cinder, the community has to design the new feature core such that it works with the 40+ vendor arrays, in such a way that it’s consistent. As the array support grows, the amount of up front design becomes more and more important and difficult at the same time. We have a specification that we are currently working on that will get us closer to implementing replication.

Q. Who, or what, creates the FC zones?

A. In Cinder, the block storage project, the component that creates and manages Fibre Channel zones is called the Fibre Channel Zone manager. A good document to read up on the zone manager is here: http://www.brocade.com/downloads/documents/at_a_glance/fc-zone-manager-ag.pdf. The official OpenStack documentation on the zone manager is here: http://docs.openstack.org/kilo/config-reference/content/section_fc-zoning.html. The zone manager is automatically called after Cinder Fibre Channel volume driver exports its volume from the array. The zone manager then adds the zones as requested by the driver to make the volume available to the virtual machine.

Q. Does the Cinder and Nova attachment process work over VLANs?

A. Yes. It’s entirely dependent on how the OpenStack admin deploys the Nova and Cinder services. As long as the Nova hosts can see the Cinder services and arrays behind the Cinder volume drivers, then it should just work.

Q. Is the FCZM a native component of the Cinder project? Or is it an add-on?

A. As I mentioned earlier, the Fibre Channel zone manager is part of the Cinder project. There has been some discussions, as part of the Cinder community, to possibly break out the zone manager into it’s own Python library, in which case it would be available to any Python project. Currently, it’s built into Cinder itself.

Q. Does Cinder involve itself in the I/O path as well or is it only the control path responsible for allocating storage?

A. Cinder is almost entirely control plane provisioning mechanism only. There are a few operations where the Cinder services actually does I/O. When a user wants to create an image from a volume, then Cinder attaches the volume to itself, and then copies the bytes from the volume into an image. Cinder also has a backup service that allows a user to backup a volume to an external service. In that case, the Cinder backup service directs copying the bytes into the configured backup storage. When Cinder attaches a volume to a Nova VM, or a bare metal node, Cinder is not involved in any I/O. Cinder’s job is to simply ensure that the volume is exported from the back-end array and make it available to Nova to see. After that, it’s entirely up to the transport protocol, iSCSI, FC, NFS, etc. to do the I/O for the volume.

Q. Is Nova aware of the LUN usage %?

A. Nova doesn’t track statistics against the volumes that it has attached to its virtual machines.

Q. Where do the vendor specific parts of Cinder fit in? Are there vendor specific “volume managers”?

A. The vendor specific components of Cinder exist in what are called Cinder volume drivers.   Those drivers are really nothing more than a python module that conforms to a volume driver API that is defined by the Cinder volume manager. You can get an idea of what the features that the drivers can support on the Cinder Support Matrix here:

https://wiki.openstack.org/wiki/CinderSupportMatrix

Q. If Cinder is only for control plane, which project in OpenStack is for data path?

A. There isn’t a project in OpenStack that manages the data path for volumes.

Q. Is there a volume detachment process as well and when does that come into play?

A. My presentation primarily focused around one aspect of the interaction between Nova and Cinder, which was volume attachment. I briefly discussed the volume detachment process, but it is conducted in basically the same way. An end user asks Nova to detach the volume. Nova then removes the volume from the VM, then removes the SCSI device from the compute host itself, and then tells Cinder to terminate the connection from the array to the compute host.

Q. If a virtual machine is moved to a different physical machine, how’s that handled in Cinder?

A. This process in OpenStack is called live migration. Nova does all of the work of moving the VM’s data, from one host to another. One facet of that is migrating any Cinder volume that may be attached to the VM. Nova understands which volumes are attached to the VM and knows which one of those volume(s) are Cinder volumes. When the VM is migrated, Nova coordinates with Cinder to ensure that all volumes are attached to the destination host and VM, as well as ensures that the volumes are detached from the originating compute host.

Q. Why doesn’t Cinder use SNIA SMI-S API to manage/consume SAN, NAS or Switch fabric instead of each storage vendor building Cinder drivers? SMI already covers all scenarios for the Cinder scenarios for FC, iSCSI, SAS etc.

A. Cinder itself doesn’t really manage the storage array communication itself. It’s entirely up to the individual vendor drivers to decide how best to communicate with its storage array. The HP 3PAR volume driver uses REST to communicate with the array, as do several other vendor drivers in Cinder. Other drivers use ssh. There are no strict rules on how a Cinder volume driver can choose to communicate with its back-end. This allows vendors to make the best use of their array interfaces as they see fit.

Q. Are there Horizon extensions or extension points for showing what physical resources your storage is coming from? Or is that something a storage vendor would need to implement?

A. Horizon doesn’t really know much about where storage is coming from other than it’s a Cinder volume. Horizon uses the available Cinder APIs to talk to Cinder to do work and fetch information about Cinder’s resources. I know of a few vendors that are writing Horizon plugins that add extra capabilities to view more detailed information about their specific array. As of today though, there is no API in Cinder to describe the internals of a volume on the vendor’s array.