Q&A – OpenStack Mitaka and Data Protection

At our recent SNIA Webcast “Data Protection and OpenStack Mitaka,” Ben Swartzlander, Project Team Lead OpenStack Manila (NetApp), and Dr. Sam Fineberg, Distinguished Technologist (HPE), provided terrific insight into data protection capabilities surrounding OpenStack. If you missed the Webcast, I encourage you to watch it on-demand at your convenience. We did not have time to get to all of our attendees’ questions during the live event, so as promised, here are answers to the questions we received.

Q. Why are there NFS drivers for Cinder?

 A. It’s fairly common in the virtualization world to store virtual disks as files in filesystems. NFS is widely used to connect hypervisors to storage arrays for the purpose of storing virtual disks, which is Cinder’s main purpose.
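
For illustration, here is a rough sketch of how a Cinder NFS backend is typically wired up; the backend name, server, and export path are placeholders, and the exact option names should be checked against your release’s documentation.

    # Hypothetical cinder.conf stanza for the generic NFS driver; you would also
    # set enabled_backends = nfs-1 in the [DEFAULT] section.
    #   [nfs-1]
    #   volume_backend_name = nfs-1
    #   volume_driver = cinder.volume.drivers.nfs.NfsDriver
    #   nfs_shares_config = /etc/cinder/nfs_shares
    #
    # The shares file simply lists the NFS exports on which volume files may be placed:
    echo "nfs-server.example.com:/exports/cinder" > /etc/cinder/nfs_shares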

 Q. What does “crash-consistent” mean?

A. It means that the data on disk is what would be there if the system “crashed” at that point in time. In other words, the data reflects the order of the writes, and if any writes are lost, they are the most recent ones. To avoid losing data with a crash-consistent snapshot, one must force all recently written data and metadata to be flushed to disk prior to snapshotting, and prevent further changes during the snapshot operation.
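
For example, a minimal quiesce-then-snapshot sequence might look like the sketch below; the mount point, volume name, and the snapshot step itself are placeholders for whatever the environment actually uses.

    sync                                    # flush dirty data and metadata to disk
    fsfreeze --freeze /mnt/data             # block further writes while the snapshot is taken
    cinder snapshot-create --name quiesced-snap my-volume   # take the crash-consistent snapshot
    fsfreeze --unfreeze /mnt/data           # resume writes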

Q. How do you recover from a Cinder replication failover?

A. The system will continue to function after the failover; however, there is currently no mechanism to “fail-back” or “re-replicate” the volumes. This functionality is currently in development, and the OpenStack community will have a solution in a future release.

 Q. What is a Cinder volume type?

A. Volume types are administrator-defined “menu choices” that users can select when creating new volumes. Each type carries hidden metadata, defined in the cinder.conf file, which Cinder uses to decide where to place a new volume at creation time and which drivers to use to configure it.
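
As a hypothetical sketch of that workflow (the type name "gold" and the backend name are illustrative; the backend name would correspond to a volume_backend_name defined in a cinder.conf backend stanza):

    cinder type-create gold
    cinder type-key gold set volume_backend_name=gold-backend
    # Users then simply pick the type from the "menu" when creating a volume:
    cinder create --volume-type gold --name db-vol 100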

 Q. Can you replicate when multiple Cinder backends are in use?

A. Yes.

 Q. What makes a Cinder “backup” different from a Cinder “snapshot”?

A. Snapshots preserve the state of a volume at a point in time, allowing recovery from software or user errors and also allowing a volume to remain stable long enough to be backed up. Snapshots are also very efficient to create, since many devices can create them without copying any data. However, snapshots are local to the primary data and typically have no additional protection from hardware failures. In other words, a snapshot is stored on the same storage devices and typically shares disk blocks with the original volume.

Backups are stored in a neutral format that can be restored anywhere, and they are typically kept on separate (possibly remote) hardware, making them ideal for recovery from hardware failures.
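
A quick illustrative contrast (the volume and names below are placeholders): the snapshot stays with the primary storage, while the backup lands in the backup store and can later be restored, even onto different hardware.

    cinder snapshot-create --name pre-upgrade my-volume
    cinder backup-create --name nightly my-volume
    cinder backup-restore <backup-id>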

 Q. Can you explain what “share types” are and how they work?

A. They are Manila’s version of Cinder’s volume types. One key difference is that some of their metadata is not hidden; it is visible to end users. Certain APIs only work with shares whose type has specific capabilities.
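
A hypothetical example (the type and share names are placeholders, and the extra spec shown is one of the user-visible ones):

    manila type-create my_type false                    # second argument: driver_handles_share_servers
    manila type-key my_type set snapshot_support=True   # visible to end users, unlike Cinder's hidden metadata
    manila create NFS 1 --name demo --share-type my_type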

 Q. What’s the difference between Cinder’s multi-attached and Manila’s shared file system?

A. Multi-attached Cinder volumes require cluster-aware filesystems or similar technology to be used on top of them. Ordinary filesystems cannot handle multi-attachment and will corrupt data quickly if attached to more than one system. Therefore, Cinder’s multi-attach mechanism is only intended for filesystems or database software that is specifically designed to use it.

Manila’s shared filesystems use industry-standard network protocols, like NFS and SMB, to provide filesystems to arbitrary numbers of clients, where shared access is a fundamental part of the design.
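
As a sketch of that shared-access model (the share name, client network, export path, and mount point are placeholders, with the export path standing in for whatever Manila reports for the share):

    manila access-allow demo ip 192.168.1.0/24      # grant a network access to the share
    mount -t nfs 10.0.0.5:/shares/demo /mnt/demo    # each client simply mounts the same export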

 Q. Is it true that failover is automatic?

A. No. Failover is not automatic for either Cinder or Manila.

Q. Follow-up on failover: my question was about the array-loss scenario described in the block storage discussion. Once the admin decides the array has failed, does failover need to be performed on a “VM-by-VM” basis? How does the VM know to re-attach to another fabric, etc.?

A. Failover is all at once, but VMs do need to be reattached one at a time.
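
As a hedged sketch of that sequence (the host, backend ID, server, and volume ID are placeholders, and exact flag names can vary between releases):

    cinder failover-host cinder-volume@array-1 --backend_id secondary-array   # one failover for the whole backend
    # ...then each affected VM's volumes are re-attached individually:
    nova volume-detach my-vm <volume-id>
    nova volume-attach my-vm <volume-id>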

Q. What about Cinder? Is unified object storage on SHV servers the future of storage?

 A. This is a matter of opinion. We can’t give an unbiased response.

 Q. What about a “global file share/file system view” of a lot of Manila “file shares” (i.e. a scalable global name space…)

 A. Shares have disjoint namespaces intentionally. This allows Manila to provide a simple interface which works with lots of implementations. A single large namespace could be more valuable but would preclude many implementations.

 

 

Data Protection and OpenStack Mitaka

Interested in data protection and storage-related features of OpenStack? Then please join us for a live SNIA Webcast “Data Protection and OpenStack Mitaka” on June 22nd. We’ve pulled together an expert team to discuss the data protection capabilities of the OpenStack Mitaka release, which includes multiple new resiliency features. Join Dr. Sam Fineberg, Distinguished Technologist (HPE), and Ben Swartzlander, Project Team Lead OpenStack Manila (NetApp), as they dive into:

  • Storage-related features of Mitaka
  • Data protection capabilities – Snapshots and Backup
  • Manila share replication
  • Live migration
  • Rolling upgrades
  • HA replication

Sam and Ben will be on-hand for a candid Q&A near the end of the Webcast, so please start thinking about your questions and register today. We hope to see you there!

This Webcast is co-sponsored by two groups within the Storage Networking Industry Association (SNIA): the Cloud Storage Initiative (CSI), and the Data Protection & Capacity Optimization Committee (DPCO).

 

OpenStack Manila – A Q&A on Liberty and Mitaka

Our recent Webcast with OpenStack Manila Project Team Lead (PTL), Ben Swartzlander, generated a lot of great questions. As promised, we’ve compiled answers for all of the questions that came in. If you think of additional questions, please feel free to comment on this blog. And if you missed the live Webcast, it’s now available on-demand.

Q. Is Hitachi Data Systems contributing to the Manila project?

A. Yes, Hitachi contributed a new driver and also contributed a major new feature (migration) during Liberty. HDS was also active during the Kilo release with a different driver which is unfortunately no longer maintained.

Q. EMC has open sourced ViPR as CoprHD. Do you see any overlap between Manila/Cinder on one side and CoprHD on the other?

A. I’m not familiar enough with CoprHD to answer authoritatively, but I understand that there is definitely some overlap between it and Cinder, and I also expect there is some overlap with Manila. Assuming there is some overlap, I think that’s a great thing because competition within open source drives greater quality, and it’s confirmation that there is real demand for what we’re building.

Q. Could Manila be used stand-alone (without OpenStack) to create a fileshare server?

A. Yes, the only OpenStack service Manila depends on is Keystone (for authentication). Running Manila in a stand-alone fashion is a specific use case the team supports.
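
For example, a stand-alone client session only needs Keystone credentials; all of the values below are placeholders.

    export OS_AUTH_URL=http://keystone.example.com:5000/v3
    export OS_USERNAME=demo
    export OS_PASSWORD=secret
    export OS_PROJECT_NAME=demo
    export OS_USER_DOMAIN_NAME=Default
    export OS_PROJECT_DOMAIN_NAME=Default
    manila list    # talks to Keystone for a token, then directly to Manila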

Q. If we are mapping the snapshot images, what is the guarantee for data integrity?

A. Snapshots are typically crash-consistent copies of the filesystem at a point in time. In reality, the exact guarantee depends on the backend used, and that variability is something we’d like to avoid so that snapshot semantics are clear to the user. In the future, backends which cannot meet the crash-consistent guarantee will probably be forced to advertise a different capability so end users are aware of what they’re getting.

Q. Is there Manila automation with Ansible?

A. As far as I know this hasn’t been done yet.

Q. For Kilo deployed in production, does it work for all commercial drivers, or is there a chart that says which commercial drivers support Kilo?

A. The developer doc now has a table which attempts to answer this question. However, the most reliable way to see which drivers are part of the stable/kilo release would be to look at the driver directory of the code. This is an area where the docs need to improve.

Q. Could you explain consistency groups?

A. Consistency groups are a mechanism to ensure that 2 or more shares can be snapshotted in a single operation. Without CGs, you can take 2 snapshots of 2 shares but there is no guarantee that those snapshots will represent the same point in time. CGs allow you to guarantee that the snapshots are synchronized, which makes it possible to use multiple shares together for a single application and to take snapshots of that application’s data in a consistent way.
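
As a hedged sketch of a CG workflow (names are placeholders, and the exact CLI syntax may differ between releases):

    manila cg-create --name app-cg --share-types my_type
    manila create NFS 1 --name app-data1 --consistency-group app-cg --share-type my_type
    manila create NFS 1 --name app-data2 --consistency-group app-cg --share-type my_type
    # One operation snapshots every member at the same point in time:
    manila cg-snapshot-create app-cg --name app-cg-snap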

Q. How is the consistency group in Manila different from Cinder? Is it similar?

A. The designs are very similar. There are some semantic differences in terms of how you modify the membership of the CGs, but the snapshot functionality is identical.

Q. Are you considering pNFS? I guess this will be hard since it has requirements on the client as well.

A. Manila is agnostic to the data protocol so if the backend supports pNFS and Manila is asked to create an NFS share, it may very well get a share with pNFS support. Certainly Manila supports shares with multiple export locations so that on a system with multiple network interfaces, or a clustered system, Manila will tell the clients about all of the paths to the share. In the future we may want Manila to actually know the capabilities of the backends w.r.t. what version of NFS they support so that if a user requires a minimum version we can guarantee that they get that version or get a sensible error if it’s not possible.
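
As an illustration of the multiple-export-location behavior (the share name, paths, and NFS version below are made up):

    manila share-export-location-list demo
    #   e.g. 10.0.0.5:/shares/demo and 10.0.0.6:/shares/demo
    mount -t nfs -o vers=4.1 10.0.0.6:/shares/demo /mnt/demo   # client picks a path and, for now, a version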

Q. Share Replication. In what mode, Async and/or Sync?

A. We plan to support both, and the choice of which is used will be up to the administrator. Communication about which is used and any relevant information like RPO time would be out of band from Manila. The goal of the feature in Manila is to make Manila able to configure the replication relationship, and able to initiate failovers. The intention is for planned failovers to be disruptive but with no data loss, and for unplanned failovers to be disruptive, with data loss corresponding to the RPO that the administrator configured (which would be zero for synchronous replication).
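
A hedged sketch of that workflow from the administrator’s point of view (share names, availability zones, and IDs are placeholders):

    manila share-replica-create demo --availability-zone zone-2   # configure the replication relationship
    manila share-replica-list --share-id <share-id>               # watch the replica_state
    manila share-replica-promote <replica-id>                     # initiate the failover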

Q. Can you point me to any SNIA resources available for OpenStack? Where can I download documents, videos, etc.?

A. You can find several informative OpenStack on-demand Webcasts on the SNIA BrightTalk channel here.