Securely Sharing Health Care Data across Different Cloud Services

As more and more health care providers leverage the efficiencies of the cloud, the need to share health care data across different cloud services arises. Sharing health care data across cloud services must ensure the confidentiality, integrity, and availability of the health data and preserve patient privacy, so that the data is revealed to other requestors only with patient consent.

The Cloud Data Management Interface (CDMI) international standard is a protocol that has been standardized by SNIA to create interoperable data management services in cloud storage.
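
As a concrete illustration of what interoperable data management looks like on the wire, here is a minimal sketch of reading a CDMI data object over the standard's RESTful HTTP interface. The endpoint, object path, credentials and version string are hypothetical placeholders; consult the CDMI specification and your provider's documentation for the exact headers and values your server expects.

```python
# Minimal sketch of reading a data object from a CDMI-capable cloud service.
# The endpoint, container path and credentials below are hypothetical
# placeholders, and the exact version header depends on the server.
import requests

CDMI_ENDPOINT = "https://cloud.example.com/cdmi"        # hypothetical endpoint
OBJECT_PATH = "/health_records/patient_1234.json"       # hypothetical object

headers = {
    "Accept": "application/cdmi-object",                # CDMI data-object media type
    "X-CDMI-Specification-Version": "1.0.2",            # adjust to the server's supported version
}

resp = requests.get(CDMI_ENDPOINT + OBJECT_PATH,
                    headers=headers,
                    auth=("clinic_user", "secret"))      # hypothetical credentials
resp.raise_for_status()

cdmi_object = resp.json()                               # metadata and value come back as JSON
print(cdmi_object["objectName"])
print(cdmi_object["value"])
```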

The Cloud Storage TWG has just released a technical white paper, “Towards a CDMI Health Care Profile,” that explores the capabilities of CDMI in addressing these requirements, and provides suggestions for possible extensions that are appropriate for a health care profile.

I encourage you to download this paper to learn:

  • Motivations for protecting health data
  • Health data protection requirements
  • A use case that promotes the deployment of health data protection
  • Requirements and implementation aspects of the use case
  • Use case architecture
  • Future use cases roadmap

I hope you’ll find this paper enlightening and welcome feedback and comments on its content here in this blog.

Next Webcast: What’s New in NFS 4.2

We’re excited to announce our next ESF Webcast on NFSv4.2. With NFSv4.1 implemented on several commercial NFS systems, an established Linux client and a new pNFS Linux server, NFS usage continues to grow in the IT industry. NFSv4.1, first introduced in 2010, meets many needs in the modern datacenter, but there are still technologies and advanced techniques that NFS developers want to deliver.

Join me and J Metz on April 28th at 10:00 a.m. PT as we’ll cover a brief update of where we are with NFSv4.1 and more detail on the proposed features for NFSv4.2 that are currently being ratified at the IETF. This will be a live, interactive session. Register now and please bring your questions.

If you need a primer on NFS before this event, I encourage you to check out our 4-part Webcast mini-series, available on demand.

Register today. I hope to see you on April 28th.

Ethernet Connected Drives Webcast Q&A

At our recent SNIA ESF Webcast “Visions for Ethernet Connected Drives” Chris DePuy of the Dell’Oro Group discussed potential benefits, use cases, and challenges of Ethernet connected drives. It’s not surprising that we had a lot of questions given that this market is in its infancy. As promised during our live event, here are answers to questions from the audience. If you think of additional questions, please feel free to comment on this blog.

Q. Will this also mandate new protocols to be used for storage like RDMA?

A. We did not receive any feedback from the technology companies we surveyed about RDMA specifically, but new protocols very well may be required to make effective and cost-effective use of eDrives. Storage systems offer many capabilities beyond just standard Ethernet networking and new protocols may be required to deliver those as well as new services in this new storage system architecture.

Q. Is White Box bought primarily by cloud customers?

A. Yes, in our research, substantially all White Box storage devices are purchased by cloud service providers.

Q. I may have missed it but aren’t we really talking about the HGST Open Ethernet Drive Architecture and the Seagate Kinetic Open Storage Platform? Both use Ethernet interfaces but HGST puts Debian on each HDD and Seagate has a key-value API for applications to directly write to the HDD. The actual deployment of these Ethernet HDDs would be in Ethernet Layer 2 switched backplanes in a 4U chassis being built by Supermicro, Xyratex (Seagate) and several others.

A. Given this was a presentation made to a neutral industry association, we chose not to discuss specific vendors. To answer your questions: yes, we are talking about Ethernet Connected Drives from HGST and Seagate, but we also integrated feedback from other suppliers of related technology, including Toshiba. And yes, we have seen enclosures with embedded Ethernet switch technology connecting to the Ethernet drives from various other vendors. In our research for this webinar, we have also seen Ethernet switch technology embedded into enclosures that don’t use Ethernet connected drives; these convert traditional HDD interfaces internally, so the network still sees Ethernet as the outward-facing interface.

Q. Doesn’t that take space on the drive when you put CPU and more memory?

A. We asked this question, too, but learned that there is sufficient space to maintain the HDD and all the parts in the same form factors we historically have known.

Q. What can one implement in these internal processors used in Ethernet drives? For instance, can we run erasure codes such as Jerasure or XOR-based codes and still do the basic tasks needed for the Ethernet drives?

A. We did not receive specific feedback during the surveys for this webinar about where one would run erasure coding. Generally, though, the decision will lead to design considerations for which CPU and memory choices would be made for each drive, which in turn would change the economics of whether the overall system is affordable/feasible. Note that doing erasure coding on the drives increases the amount of intelligence required on the drive: for the arithmetic, for the requisite peer-to-peer networking, and for maintaining state information about the other drives involved in completing the erasure codes. New software to manage all this would be required as well.
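
To make the arithmetic concrete, here is a minimal sketch of the simplest form of erasure coding, a single XOR parity block, in Python. Production systems (and libraries such as Jerasure) typically use Reed-Solomon codes that tolerate multiple failures; nothing below reflects any particular vendor's on-drive implementation.

```python
# Minimal sketch of single-parity (XOR-based) erasure coding, the simplest case
# of the arithmetic an Ethernet connected drive (or its peers) would perform.
# Purely illustrative; real deployments usually use Reed-Solomon codes.

def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data_blocks):
    """Compute one parity block over equally sized data blocks."""
    parity = data_blocks[0]
    for block in data_blocks[1:]:
        parity = xor_blocks(parity, block)
    return parity

def recover(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors plus parity."""
    missing = parity
    for block in surviving_blocks:
        missing = xor_blocks(missing, block)
    return missing

blocks = [b"AAAA", b"BBBB", b"CCCC"]                    # data striped across three drives
parity = encode(blocks)
assert recover([blocks[0], blocks[2]], parity) == blocks[1]   # lose drive 1, rebuild it
```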

Q. Can I run a Ceph OSD plus erasure coding based on the open source Jerasure library on the Ethernet connected drive’s internal ARM processor?

A. We did not receive specific feedback during the surveys for this webinar about where one would run erasure coding. Generally, though, the decision will lead to design considerations for which CPU and memory choices would be made for each drive, which in turn would change economics as to whether the overall system is affordable/feasible.

Q. Erasure coding is more complex compared to RAID, how do I implement erasure coding with Ethernet drives?

A. We did not receive specific feedback during the surveys for this webinar about where or how one would run erasure coding.

Q. Does the economics assume including the cost of the Ethernet Ports? If so are you assuming unmanaged or managed Ethernet ports?

A. In the slides, we portrayed a simplistic capital spending model that considered just servers and hard drives. In reality, there are many other factors that play into both CAPEX and OPEX comparisons between conventional and Ethernet Connected Drive architectures. Examples include the cost differential between using Ethernet switching versus traditional HDD interfaces and how much memory and CPU is needed to support a particular use case.

Q. How does the increased number of network ports needed influence this price equation?

A. In the slides, we portrayed a simplistic capital spending model that considered just servers and hard drives. In reality, there are many other factors that play into both CAPEX and OPEX comparisons between conventional and Ethernet Connected Drive architectures. Examples include the cost differential between using Ethernet switching versus traditional HDD interfaces, and how much memory and CPU is needed to support a particular use case.
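
For readers who want to experiment with that kind of simplistic comparison, here is a hedged sketch of the model. Every price and ratio below is an illustrative placeholder rather than a figure from our surveys, and a realistic model would add terms for switch ports, enclosures, power, cooling and software.

```python
# Hedged sketch of the simplistic capital-spending comparison discussed in the
# webcast. All prices and counts are invented placeholders; extend the model
# with switch ports, enclosures, power, cooling and software for real use.

def conventional_capex(num_drives, drive_cost, drives_per_server, server_cost):
    servers = -(-num_drives // drives_per_server)       # ceiling division
    return num_drives * drive_cost + servers * server_cost

def edrive_capex(num_drives, edrive_cost, drives_per_switch, switch_cost):
    switches = -(-num_drives // drives_per_switch)      # Ethernet switch ports replace server slots
    return num_drives * edrive_cost + switches * switch_cost

drives = 1000
print(conventional_capex(drives, drive_cost=250, drives_per_server=60, server_cost=5000))
print(edrive_capex(drives, edrive_cost=275, drives_per_switch=48, switch_cost=3000))
```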

Q. I’m confused how Power and Cooling could be saved. If you need X number of drives to store data then you would need the same number of drives in the connected drive model wouldn’t you? Perhaps more if the e-drives lack efficiency features?

A. The general point is that proponents of Ethernet Connected Drives argue there won’t be a need for storage-oriented servers, and so the savings would result from there being fewer of them consuming power.

Q. I guess the protocol would change commanding the drives?

A. There is no single approach that has been agreed upon. During the presentation, we said there are multiple technical approaches, one of which includes using Key Value APIs, and the other is to install an Operating System onto each drive that could run whatever you want on it.
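
As an illustration of the first approach, here is a hedged sketch of what a key/value interface to an Ethernet connected drive might look like from an application's point of view. The class, method names, port number and wire format are invented for this example and do not correspond to any shipping vendor API (Seagate's Kinetic API, for instance, differs in detail).

```python
# Hypothetical key/value client for an Ethernet connected drive. Everything
# about the protocol below is invented for illustration only.
import json
import socket

class KeyValueDrive:
    def __init__(self, host: str, port: int = 9000):    # made-up port number
        self.sock = socket.create_connection((host, port))
        self.reader = self.sock.makefile("r")            # line-oriented JSON replies

    def _call(self, request: dict) -> dict:
        self.sock.sendall(json.dumps(request).encode() + b"\n")
        return json.loads(self.reader.readline())

    def put(self, key: str, value: bytes) -> bool:
        return self._call({"op": "PUT", "key": key, "value": value.hex()})["ok"]

    def get(self, key: str) -> bytes:
        return bytes.fromhex(self._call({"op": "GET", "key": key})["value"])

# Usage (each drive is addressed directly by its own IP address):
# drive = KeyValueDrive("192.168.1.50")
# drive.put("sensor/2015-03-25", b"temperature=21.5")
# print(drive.get("sensor/2015-03-25"))
```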

Q. Are Ethernet connected drives JBODS on Ethernet?

A. Yes, that is the way we view it, too. Sometimes they are even called “eBODs,” where the traditional JBOD controller is replaced with an Ethernet switch.

Q. How is data protected – i.e., RAID or another mechanism?

A. In our surveys, we learned that the most common method would be to leverage erasure coding that is commonly associated with object oriented storage systems.

Q. How will photonics impact this concept?

A. Photonics is involved in data center Ethernet for higher speed communications. In our surveys, we did not encounter a single instance of a vendor discussing photonics at the Ethernet Connected Drive. For HDDs, 1GbE provides more than enough bandwidth for the drive.

Q. Are the servers today connecting the storage just dumb boxes that expose storage? Don’t they do processing as well? With Ethernet drives we’re removing that computational node it seems.

A. This is a very good point. Today’s conventional storage systems have significant computing capabilities – we think these could be used for general computing as well as for the storage-oriented tasks they primarily perform today. We expect that in the future, the servers packaged in external storage systems will be organized in a way that allows customers to run storage functions as well as more traditional workloads, to the point where we would just call them ‘servers.’ In fact, there are several startups that are popularizing this idea.

Q. When it comes to HDD manufacturers there are only three left…WD (HGST), Seagate (Samsung) and Toshiba. When it comes to SSD or flash drives there are more manufacturers. Seagate is using a dual Serial Gigabit Media Independent Interface (SGMII) on its Kinetic HDDs. What other ways are there to do Ethernet on an HDD?

A. We did not receive any feedback from the technology companies we surveyed about this topic. Note, that SNIA recently started an “Object Drive Technical Work Group” to help drive standards for Ethernet-connected drives. If this topic is of interest, we encourage you to join that TWG.

Q. Have you seen any indication of a ratio between CPU power and memory vs. the size of the storage? What is the typical White Box? E.g., Intel CPU (which version?), memory (in GB?), storage (in TB?)

A. The use cases we presented are based on vendor-supplied viewpoints that implicitly incorporate the answers to your question, but don’t specifically address it. What we learned is that in these use cases there is an assumed positive TCO savings, but not every vendor agrees with these calculations – again without providing specifics like you are asking about.

Q. How can you eliminate the object servers? You still need that functionality somewhere if you ever hope to find the data again, or protect it… You may move away from dedicated Object servers but that code has to run somewhere thus saying they are eliminated is wrong…

A. This is a very good point. The use cases offered to us suggest that this code would either reside in the Ethernet Connected Drive, or on the server running an application itself, or both. This is why we made the point that the applications would have to be re-written to take advantage of the proposed new architecture.

Q. Is the cost of Ethernet HDDs expected to be the same as current HDDs and why?
Ethernet HDDs have more processing capabilities so shouldn’t they cost more (is that 10% more?)

A. Correct. If more components were added to an otherwise identical HDD, then the cost would be greater. This is tantamount to one of the main dissenting views we learned about during the survey process. It does raise the question of whether it makes sense to deliver underlying HDDs that are NOT identical to traditional HDDs to offset costs somehow – maybe with lower speeds – or whether these Ethernet Connected Drives would be sold at lower margins by the HDD vendors.

Q. Do Server power TCO numbers take account of lower power consumption of next generation servers as indicated by Intel?

A. We do not know what version of servers was used in these vendor-supplied TCO calculations.

Q. If you are planning to offload processing to the processor on the HDD then you are assuming that the HDD vendors will expose those drives for user access – is there any evidence of this?

A. There is no single approach that has been agreed upon, and therefore no single answer to this question. During the presentation, we said there are multiple technical approaches, one of which includes using Key Value APIs, and the other is to install an Operating System onto each drive that could run whatever you want on it.

Q. How is redundancy handled on an eHDD-based appliance, e.g., when a drive fails?

A. The custom-built software would presumably be developed to handle this. And obviously, the eHDD has to add enough CPU and memory to manage all this — which of course adds cost.

Q. It seems that with the CPUs on each drive, the archive, object or whatever the application would need to be rewritten to support this specific method of parallel processing. Is anyone doing this now?

A. During the survey process, we learned that many applications were being ported to this environment, some of which apparently do take advantage of parallel computing. Given we were planning to immediately divulge information to the public, we were not presented with details.

Q. What is nearline storage?

A. “Nearline storage,” as described to us by some of the technology companies we surveyed, refers to a more traditional storage system you might see in an enterprise, where many drives are stopped (not rotating) and are spun up when a request comes in.

Q. Why are analytics specifically optimized for Ethernet attached storage devices – the presenter seems to anticipate that processing can be pushed onto the drive, and if this is the case why can’t other drive interfaces do this – PCIe attached storage should be even more amenable for this.

A. The presenter was sharing views compiled by the responses of various technology companies during a series of interviews conducted before this webcast. Analytics is a large, growing industry today and exists without Ethernet Connected Drives. Some of the companies surveyed offered the view that putting processing capabilities into each HDD may enhance the overall system’s performance.

Q. Can the presenter comment on the value of scale-out for E-Drives, versus legacy SAN scale out?

A. Some of the technology companies interviewed by the presenter suggested that systems based on Ethernet Connected Drives may scale to larger capacities than traditional architectures on the basis that the storage-oriented servers no longer present an impediment to scaling.

Q. Just as object storage addresses RAID, smart drives could provide the metadata needed by the Swift controllers to do deduplication, or the controller may do deduplication as a pre-process or post-process, as we have seen NetApp and Data Domain evolve over the years.
If we use optic connections, the port density issue is resolved and this ends up looking like something from 2001 (the movie), correct?

A. Photonics is involved in data center Ethernet for higher speed communications. In our surveys, we did not encounter a single instance of a vendor discussing photonics at the Ethernet Connected Drive. As noted above, 1GbE is more than sufficient for eHDDs.

Q. FYI…48TB Capacity Kinetic Storage Appliance $5000.00 street price
White Box 2U Dual Xeon storage server with 48TB RAW…$8000 street price

A. Thank you for sharing! You may have noticed we did not mention specific vendors during the presentation – perhaps others viewing your question will take note of your viewpoint.

Q. To the extent that hyperscale cloud environments have servers with open sockets or slots for direct attach storage of drives, how are there financial savings to connect through Ethernet instead of direct attach? Will servers of the future remove these slots and sockets? Are there other cluster wide benefits with regards to performance for data accessed directly through the network instead of through the server with the local storage, when the data is accessed by a large number of servers?

A. Hyperscalers are buying storage-related hardware at a fraction of the price that systems OEMs are selling them for mainly because they do not demand software that enterprises value so much – they leverage open source and make their own for their very specific needs. If you look at the slide about the ‘White Box Effect’ in the presentation, you get a sense for just how much less they pay – or anyone else who buys a White Box pays – but make no mistake about it, these devices don’t do much unless you integrate them into a working system intended to store and safely retain data. To answer your question, we observe that these hyperscalers are such large customers of components and systems that they could choose to request custom hardware designs with customized specifications – more of this kind of interface, fewer of that kind, etc. As an analogy, in the networking industry, one of the largest buyers of the underlying network technology like processors, Ethernet interfaces and optics are the handful of hyperscalers – and in fact these customers are larger than most vendors.

Q. Why would each drive not know about other drives storage? How does this differ from existing storage servers?

A. In the traditional storage architecture, a central system is involved. The dissenting viewpoint we received from some of the technology companies we interviewed is a counterpoint that may apply only under certain design scenarios. Our view is that if a system is designed with the goal of making each drive aware of the other drives’ contents, then that is technically possible, of course – but at a cost, since you add CPU, memory, and software to do it.

Q. I can see flash and Wi-Fi Ethernet connected drives providing Internet of Things storage for values that can be harvested independent of when the value was stored. Getting a low-power system that could live off of USB-type power or Power over Ethernet would be why corporations look at this.

A. I think the point you are making is that flash consumes very little power, right? This revolutionary technology (let’s just say non-volatile memory, to keep it general) is causing all kinds of disruptive changes in the storage industry, and as costs come down for NVM, all kinds of different scenarios become possible.

Q. The cost model might need to include a simpler, lower-cost local server with the Ethernet drive clusters, by adding a cost item to the left side of the equation. Comments?

A. Agreed – the equation we provided was simplistic and could be expanded to include many other terms and other simultaneous equations as well. We just thought that providing it would frame the discussion on the slide instead of just saying it verbally.

Q. Obviously, it will be higher, but how do you envision this changing Ethernet bandwidth requirements? Will Ethernet connected drives only become a reality once 40, 25, 100 Gb becomes the mainstream for Ethernet networks?

A. Network bandwidth needs will be a function of how the servers interact with the drives – I can see scenarios where traffic is kept more local (for example, asking each drive for ‘the answer’ instead of ‘all of its data’ to be processed in a server), and others that might bear out your premise that traffic increases. The point I’m getting to is that it depends on what applications these Ethernet Connected Drives are used for. Nevertheless, the old adage that all available installed bandwidth will be consumed has not yet been repealed, as far as I’m aware.

Q. With Ethernet connected drives, are we still stuck with the fundamental issue that HDDs are transactionally inefficient? In other words, while this is a novel concept, doesn’t the basic drive remain the bottleneck unless its transactional efficiency is improved?

A. We think HDDs will co-exist with Flash/NVM for a very long time. Some very smart engineers are working to make this co-existence increasingly efficient, taking into account the strengths and weaknesses of both storage media.

SNIA ESF Leadership Welcomes Chad Hintz

The ESF continues our busy schedule hosting informative Webcasts, writing and publishing articles and participating at industry conferences. 2015 also brings a change in ESF leadership. I’d like to welcome Chad Hintz of Cisco as our newest ESF board member. Chad has been elected as chair of our Storage over Ethernet Special Interest Group (SIG).

The Storage over Ethernet SIG is focused on a growing trend among modern data centers to deploy consolidated Ethernet networks as the primary network infrastructure for all LAN and storage traffic. Technologies such as Data Center Bridging (DCB), Fibre Channel over Ethernet (FCoE) among others, offer organizations a robust environment to support mixed workloads with each using the most appropriate protocol (including NFS, SMB and iSCSI) over a shared Ethernet physical transport. The Storage over Ethernet SIG offers educational and thought leadership materials related to these technologies and the business value they offer to organizations of all sizes. Especially appreciated by our audience is that this information comes from SNIA and thus is vendor-neutral.

Chad brings a wealth of expertise to ESF. He is a Technical Solutions Architect focusing on designing enterprise solutions for customers around data center technologies. He holds 3 CCIEs in routing and switching, security and storage and has held certifications from Novell, VMware, and Cisco. We’re confident his expertise and passion for Ethernet Storage will be a big asset to our group.

We are looking forward to having Chad on our team to help guide the many activities we have planned this year. Other members of the 2015 board include myself as ESF chair, Alex McDonald (NetApp), and Mike Jochimsen (Emulex).

As I mentioned, the ESF is busy creating vendor-neutral educational content on Ethernet-connected storage networking technologies. I encourage you to check out some of our recent and upcoming content:

Upcoming Webcast – Visions for Ethernet Connected Drives

On-demand Webcast – Benefits of RDMA in Accelerating Ethernet Storage Connectivity

Article – Cloud File Services

Article – Weave Your Cloud with a Data Fabric

Hybrid Clouds Webcast Preview

On March 18th, SNIA-CSI will be hosting a live Webcast “Hybrid Clouds: Bridging Private and Public Cloud Infrastructures.”

Every IT consumer is using (or is planning to use) cloud in one form or another. The emphasis on the design and implementation of cloud architectures is often made without consideration of where the cloud storage and compute should be located and the benefits, costs and risks of deciding where the applications will run. Will it be a public cloud? Or a private cloud in the data center or co-location site? Or a hybrid of the two?

This session will be an overview of developing and delivering a cloud architecture, with a focus on getting the overall goals correctly specified and defined, understanding the issues that must be addressed, and then deciding whether the application is suitable for a public, private or hybrid cloud before undertaking implementation. We’ll also focus on one of the most difficult aspects of the solution, the management of data and storage in the cloud, and present a case study of a successful commercial implementation.

Register now for this live event. I hope you’ll join Alex McDonald and me for what we hope will be an informative and interactive event.

Benefits of RDMA in Accelerating Ethernet Storage Q&A

At our recent live Webcast “Benefits of RDMA in Accelerating Ethernet Storage Connectivity,” experts from Emulex, Intel and Microsoft had an insightful discussion on the ways RDMA is having an impact on Ethernet storage. The live event was attended by nearly 200 people, and feedback was overwhelmingly positive, with several attendees thanking us for our vendor-neutral presentation and one attendee commenting that it was “Probably the most clearly comprehensible yet comprehensive webinar I’ve attended in some time.” If you missed the Webcast, it’s now available on demand. We did not have time to get to everyone’s questions, so as promised, below are answers to all of them. If you have additional questions, please ask them in the comments section of this blog and we’ll get back to you as soon as possible.

Q. Is RDMA over RoCEv2 in production?

A. The IBTA released the RoCEv2 specification in September 2014. In order to support that specification, changes may be required across the RDMA stack, including firmware, drivers and operating systems. Schedules for implementation of that specification will vary by operating system. For example, the OpenFabrics Alliance (OFA) has not yet released an OpenFabrics Enterprise Distribution (OFED) version that implements the standard, although one is in process now. Once OFA completes its OFED stack implementation, the Linux distribution vendors will then incorporate and support the updated OFED stack. Implementations provided prior to full OFA and distro vendor support would be preliminary, potentially incompatible with the OFED release, and would require confirmation from the distro vendor regarding the nature and level of support they would be providing.

Q. I would have liked a list of Windows applications that take advantage of SMB Direct – both in a Hyper-V host or bare metal.

A. In Windows, any file-based application can make use of SMB3 and SMB Direct due to the native file-based programming interface support. No application changes are required. For certain enterprise applications such as Hyper-V and SQL Server, SMB3 is officially supported, and more information can be found in the product catalog at www.microsoft.com.
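
To illustrate why no application changes are needed, the sketch below uses nothing but the ordinary file API against an SMB3 share; the OS negotiates SMB3 and, where RDMA-capable NICs are present on both ends, SMB Direct, without the application being aware. The UNC path is a hypothetical placeholder.

```python
# Ordinary file I/O against an SMB3 share. The OS handles SMB3 negotiation and,
# where both ends have RDMA-capable NICs, SMB Direct, with no change to this
# code. The UNC path below is a hypothetical placeholder.
from pathlib import Path

share = Path(r"\\fileserver\vmstore")          # hypothetical SMB3 share
(share / "note.txt").write_text("written over SMB3, transported however the OS negotiates\n")
print((share / "note.txt").read_text())
```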

Q. Are there any particular benefits in using one network protocol over another for SMB Direct/RDMA (iWARP vs. RoCE vs. IB)?

A. There are no hard and fast rules; any adapter or protocol can be suitable for many scenarios. Of the Ethernet-based protocols we considered in today’s webcast:

  • iWARP offers the benefit of operation over TCP with its reliability and routability, well-suited to a broad range of installed infrastructure.
  • RoCE offers a lightweight, efficient protocol when a DCB-enabled switched fabric is available. RoCE, however, is not routable.
  • RoCEv2 offers similar properties to RoCE, with the possibility to scale to larger routed and DCB-enabled fabrics.

Q. Who are the vendors offering iWARP capable RNICs?

A. Chelsio Communications has production iWARP adapters today, and both Intel and QLogic have publicly committed to future iWARP controllers.

Q. How much testing has been done with SMB3, and in particular SMB direct, over WAN connections?

A. The SMB2 protocol was originally designed to adapt to WAN scenarios, and supports a credit-based management of large amounts of data to be outstanding, to make best use of WAN-type long pipes. The SMB3 protocol retains these design attributes, and the SMB Direct protocol also supports similar deep pipelining. The iWARP protocol, being layered on standard TCP, is well suited to such deployments, and RoCE WAN adapters are potentially available. Please contact the respective technology vendors for information on any available testing results.
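
As a rough illustration of the credit mechanism mentioned above, the sketch below models a client that only issues requests while it holds credits granted by the server, which is how a long pipe can be kept full without overrunning the receiver. It is a conceptual model only, not the actual SMB2/SMB3 credit accounting or wire protocol.

```python
# Conceptual sketch of credit-based flow control of the kind SMB2/SMB3 uses to
# keep many operations in flight over long (WAN) pipes. Illustrative only; this
# is not the SMB wire protocol or its real credit rules.
from collections import deque

class CreditedClient:
    def __init__(self, initial_credits: int):
        self.credits = initial_credits
        self.pending = deque()

    def submit(self, request):
        self.pending.append(request)
        self.pump()

    def pump(self):
        # Issue queued requests only while credits remain.
        while self.credits > 0 and self.pending:
            self.credits -= 1
            request = self.pending.popleft()
            print(f"sent {request}, credits left {self.credits}")

    def on_response(self, credits_granted: int):
        # Each response can grant new credits, letting more requests flow.
        self.credits += credits_granted
        self.pump()

client = CreditedClient(initial_credits=2)
for i in range(4):
    client.submit(f"READ chunk {i}")
client.on_response(credits_granted=2)
```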

Q. I’d love a future webcast on RDMA-enabled distributed filesystems.

A. Thanks for the suggestion! We’re always looking for ideas for future webcasts and SNIA-ESF will consider this as a potential follow-on.

Q. Is Live Migration the scenario where “packet size” is 1MB?

A. All SMB Direct scenarios have workloads that range anywhere up to 8MB. For large file copies, most SMB3 clients request from 1MB to 8MB per operation; for Hyper-V live migration, transfers are typically similar during the bulk transfer phase.

Q. SMB3 is being compared to FC for enterprise. If Ethernet based protocols are of interest, wouldn’t FCoE give the same performance as FC (same stack) vs. SMB3?

A. SMB3 with SMB Direct enables many workloads not possible with Fibre Channel over Ethernet, and performance comparisons are therefore difficult. Perhaps another SNIA webcast could investigate this!

Q. Regarding your SMB direct example with lots of small operations, how do you deal with the overhead of registering and unregistering buffers for the RDMA operations?

A. As answered later in the session, the registration and unregistration is not a protocol matter, but in the case of the Windows implementation, it is strictly performed for the specific buffers of each operation, which is critical for security, data integrity, and system protection. The standard “Fast Register Work Request” method is used, and careful implementation has shown that the overhead does not negatively impact performance, even for small I/O (4KB/operation). Check out Jose Barreto’s blog, which contains many benchmark results.

Q. But isn’t Live Migration done in 1MB “chunks”? So not “small” I/Os?

A. As answered later in the session, Hyper-V Live Migration is done in several phases: the first phase is the initial bulk copy of memory, done in large chunks, but immediately after it comes a second phase that copies the individual pages dirtied by the live-running VM. These operations are typically 4KB. Note: the faster the initial phase goes, the less work there is in the second phase, but in both phases, the faster the better, and RDMA accelerates both.

Q. Are iSER and iWARP alternatives to one another?

A. iWARP is an RDMA protocol, and iSER is a mapping of iSCSI onto RDMA transports, including iWARP as well as RoCE/InfiniBand – so they are complementary rather than alternatives.

Q. What’s Intel’s roadmap for RoCE and/or iWARP?

A. Intel is committed to iWARP and plans to incorporate it in future server chipsets and SOCs. See http://www.intel.com/content/www/us/en/ethernet-products/accelerating-ethernet-iwarp-video.html for more information.

Q. Is there any other transport being used, other than IB, to create a reliable transport for RoCEv2? In principle, is it possible?

A. RoCE was developed to leverage InfiniBand as much as possible. For that reason, the InfiniBand transport was chosen when the RoCE standard was developed. As the RoCEv2 standard was developed, the underlying InfiniBand network protocol was replaced with IPv4/IPv6 in order to provide Layer 3 routability, and UDP was added to provide stateless encapsulation (and indication) of the InfiniBand transport header that was retained. While it may be possible to develop a reliable transport to replace InfiniBand, the RoCE standards body has elected not to go that route as of this writing.
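
For readers who want a picture of the resulting encapsulation, the sketch below lists the RoCEv2 layering described above. Only the layering and the well-known UDP destination port are shown; field widths and header details are deliberately omitted.

```python
# Conceptual sketch of RoCEv2 encapsulation as described above: the InfiniBand
# transport header and payload are retained, while the InfiniBand network
# layer is replaced by IP (v4 or v6) plus a stateless UDP wrapper.
ROCEV2_LAYERS = [
    ("Ethernet", "ordinary L2 framing; a DCB/lossless traffic class is commonly used"),
    ("IPv4 or IPv6", "provides the Layer 3 routability that RoCE v1 lacked"),
    ("UDP", "stateless encapsulation; destination port 4791 indicates RoCEv2"),
    ("InfiniBand transport headers (BTH, ...)", "retained from the InfiniBand/RoCE v1 design"),
    ("Payload + ICRC", "application data plus the retained end-to-end CRC"),
]

for layer, note in ROCEV2_LAYERS:
    print(f"{layer:42s} {note}")
```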

SNIA CSI Welcomes Glyn Bowden

At our annual SNIA Members’ Symposium in San Jose, the Cloud Storage Initiative (CSI) elected our 2015 CSI board. I’d like to officially welcome our newest board member, Glyn Bowden from HP. HP now joins our growing list of member companies.

The CSI is committed to the adoption, growth and standardization of cloud storage and related cloud data services to promote interoperability and portability of data stored in the cloud. CSI leads as an industry-neutral authority on cloud storage environments and is committed to educating vendor and end user communities on cloud storage & industry standardization benefits.

It’s only the beginning of March and we’ve already hosted several educational Webcasts on topics ranging from OpenStack Cloud Storage and OpenStack Manila, to CDMI and the LTFS Bulk Transfer Standard. All CSI Webcasts are available on-demand. I encourage you to check them out.

LTFS Bulk Transfer Standard Q&A

Our recent live SNIA Cloud Webcast “LTFS Bulk Transfer Standard” is now available on-demand. Thanks to all the folks who attended the live event. We did not have time to address all of the questions, so here are answers to them. If you think of additional questions, please feel free to comment on this blog.

Q. The LTFS standard seems to support shared extents between files, and by extension, deduplicated files. Is this a correct assessment, and how does it play in the bulk transfer standard?

A. The LTFS Bulk Transfer Standard supports shared extents as defined by the LTFS standard, which can transparently reduce the space used by allowing multiple references to common data stored on tape (deduplication). This typically happens below the bulk transfer layer, in the software used to read and write the LTFS volumes. At this point, few software packages support the feature because of the wear and latency consequences of the read seeks it causes.
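
To illustrate the idea, the sketch below models two files in an LTFS-style index whose extent lists reference the same region of tape, so the common data is stored only once. The field names are simplified for illustration and are not the exact LTFS index schema.

```python
# Conceptual model of shared extents: two files whose extent lists reference
# the same blocks on tape, so the common data is stored once. Field names are
# simplified and do not reproduce the actual LTFS index schema.
from dataclasses import dataclass, field
from typing import List

@dataclass(frozen=True)
class Extent:
    start_block: int     # where the data begins on tape
    byte_count: int      # length of the extent
    file_offset: int     # where it maps into the file

@dataclass
class FileEntry:
    name: str
    extents: List[Extent] = field(default_factory=list)

common = Extent(start_block=1200, byte_count=4 * 1024 * 1024, file_offset=0)

original = FileEntry("scan_2015_03.dcm", [common])
duplicate = FileEntry("scan_2015_03_copy.dcm", [common])   # shares the same tape blocks

tape_bytes = {e.start_block: e.byte_count for f in (original, duplicate) for e in f.extents}
print("bytes stored on tape:", sum(tape_bytes.values()))   # counted once, not twice
```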

Q. What is the state of the standard in its lifecycle? (e.g., working group draft, public review, published, etc.)

A. The LTFS standard has been around for some time; more information can be found at http://www.snia.org/tech_activities/standards/curr_standards/ltfs. The LTFS Bulk Transfer Standard is at http://www.snia.org/tech_activities/publicreview#ltfsbulk and is in public review.

Q. The standard seems to be based on the idea of moving physical tapes to the cloud. Is there a definition of a virtual LTFS image that can be moved between systems over the network?

A. Not yet, but that is a great idea we’ll be taking forward in the next versions of the proposal.

Q. One of the barriers to greater use of LTFS in the Cloud is the relative lack of enterprise grade management software that ensures that the tape media is refreshed / upgraded as it ages, that its integrity is periodically checked, that reclamation and compaction is done. It needs open standards for support in standard volume management systems as well. Until these things are in place, LTFS will be interesting largely to specialized industries like film/entertainment, seismic, and bulk transfer & bulk storage — but not about the steady-state use of tape as a true additional layer of the cloud storage hierarchy. Tape with LTFS plus proper management could fill this role — but not until the full lifecycle tape management is available and integrated.

A. The management that is always required for a physical product with a well-defined and finite lifetime is not a unique requirement of LTFS. Tape has a long history of use as a backup and archive medium, and there are a number of tape management products that are commercially available from LTO tape suppliers and independent software companies, as well as open source products. A Google search for “tape management software” will provide you with a number of alternative solutions.

Q. Do you have a list or people that sell LTFS based solutions?

A. No we don’t, but it’s a very good idea, and we’ll investigate it further.

New Webcast: Visions For Ethernet Connected Drives

Mark your calendar for March 25th as SNIA-ESF, together with the Dell’Oro Group, will be hosting a live Webcast, “Visions for Ethernet Connected Drives.” The arrival of mass-storage services, the emergence of analytics applications and the adoption of object storage by the cloud-services industry have provided an impetus for new storage hardware architectures. One such underlying hardware technology is the Ethernet connected hard drive, which is in early stages of availability.

Please join us on March 25th to hear Chris DePuy, Vice President of Dell’Oro Group share findings from interviews with storage-related companies, including those selling hard drives, semiconductors, peripherals and systems, as he will present some common themes uncovered, including:

  • What system-level architectural changes may be needed to support Ethernet connected drives
  • What capabilities may emerge as a result of the availability of these new drives
  • What part of the value chain spends the time and money to package working solutions

We will also present some revenue and unit statistics about the storage systems and hard drive markets and will discuss potential market scenarios that may unfold as a result of the object storage and Ethernet connected drive trends.

I’ll be hosting the event and together with Chris, taking your questions. I hope you’ll join us.

New SNIA-CSI Webcast: LTFS Bulk Transfer Standard

Mark your calendar for February 10th as we conclude our Cloud Developer’s series by hosting a live Webcast on the LTFS Bulk Transfer Standard. LTFS (Linear Tape File System) technology provides compelling economics for the bulk transportation of data between enterprise and cloud storage.

This Webcast will provide an update on the joint work of the LTFS and Cloud Technical Working Groups on a bulk transfer standard that uses LTFS to allow for the reliable movement of bulk data in and out of the cloud, and mechanisms for verification, error handling and the management of namespaces. Register now to hear David Slik, Co-Chair of the SNIA Cloud Storage Technical Work Group, discuss:

  • LTFS standard mandate and history
  • LTFS adoption and use cases
  • LTFS bulk transfer to, from, and between clouds
  • Error handling and recovery
  • Security considerations

I’ll be hosting the event, taking your questions, and hopefully shedding some light on the importance of this standard. I hope you’ll join us.