Evaluator Group to Share Hybrid Cloud Research

In a recent survey of enterprise hybrid cloud users, the Evaluator Group saw that nearly 60% of respondents indicated that lack of interoperability is a significant technology issue that they must overcome in order to move forward. In fact, lack of interoperability was the number one issue, surpassing public cloud security and network security as significant inhibitors.

The SNIA Cloud Storage Initiative (CSI) is pleased to have John Webster, Senior Partner at Evaluator Group, who will join us on December 12th for a live webcast to dive into the findings of their research. In this webcast, Multi-Cloud Storage: Addressing the Need for Portability and Interoperability, my SNIA Cloud colleague, Mark Carlson, and John will discuss enterprise hybrid cloud objectives and barriers to adoption. John and Mark will focus on cloud interoperability within the storage domain and the CSI’s work that promotes interoperability and portability of data stored in the cloud.

As moderator of this webcast, I’ll make sure we offer great insights on real-world cloud deployment challenges. As always, we will be available to answer your questions on the spot. I encourage you to register today. We hope to see you on the 12th.

 

Expert Answers to Cloud Object Storage and Gateways Questions

In our most recent SNIA Cloud webcast, “Cloud Object Storage and the Use of Gateways,” we discussed market trends toward the adoption of object storage and the use of gateways to execute on a cloud strategy.  If you missed the live event, it’s now available on-demand together with the webcast slides. There were many good questions at the live event and our expert, Dan Albright, has graciously answered them in this blog. Read More

How Gateways Benefit Cloud Object Storage

The use of cloud object storage is ramping up sharply especially in the public cloud, where its simplicity can significantly reduce capital budgets and operating expenses. And while it makes good economic sense, enterprises are challenged with legacy applications that do not support standard protocols to move data to and from the cloud. That’s why the SNIA Cloud Storage Initiative is hosting a live webcast on September 26th, “Cloud Object Storage and the Use of Gateways.” Read More

Security and Privacy in the Cloud

When it comes to the cloud, security is always a topic for discussion. Standards organizations like SNIA are in the vanguard of describing cloud concepts and usage, and (as you might expect) are leading on how and where security fits in this new world of dispersed and publicly stored and managed data. On July 20th, the SNIA Cloud Storage Initiative is hosting a live webcast “The State of Cloud Security.” In this webcast, I will be joined by SNIA experts Eric Hibbard and Mark Carlson who will take us through a discussion of existing cloud and emerging technologies, such as the Internet of Things (IoT), Analytics & Big Data, and more, and explain how we’re describing and solving the significant security concerns these technologies are creating. They will discuss Read More

Object Drives Now Have a Management Standard

The growing popularity of object-based storage has resulted in the development of Ethernet-connected storage devices, also referred to as IP-Based Drives, that support object interfaces, and in some cases the ability to run applications on the drives themselves. These scale-out storage nodes consist of relatively inexpensive drive-sized enclosures with IP network connectivity, CPU, memory and storage. Read More

Containers, Docker and Storage – An Expert Q&A

Containers continue to be a hot topic today as is evidenced by the more than 2,000 people who have already viewed our SNIA Cloud webcasts, “Intro to Containers, Container Storage and Docker“ and “Containers: Best Practices and Data Management Services.” In this blog, our experts, Keith Hudgins of Docker and Andrew Sullivan of NetApp, address questions from our most recent live event.

Q. What is the major challenge for storage in containerized environment?

A. Containers move fast. Users can spin up and spin down containers extremely quickly. The biggest challenge in production-bound container environments is simply keeping up with the movement of data.

Docker Engine does not delete base container images when the container is shut down. Likewise, Registry assumes you’ve got unlimited storage on hand. For containers that push frequent revisions (as would be the case in a continuous delivery environment), that leads to a lot of orphaned container images that can fill up all available storage if left unchecked.

There are some community-led scripts that will help to keep things in control. That’s the beauty of community-led technology.

Q. What about the speed of retrieving the data from storage?

A. That’s where being a solid storage architect comes in. Every storage system has different strengths and weaknesses, so it’s important to engineer your solution to fit your performance goals. Docker containers are running on the main kernel of the host system. IO is not constrained by abstraction, as in the case of virtual machines. Rather, it is constrained more by density – hundreds of containers on a host can push massive IOPS, so you want your pipes fat and data sources close to the host systems.

Q. Can you expand on moving Docker Volumes from On-Premise bare metal to Cloud Service Providers? Data Migration? Encryption? 

A. None of these capabilities are built-in to Docker Engine. We rely on external storage systems to provide those features. Private-to-cloud replication is primarily a feature of software-based companies, like Portworx, Blockbridge, or Hedvig. Encryption and migration are both common features across other companies as well. Flocker from ClusterHQ is a service broker system that provides many bolt-on features for storage systems they support. You can also use community-supplied services like Ceph to get you there.

Q. Are you familiar with “Flocker” that apparently is able to copy persistent data to another container? Can share your thoughts?

A. Yes. ClusterHQ (makers of Flocker) provide an API broker that sits between storage engines and Docker (and other dynamic infrastructure providers, like OpenStack), and they also provide some bolt-on features like replication and encryption.

Q. Is there any sort of feature in the volume plugins that allows a persistent volume to re-connect to a container if the container is moved across multiple hosts?

A. There’s no feature in plugins to cover that specifically. The plugin API is very simple. In practice, what you would do is write your plugin to expose volumes to Docker Engine on every host that it’s possible to mount that volume. In your container specification, whether it’s a Compose file, DAB file, or what have you, specify the name of your volume. Wherever that unique name is encountered, it will be mounted and attached to the container when it’s re-launched.

If you have more questions on containers, Docker and storage, check out our first Q&A blog: Containers: No Shortage of Interest or Questions.

I also encourage you to join our Containers opt-in email list. It will be a good way to keep up with all the SNIA Cloud is doing on this important technology.

Containers: No Shortage of Interest or Questions

Based on record-breaking registration and attendance at our recent SNIA Cloud webcast, Intro to Containers, Container Storage and Docker, It’s clear that containers is a hot topic that folks want to learn more about – especially from a vendor-neutral authority like SNIA. If you missed the live event, it’s now available on-demand together with the webcast slides.

We were bombarded with questions at the live webcast and we ran out of time before we could answer them all, so as promised, here are answers from our expert presenters, Chad Thibodeau and Keith Hudgins. Oh, and please don’t forget to register for part two of this webcast, Containers: Best Practices and Data Management Services, on December 7, 2016.

Q: Would you please highlight key challenges a company may face to move from a hypervisor to container environment?

CT: The main challenge that gets raised in moving from virtual machines to containers is around security as when deployed on bare-metal, all of the containers share the core operating system. However, there are arguments that containers can still be effectively isolated.

KH: Primarily paring down your applications to their minimum running requirements. This can be quite difficult with long-entrenched legacy applications!

Q: With a VM you allocate a finite amount of vCPU and RAM, with a high degree of confidence that those resources will be available to whatever workload is running in the VM. Is that also true of containers – does (or can) the workload get a guaranteed allocation of CPU and memory resources?

CT: Keith, I’ll let you address this one from the application microservice; from the storage side, an SLA or Quality-of-Service can be defined for a container volume if the storage provider offers this capability.

KH: By default, you don’t allocate CPU or ram availability. Most containers are small enough that it’s not a consideration. However, if you need to specify priority, we have a method to do that. Please review the docs here.

Q: Where are microservices most useful? Are there certain environments where they are more likely to be deployed & which verticals or type of solutions/apps will see more benefit?

CT: Microservices can apply to applications within most verticals; for financials it was mentioned that Goldman Sachs is planning on containerizing 90% of their existing applications to web-service such as Netflix. Some of the determining factors are whether the application(s) would benefit from what container technology provides such as rapid deployment, lightweight, portability, and the ability to scale beyond typical monolithic applications.

KH: Microservices are most useful with network-facing applications that don’t require heavy transactional control. Note that it *is* possible to build transactional microservices, but the best practices on that route hasn’t been optimized yet.

Q: What OS version / Hypervisor, support containerization, are working towards cutting the “noisy neighbor” issue?

CT: Containers are supported by both MS Windows and Linux operating systems. The specific version of Linux OS will be more dependent upon the level of capabilities included (Keith, more your area) and MS Windows Server 2016 is the first release of Windows with container (Docker) support.

KH: Docker supports running containers under Windows and Linux kernels. We don’t care whether it’s on metal or virtualized. It’s possible to set affinity groups in a production Docker installation to help manage noisy neighbor issues, but note that fundamentally Docker is NOT a multi-tenant system.

Q: What is “stateful database”? How does it differ from regular databases?

CT: Most databases are stateful such as Oracle, MySQL, Cassandra, MongoDB or Redis. The confusion may be around the Gartner quote which stated “Stateful Database Applications” in which they simply meant that databases are examples of stateful applications.

KH: Any database is by definition stateful. A “stateless” container is one that is running a process that doesn’t store persistent data to disk. This could be a caching system, web application server, load balancer, queue runner, or pretty much any component that doesn’t need to store data. Everything else is “stateful” and needs some way to shove that data into a reliable datastore.

Q: What factors should be considered when choosing between containers and virtual servers for a given project/use case?

CT: The driving factors for container deployments are: portability, minimal footprint (low overhead since no hypervisor or guest OS), rapid provisioning and de-commissioning, scalability and largely open-source based. If any (or all) of these are deemed valuable to you, then you should consider container deployment technology.

KH: That’s a very broad question! It helps to understand that a container is simply a wrapper around one process that is running on a container host. So it’d be one database service, or one web app server, for example. If you can break up your application into a bunch of these single processes and chain those processes together via networking (DB serves data through the network layer to your cache, which supplies the web app, which is behind the load balancer, etc) then it’s a great candidate for containerization.

Q: Sounds like a lightweight hypervisor?

CT: Containers have been compared to virtual machines as a “lightweight VM”; however, there are distinctions mostly around the fact that the hardware resources are not virtualized for containers, but rather the application is abstracted.

KH: That’s not a bad way to start thinking about it. However, you don’t have a second kernel underneath the hypervisor, so there’s no hardware abstraction. Also, in general you don’t want to run a full OS stack per container, just what you need for the application. That way your containers are lean and efficient.

Q: So is there a practical limit to the number of users you need to have for an app in order for containers/microservices to be preferable vs. traditional apps?

CT: Not necessarily. It is more about what you are trying to achieve with the application and the requirements you have around things like: platform agnostic, portability, ease-of-deployment, scalability, etc. But I wouldn’t put a hard number on when containers make more sense over virtual machines or bare-metal deployments for that matter.

KH: Nope! Microservices is far more about making it easier to build and maintain your applications than it is about scaling. Like anything, you can over-abstract your application design and go extra silly with it, but it’s fundamentally about a better way of managing your applications’ lifecycle than it is about how many users you can push through the pipe.

Q: So is graph and memory the same thing?

CT: Keith, I’ll let you address this one.

KH: Nope. Graph refers to our copy-on-write storage for images at runtime. Our docs can explain it way better than I can in a Q&A session. Look here for more info.

Q: Similar to the Docker Container Networking, are there any specific efforts going on around Docker Storage? For example, are you (or will you be) building any products to support features that you mentioned (such as ‘Storage vMotion’ like capabilities)?

CT: Keith, I’ll let you address this one. However, there are initiatives and activities on the storage side around providing vMotion like capabilities for the data and application state.

KH: It’s always a possibility. There’s nothing I can say right now, but stay tuned.

Q: Let me shift gear, here, where does containerization work with NFV, and how should one correlate to the ask of Telco provider(s)?

CT: Keith, I’ll let you address this one–should be right up your alley.

KH: While this webinar is fundamentally about storage technologies, Docker does have a very broad ecosystem of network partners. NFV is a very broad topic and can’t easily be covered in one quick bite, but there are definitely efforts using Docker as both an enablement technology for NFV, as well as integrating Docker’s built-in networking capabilities in an NFV scope for application delivery.

Q: It would be helpful to circle back at the end and summarize what is Open Source and what is a commercial product, I’m trying to grasp what you miss out on by staying just Open Source. I know that excludes the Universal Control Plane but I don’t yet see what UCP delivers, what its USP is.

CT: Keith, I’ll let you address this one–should be right up your alley.

KH: UCP is the only unique commercial component of Docker. It combines a web-based GUI with role-based access control (RBAC) to make it easier to control security and access to Docker components in a production environment. We do maintain a separate codebase for our commercially supported Engine and Registry, but that’s mainly done to maintain a more stable release, with critical patches backported from the upstream open source projects. Fundamentally, CS Engine and DTR are the same product as their open source upstreams, only on a slower, more stable release cycle. Click here for an overview, and links to some more detailed information on what’s involved in our commercial products:

Q: Is there demand for concurrent access, across container hosts, to persistent data? If so, what are those use case scenarios?

CT: Yes, actually if you think about a micro-service architecture, you will most likely have many containers accessing a common set of container data volumes simultaneously. This is exactly the reason for persistent storage–if the containers running the application services get migrated or moved to other physical nodes, they need to maintain access to their respective container data volumes in many cases.

KH: What a great question! That demand is small, but there. In most cases, persistent data is maintained concurrently through clustering processes (database replication, object storage, etc) but there are some edge cases for large file processing (rendering, big data needs) where there are some asks for that capability.

Q: is it possible to run windows applications on Linux container and vice versa?

CT: To my knowledge, you should be able to run Linux applications within the recently announced Windows Server 2016 supported containers (see article here), but you can’t run Windows applications within Linux containers.

KH: No. A container is essentially a process running in a named concurrency group under the kernel. Therefore, you need a Linux kernel to run Linux processes, and likewise for Windows. It will be possible to run Windows and Linux containers under the same management umbrella very soon. We’re waiting on some network features to roll out in Windows Server 2016 SP1 for that capability.

Q: is it a good idea to run legacy apps in a container? Exactly what is the relation between microservices and containers? Container-like technology used to be popular in various UNIX OSes. What is different now? is the best choice a microservice in a container & spin multiple instantiations fast?

CT: Keith, I’ll let you address this one–should be right up your alley.

KH: Container technology is still popular in several UNIX OSes. Under the hood, a Docker Linux container isn’t very different from a Solaris Zone. The difference is primarily the lifecycle tools to build and maintain your containers from both the developer and operations sides. The newer generation of container runtimes is simply much easier to use than older methods. From a Docker perspective: Docker Hub, the ease of use of the ‘Docker’ CLI command tools, and clustering capabilities in Engine are the main differences. As always, design your architecture to fit your team, user, and application needs. However, if you do want to use a microservices approach, maintaining each part of your application stack as a suite of microservices does make running them widely parallel a strong approach.

Q: Is a micro service self-contained with respect to data requirements. Can a service that depends on an external datasource be a micro service?

CT: A micro service is by definition self-contained; however, it also would typically connect to one or more container data volumes. Regarding accessing external data sources, not exactly sure what is meant here, but the micro services can be running on one physical server with the container data volumes being created and managed on a separate DAS/SAN/NAS storage platform.

KH: Yes, absolutely. An API broker for an external, legacy datastore is a good example of a microservice.

Q: How are container images qualified so that they can be trusted for automated pulls?

CT: Keith, I’ll let you address this one–should be right up your alley. I believe that there is NOT a vetting or certification process done by Docker when posting to either the public Hub or to a trusted registry. This would be the responsibility of the container image developer.

KH: In a few ways. First, containers are rarely built from scratch. They are normally built from base images released by trusted providers like Microsoft, Ubuntu, Red Hat, etc. First you should prove trust in that base image through similar methods as you would a VM image. In a Docker Datacenter install, a user with Admin rights can then bring those base images into Docker Trusted Registry (DTR) and then also do a review of internal images built on top of that base before blessing them to go into production. There are also 3rd party security scanning technologies you can use, should that be a concern.

Q: For stateless applications, can Docker help apply updates to the application without taking a downtime? For example, a container is running version n of an application and version n+1 needs to be deployed without causing a downtime to users, could one spin a new container with version n+1 of the application and deploy it?

CT: If the application is truly stateless, then it shouldn’t matter if they are torn down and restarted on another physical server/node to allow the application of a new patch or OS update on the original node. However, this would need to be correctly architected.

KH: Yes. Using Docker Engine in Swarm mode, we provide a command ‘Docker service update’ to do exactly that. Check the docs.

Q: Flocker vs. Convoy vs. others – could you talk about these interfaces and their adoption?

CT: ClusterHQ (Flocker) has developed a generic storage volume plugin that they then provided back to the Docker community to incorporate the Docker engine. I’m not very familiar with Convoy, but it appears to be a Rancher-developed storage plugin that they have made available as open source, but it is not part of the Docker release.

KH: Flocker and Convoy are brokerage-type volume drivers that have the capability to connect with several storage backends. Each has its own API to talk to and manage volumes under the hood. It’s also possible to integrate directly with Docker’s volume API. If you’re mainly interested in integrating purely with Docker Engine, a direct Docker Volume API plugin is the best approach. However, both Flocker and Convoy provide some ease-of-use features and capabilities that might make it attractive to go their routes. Volume API docs are here.

Q: Does the link in communication between different containers that run microservices incur the very load we are trying to escape from monolithic approach?

CT: Keith, I’ll let you address this one–should be right up your alley.

KH: That’s a very philosophical question! I’d argue that using modern API methods like REST over HTTP is so lightweight that the distributed approach makes more sense.

Q: Docker Swarm?

CT: Keith, I’ll let you address this one.

KH: Swarm is our clustering technology to chain together several Docker Engine hosts into one big cluster. Prior to Engine 1.12, it was a standalone product. After 1.12, we added SwarmKit into Engine to make building and maintaining swarms much easier. For more info, check out old Swarm docs and new Swarm docs.

Q: Do the applications need to be re-written/revised to take advantage of Container approach?

CT: Most legacy or monolithic applications will need to be refactored to best take advantage of a micro-service architecture.

KH: Typically, yes. Web applications are already built in a distributed way, so they’re the easiest to convert.

Q: Do microservices implement Unikernels?

CT: Keith, I’ll let you also address this. My take: Containers run microservices and leverage the entire OS (Linux kernel and all of its libraries, drivers, etc.). A unikernel is a very small and minimalistic kernel that doesn’t contain the additional bloat of the full kernel and therefore, is considered to run that much faster and leaner. Docker acquired a unikernels company and will most likely provide support for running microservices with unikernels and how they may provide a container like wrapper.

KH: Not directly. Unikernels are a new method to run arbitrary runtimes under a single kernel. Docker is currently doing some early work with microkernel technology to improve containers, but it’s not rolled into core Engine yet.

Q: Can a Docker container run on bare metal instead of a host OS directly. If yes, what benefits does this approach provide?

CT: A Docker container requires a host OS to run; however, when we refer to “bare-metal” we are referring to a “non-virtualized server”. The key benefit this provides is that you don’t waste efficiencies by eliminating the hypervisor and guest OS and it is also much more manageable as with the hypervisor and guest OS scenario, you have to manage and maintain all of the VMs that may have different guest OS’s and versions.

KH: No. A Docker container needs Docker Engine to run, so you’ll need to run Engine under a supported OS on the metal. Running Engine on a physical server means your containers will get full “on the iron” IO, since there’s no hypervisor abstraction layer between your container and the hardware it’s running on.

Q: Are packaged software companies like Oracle moving to containerization?

CT: Oracle is developing product offerings that are containerized applications. They would be best able to address your question.

KH: Here is Oracle’s GitHub repository of their official Docker containers and here is their official images in Docker Hub.

See? I told you there was no shortage of questions! If you still have one, please comment in this blog below and we’ll get back to you as soon as we can. Follow us on Twitter @SNIACloud to stay up-to-date on what SNIA Cloud is doing with containers. And don’t forget to register for part two of this webcast, Containers: Best Practices and Data Services, on December 7th. We hope to see you there!

 

 

Cloud Object Storage – You’ve Got Questions, We’ve Got Answers

The SNIA Cloud Storage Initiative hosted a live Webcast “Cloud Object Storage 101.” Like any “101” type course, there were a lot of good questions. Here they all are – with our answers. If you have additional questions, please let us know by commenting on this blog.

Q. How do you envision the new role of tape (LTO) in this unstructured data growth?

A. Exactly the same way that tape has always played a part; it’s the storage medium that requires no power to store cold data and is cheap per bit. Although it has a limited shelf life, and although we believe that flash will eventually replace it, it still has a secure & growing foreseeable future.

Q. What are your thoughts on whether object storage can exist outside the bounds of supporting file systems? Block devices directly storing objects using the key as reference and removing the intervening file system? A hierarchy of objects instead of files?

A. All of these things. Objects can be objects identified by an ID in a flat non-hierarchical structure; or we can impose a hierarchy by key- to objectID translation; or indeed, an object may contain complete file systems or be treated like a block device. There are really no restrictions on how we can build meta data that describes all these things over the bytes of storage that makes up an object.

Q. Can you run write insensitive low latency apps on object storage, ex: virtual machines?

A. Yes. Object storage can be made up of the same stuff as other high performance storage systems; for instance, flash connect via high bandwidth and low latency networks. Or they could even be object stores built over PCIe and NVDIMM.

Q. Is erasure coding (EC) expensive in terms of networking and resources utilization (especially in case of rebuild)?

A. No, that’s one of the advantages of EC. Rebuilds take place by reading data from many disks and writing it to many disks; in traditional RAID rebuilds, the focus is normally on the one disk that’s being rebuilt.

Q. Is there any overhead for small files or object use cases? Do you have a recommended size?

A. Each system will have its own advantages and disadvantages for objects of specific sizes. In general, object stores are designed to store billions of objects, so the number of objects is usually not an issue.

Q. Can you comment on Internet bandwidth limitations on geographically dispersed erasure coded data?

A. Smart caching can make a big difference, but at the end of the day, a geographically EC dispersed object store won’t be faster than a local store. You can’t beat the speed of light.

Q. The suppliers all claim easy exit strategies from their systems. If we were to use one of the on-premise solutions such as ECS or Cleversafe, and then down the road decide to move off-premise, is the migration/egress typically as easy as claimed?

A. In general, any proprietary interface might lock you in. The SNIA’s CDMI is vendor neutral, and supported by a number of vendors. Amazon’s S3 is a popular and common interface. Ultimately, vendors want your data on their systems – and that means making it easy to get the data from a competing vendor’s system; lock-in is not what vendors want. Talk to your vendor and ask for other users’ experiences to get confirmation of their claims.

Q. Based on factual information, where are you seeing the most common use cases for Object Storage?

A. There are many, and each vendor of cloud storage has particular markets. Backup is a common case, as are systems in the healthcare space that treat data such as scans and X-rays as objects.

Q. NAS filers only scale up not out. They are hard to manage at scale. Why use them anymore?

A. There are many NAS systems that scale out as well as up. NFSv4 support high degrees of scale out and there are file systems like Gluster that provide very large-scale solutions indeed, into the multi-petabyte range.

Q. Are there any specific uses cases to avoid when considering object storage?

A. Yes. Many legacy applications will not generate any savings or gains if moved to object storage.

Q. Would you agree with industry statements that 80% of all data written today will NEVER be accessed again; and that we just don’t know WHICH 20% will be read again?

A. Yes to the first part, and no to the second. Knowing which 80% is cold is the trick. The industry is developing smart ways of analyzing data to help with the issue of ensuring cached data is hot data, and that cold data is placed correctly first time around.

Q. Is there also the possibility to bring “compliance” in the object storage? (thinking about banking, medical and other sensible data that needs to be tracked, retention, etc…)

A. Yes. Many object storage vendors provide software to do this.

 

Need a Primer on Cloud Object Storage?

There has been a lot of buzz around cloud object storage recently. But before you get deep into all that cloud object storage can do, it’s good to take a step back and make sure you understand the basics. That’s what the SNIA Cloud Storage Initiative is planning to do on July 14th at our live Webcast “Cloud Object Storage 101.”

Many organizations, like large service providers, have already begun to leverage software-defined object storage to support new application development and DevOps projects. Meanwhile, legacy enterprise companies are in the early stages of exploring the benefits of object storage for their particular business and are searching for how they can use cloud object storage to modernize their IT strategies, store and protect data, while dramatically reducing the costs associated with legacy storage sprawl.

This Webcast will highlight the market trends towards the adoption of object storage, the definition and benefits of object storage, and the use cases that are best suited to leverage an underlying object storage infrastructure.

Join us on July 14th to learn:

  • How to accelerate the transition from legacy storage to a cloud object architecture
  • Understand the benefits of object storage
  • Primary use cases
  • How an object storage can enable your private, public or hybrid cloud strategy without compromising security, privacy or data governance

I hope you’ll register today to join my colleague, Nancy Bennis, Director of Alliances at Cleversafe (an IBM company), and me for this tutorial on cloud object storage.