NVMe®/TCP Q&A

The SNIA Networking Storage Forum (NSF) had an outstanding response to our live webinar, “NVMe/TCP: Performance, Deployment, and Automation.” If you missed the session, you can watch it on-demand and download a copy of the presentation slides at the SNIA Educational Library. Our live audience gave the presentation a 4.9 rating on a scale of 1-5, and they asked a lot of detailed questions, which our presenter, Erik Smith, Vice Chair of SNIA NSF, has answered here.

Q: Does the Centralized Discovery Controller (CDC) layer also provide drive access control, or is it simply for discovery of drives visible on the network?

A: As defined in TP8010, the CDC only provides transport layer discovery. In other words, the CDC will allow a host to discover transport layer information (IP, Port, NQN) about the subsystem ports (on the array) that each host has been allowed to communicate with. Provisioning storage volumes to a particular host is additional functionality that COULD be added to an implementation of the CDC (e.g., Dell has a CDC implementation that we refer to as SmartFabric Storage Software (SFSS)).

Q: Can you provide some examples of companies that provide CDC and drive access control functionalities? Read More
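To make the transport-layer discovery described above more concrete, here is a rough Python sketch of how a host might query a discovery controller (such as a CDC) with nvme-cli and list the IP, port, and NQN of each subsystem interface it is allowed to reach. The address, port, and JSON field names are assumptions for illustration only and may vary by environment and nvme-cli version.

# Hypothetical example only: the CDC address, port, and JSON field names below
# are placeholders; adjust for your environment and nvme-cli version.
import json
import subprocess

CDC_ADDR = "192.168.1.10"   # placeholder discovery controller (CDC) address
CDC_PORT = "8009"           # commonly used NVMe/TCP discovery port

def discover_subsystems(addr, port):
    """Ask the discovery controller for its discovery log entries."""
    out = subprocess.run(
        ["nvme", "discover", "-t", "tcp", "-a", addr, "-s", port,
         "--output-format=json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out).get("records", [])

if __name__ == "__main__":
    for rec in discover_subsystems(CDC_ADDR, CDC_PORT):
        # Each record carries the transport info a host needs: IP, port, NQN.
        print(rec.get("traddr"), rec.get("trsvcid"), rec.get("subnqn"))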

Considerations and Options for NVMe/TCP Deployment

NVMe®/TCP has gained a lot of attention over the last several years due to its great performance characteristics and relatively low cost. Since its ratification in 2018, the NVMe/TCP protocol has been enhanced to add features such as Discovery Automation, Authentication and Secure Channels that make it more suitable for use in enterprise environments. Now, as organizations evaluate their options and consider adopting NVMe/TCP for use in their environments, many find they need a bit more information before deciding how to move forward. That’s why the SNIA Networking Storage Forum (NSF) is hosting a live webinar on July 19, 2023, “NVMe/TCP: Performance, Deployment and Automation,” where we will provide an overview of deployment considerations and options and answer questions such as: Read More

Object Storage: Got Questions?

Over 900 people (and counting) have watched our SNIA Networking Storage Forum (NSF) webcast, “Object Storage: Trends, Use Cases” where our expert panelists had a lively discussion on object storage characteristics, use cases and performance acceleration. If you have not seen this session yet, we encourage you to check it out on-demand. The conversation included several interesting questions related to object storage. As promised, here are answers to them:

Q: Object storage enables many new capabilities, but it also brings new challenges, such as the need for geographic and local load balancers in a distributed scale-out infrastructure that do not themselves become the bottleneck of the object services at an unsustainable cost. Are there any solutions available today that have these features built in?

A: Some object storage solutions have features such as load balancing and geographic distribution built into the software, though often the storage administrator must manually configure parts of these features at the network and/or server level. Most object storage cloud (StaaS) implementations include a distributed, scale-out infrastructure (including load balancing) in their implementation. Read More
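As a small illustration of the object access model behind these solutions, here is a minimal Python sketch (using boto3) that writes and reads an object through an S3-compatible endpoint; such an endpoint is often fronted by the kind of load balancer the question refers to. The endpoint URL, bucket name, and credentials are placeholders, not references to any particular product.

import boto3

# All identifiers below are placeholders for illustration.
s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.example.com",  # often a load-balanced virtual IP
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Objects are addressed by bucket + key rather than by block address or file path.
s3.put_object(Bucket="demo-bucket", Key="reports/2023/q1.csv", Body=b"a,b,c\n1,2,3\n")
obj = s3.get_object(Bucket="demo-bucket", Key="reports/2023/q1.csv")
print(obj["Body"].read())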

A Q&A on Discovery Automation for NVMe-oF IP-Based SANs

In order to fully unlock the potential of NVMe® IP-based SANs, we first need to address the manual and error-prone process that is currently used to establish connectivity between NVMe Hosts and NVM subsystems. Several leading companies in the industry have joined together through NVM Express to collaborate on innovations to simplify and automate this discovery process. This was the topic of discussion at our recent SNIA Networking Storage Forum webcast “NVMe-oF: Discovery Automation for IP-based SANs” where our experts, Erik Smith and Curtis Ballard, took a deep dive into the work that is being done to address these issues. If you missed the live event, you can watch it on demand here and get a copy of the slides. Erik and Curtis did not have time to answer all the questions during the live presentation. As promised, here are answers to them all.

Q. Is the Centralized Discovery Controller (CDC) highly available, and is this visible to the hosts? Do they see a pair of CDCs on the network and retry requests to a secondary if the primary is not available? Read More
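The high-availability question above can be pictured with a simple host-side failover loop: try each known discovery controller in turn and use the first one that answers. This is only a hypothetical Python sketch; the endpoint list and the use of nvme-cli are assumptions, not a description of how any particular CDC implementation handles redundancy.

import json
import subprocess

# Placeholder addresses for a redundant pair of discovery controllers (CDCs).
CDC_ENDPOINTS = [("192.168.1.10", "8009"), ("192.168.1.11", "8009")]

def discover_with_failover(endpoints):
    last_err = None
    for addr, port in endpoints:
        try:
            out = subprocess.run(
                ["nvme", "discover", "-t", "tcp", "-a", addr, "-s", port,
                 "--output-format=json"],
                capture_output=True, text=True, check=True, timeout=10,
            ).stdout
            return json.loads(out).get("records", [])
        except (subprocess.SubprocessError, json.JSONDecodeError) as err:
            last_err = err  # this CDC did not answer; try the next one
    raise RuntimeError("no discovery controller reachable: %s" % last_err)

if __name__ == "__main__":
    print(discover_with_failover(CDC_ENDPOINTS))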

Automating Discovery for NVMe IP-based SANs

NVMe® IP-based SANs (including transports such as TCP, RoCE, and iWARP) have the potential to provide significant benefits in application environments ranging from the Edge to the Data Center. However, before we can fully unlock the potential of the NVMe IP-based SAN, we first need to address the manual and error-prone process that is currently used to establish connectivity between NVMe Hosts and NVM subsystems. This process requires administrators to explicitly configure each Host to access the appropriate NVM subsystems in their environment. In addition, any time an NVM Subsystem interface is added or removed, a Host administrator may need to explicitly update the configuration of impacted hosts to reflect this change. Due to the decentralized nature of this configuration process, using it to manage connectivity for more than a few Host and NVM subsystem interfaces is impractical and adds complexity when deploying an NVMe IP-based SAN in environments that require a high degree of automation. Read More
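To give a feel for the manual process described above, the short Python sketch below simply prints the per-subsystem connect commands an administrator would otherwise have to maintain on every host. The subsystem addresses and NQNs are made-up placeholders; the point is that every add or remove of a subsystem interface means revisiting this list on every affected host, which is exactly the step that centralized discovery aims to automate.

# Placeholder subsystem interfaces; in practice this list must be kept
# accurate, per host, by an administrator.
SUBSYSTEM_PORTS = [
    {"traddr": "192.168.10.21", "trsvcid": "4420",
     "subnqn": "nqn.2014-08.org.example:subsys-a"},
    {"traddr": "192.168.10.22", "trsvcid": "4420",
     "subnqn": "nqn.2014-08.org.example:subsys-b"},
]

for port in SUBSYSTEM_PORTS:
    # Each entry corresponds to one connect command the host must issue.
    print("nvme connect -t tcp -a {traddr} -s {trsvcid} -n {subnqn}".format(**port))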

How SNIA Swordfish™ Expanded with NVMe® and NVMe-oF™

The SNIA Swordfish™ specification and ecosystem are growing in scope to include full enablement and alignment for NVMe® and NVMe-oF client workloads and use cases. By partnering with other industry-standard organizations including DMTF®, NVM Express, and OpenFabrics Alliance (OFA), SNIA’s Scalable Storage Management Technical Work Group has updated the Swordfish bundles from version 1.2.1 and later to cover an expanding range of NVMe and NVMe-oF functionality, including NVMe device management and storage fabric technology management and administration.

The Need

Large-scale computing designs are increasingly multi-node and linked together through high-speed networks. These networks may be composed of different technology types, and their makeup is fungible and continually morphing. Over time, many different types of high-performance networking devices will evolve to participate in these modern, coupled-computing platforms. New fabric management capabilities, orchestration, and automation will be required to deploy, secure, and optimally maintain these high-speed networks. Read More
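Because Swordfish builds on the DMTF Redfish REST model, management clients interact with it over HTTPS. Below is a minimal, hypothetical Python sketch that walks a service's Storage collection; the endpoint, credentials, and exact resource paths are assumptions for illustration, so consult the Swordfish specification and your implementation's documentation for the real resource model.

import requests

BASE = "https://swordfish.example.com"   # placeholder management endpoint
AUTH = ("admin", "password")             # placeholder credentials

def get(path):
    resp = requests.get(BASE + path, auth=AUTH, verify=False, timeout=10)
    resp.raise_for_status()
    return resp.json()

# Swordfish extends the Redfish data model, so resources hang off /redfish/v1.
storage = get("/redfish/v1/Storage")
for member in storage.get("Members", []):
    print(member.get("@odata.id"))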

Protecting NVMe over Fabrics Data from Day One, The Armored Truck Way

With ever-increasing threat vectors both inside and outside the data center, a compromised customer dataset can quickly result in a torrent of lost business data, eroded trust, significant penalties, and potential lawsuits. Potential vulnerabilities exist at every point when scaling out NVMe® storage, which requires data to be secured every time it leaves a server or the storage media, not just when leaving the data center. NVMe over Fabrics is poised to be one of the most dominant storage transports of the future, and securing and validating the vast amounts of data that will traverse this fabric is not just prudent, but paramount. Read More

NVMe® over Fabrics for Absolute Beginners

A while back I wrote an article entitled “NVMe™ for Absolute Beginners.” It seems to have resonated with a lot of people, and it appears there might be a call for doing the same thing for NVMe® over Fabrics (NVMe-oF™). This article is for absolute beginners. If you are a seasoned (or even moderately experienced) technical person, this probably won’t be news to you. However, you are free (and encouraged!) to point people who need Plain English™ to this article to get started.

A Quick Refresher

Any time an application on a computer (or server, or even a consumer device like a phone) needs to talk to a storage device, there are a few things you need to have. You need memory (like RAM), you need a CPU, and you also need something that can hold onto your data for the long haul (also called storage). You also need a way for the CPU to talk to the memory device (on one hand) and the storage device (on the other). Thing is, CPUs speak a very specific language, and historically memory could speak that language, but storage could not. For many years, things ambled along in this way. The CPU would talk natively with memory, which made it very fast but also somewhat risky, because memory is volatile: if there was a power blip (or the power went out completely), any data in memory would be wiped out. Not fun. Read More

NVMe Key-Value Standard Q&A

Last month, Bill Martin, SNIA Technical Council Co-Chair, presented a detailed update on what’s happening in the development and deployment of the NVMe Key-Value standard. Bill explained where Key Value fits within an architecture, why it’s important, and the standards work that is being done between NVM Express and SNIA. The webcast was one of our highest rated. If you missed it, it’s available on-demand along with the webcast slides. Attendees at the live event had many great questions, which Bill Martin has answered here:

Q. Two of the most common KV storage mechanisms in use today are AWS S3 and RocksDB. How does the NVMe KV standard align with or differ from them? How difficult would it be to map the APIs and semantics of those other technologies to NVMe KV devices?

A. KV Storage is intended as a storage layer that would support these and other object storage mechanisms. There is a publicly available KVRocks implementation on GitHub: a RocksDB-compatible key-value store and MyRocks-compatible storage engine designed for KV SSDs. There is also a Ceph Object storage design available. These are example implementations that can help an implementer make efficient use of NVMe KV storage.

Q. At which layer will my app stack need to change to take advantage of KV storage? Will VMware or Linux or Windows need to change at the driver level? Or do the apps need to be changed to treat data differently? If the apps don’t need to change, doesn’t this just take the data layout tables and move them up the stack into the server? Read More
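To picture the key-value access model discussed above, here is a toy, in-memory Python stand-in for the kind of interface a KV device exposes: values are stored and retrieved by key rather than by logical block address. The class and method names are purely illustrative and are not the NVMe KV command set or any vendor API.

class ToyKVStore:
    """Toy in-memory stand-in for a key-value storage interface."""

    def __init__(self, max_key_len=16):
        self._data = {}
        self._max_key_len = max_key_len  # KV devices limit key length

    def store(self, key, value):
        if len(key) > self._max_key_len:
            raise ValueError("key too long for this device")
        self._data[key] = value          # no block mapping tables involved

    def retrieve(self, key):
        return self._data[key]

    def delete(self, key):
        self._data.pop(key, None)

kv = ToyKVStore()
kv.store(b"user:42", b'{"name": "Ada"}')
print(kv.retrieve(b"user:42"))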

An FAQ on RAID on the CPU

A few weeks ago, SNIA EMEA hosted a webcast to introduce the concept of RAID on CPU. The invited experts, Fausto Vaninetti from Cisco and Igor Konopko from Intel, provided fascinating insights into this exciting new technology.

The webcast created a huge amount of interest and generated a host of follow-up questions which our experts have addressed below. If you missed the live event “RAID on CPU: RAID for NVMe SSDs without a RAID Controller Card” you can watch it on-demand.

Q. Why not RAID 6?

A. RAID on CPU is a new technology, so support is currently limited to the most commonly used RAID levels, considering this is intended for servers rather than disk arrays. RAID 5 is the primary parity RAID level for NVMe, protecting against a single drive failure, which is generally sufficient given the lower annualized failure rates (AFRs) and faster rebuilds of NVMe drives.

Q. Is the XOR for RAID 5 done in Software?

A. Yes, it is done in software on some cores of the Xeon CPU.
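For readers curious what that XOR actually looks like, here is a small, purely illustrative Python sketch of RAID 5 parity math: the parity block is the XOR of the data blocks in a stripe, and XORing the surviving blocks with the parity rebuilds a lost block. A real implementation, of course, runs as optimized code on the CPU cores, not like this.

def xor_blocks(blocks):
    """XOR equal-sized blocks together; used to build (or rebuild) parity."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

# Three data blocks in a stripe; the parity block is their XOR.
d1, d2, d3 = b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"
parity = xor_blocks([d1, d2, d3])

# If one data block is lost, XOR the survivors with the parity to rebuild it.
rebuilt_d2 = xor_blocks([d1, d3, parity])
assert rebuilt_d2 == d2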

Q. Which generation of Intel CPUs support VROC?

Read More