
    The Alphabet Soup of Storage Networking Acronyms Explained

    August 7th, 2017
    At our most recent webcast, “Everything You Wanted to Know About Storage But Were Too Proud To Ask: Part Turquoise – Where Does My Data Go?,” our panel of experts dove into what really happens when you hit “save” and send your data off. It was an alphabet soup of acronyms as they explained the nuances of, and the differences between:
    • Volatile vs. Non-Volatile vs. Persistent Memory
    • NVDIMM vs. RAM vs. DRAM vs. SLC vs. MLC vs. TLC vs. NAND vs. 3D NAND vs. Flash vs. SSDs vs. NVMe
    • NVMe (the protocol)
    As promised during the live event, here are answers to all the questions we received. Q. Is SRAM still used today? A. SRAM is still in use today as embedded cache (Level 1/2/3) within a CPU, and only to a very limited extent in external standalone packaging, due to cost and size/capacity.  Continue Reading...

    Q&A – When Compute, Networking and Storage Intersect

    July 18th, 2017
    In Part Vermillion of our SNIA Ethernet Storage Forum (ESF) “Everything You Wanted To Know About Storage But Were Too Proud To Ask” webcast series, we examined the terms and concepts that are at the heart of where compute, networking and storage intersect. That’s why we called it “What if Programming and Networking Had a Storage Baby.” If you missed the live webcast, you can watch it on-demand. The discussion from our panel of experts generated a lot of good questions. As promised, here are answers to them all.   Continue Reading...

    The Too Proud to Ask Train Makes Another Stop: Where Does My Data Go?

    June 22nd, 2017
    By now, we at the SNIA Ethernet Storage Forum (ESF) hope you are familiar with (perhaps even a loyal fan of) the popular “Everything You Wanted To Know About Storage But Were Too Proud To Ask” webcast series. On August 1st, the “Too Proud to Ask” train will make another stop. In this seventh session, “Everything You Wanted to Know About Storage But Were Too Proud To Ask: Turquoise – Where Does My Data Go?,” we will take a look into the mysticism and magic of what happens when you send your data off into the wilderness. Once you click “save,” for example, where does it actually go?  Continue Reading...

    Attend Live – or Live Stream – SNIA’s Persistent Memory Summit January 18

    January 12th, 2017

    by Marty Foltyn

    SNIA’s Persistent Memory Summit makes its fifth annual appearance in Silicon Valley next Wednesday, January 18, and if you are in the vicinity of the Westin San Jose, you owe it to yourself to check it out.

    SNIA is well known for its technology-focused, no vendor-hype conferences, and this one-day event will feature 12 presentations and two panels that will “level set” the discussion, review persistent memory usage, describe applications incorporating PM available today, discuss the infrastructure and implementation, and provide a vision of the “next generation” of persistent memory.

    You’ll meet speakers from SNIA member companies Intel, Micron, Microsemi, VMware, Red Hat, Microsoft, AgigA Tech, Western Digital, and Spin Transfer.  Live demonstrations of persistent memory solutions will be featured from Summit underwriters Intel and the SNIA Solid State Storage Initiative, and Summit sponsors Microsemi, VMware, AgigA Tech, SMART Modular, and Spin Transfer.

    Registration is complimentary but limited; visit http://www.snia.org/pm-summit for the complete agenda and details on how to sign up.  And, if your travels don’t permit you to attend in person, the Persistent Memory Summit will be live-streamed on the SNIAvideo channel at https://www.youtube.com/user/SNIAVideo.


    Questions on the 2017 Ethernet Roadmap for Networked Storage

    January 9th, 2017

    Last month, experts from Dell EMC, Intel, Mellanox and Microsoft convened to take a look ahead at what’s in store for Ethernet Networked Storage this year. It was a fascinating discussion of anticipated updates. If you missed the webcast, “2017 Ethernet Roadmap for Networked Storage,” it’s now available on-demand. We had a lot of great questions during the live event and we ran out of time to address them all, so here are answers from our speakers.

    Q. What’s the future of twisted pair cable? What is the new speed being developed with twisted pair cable?

    A. By twisted pair I assume you mean UTP (unshielded twisted pair) Cat 5, 6, 7, etc.  The problem going forward with high-speed signaling is that the cable is unshielded, so the signal radiates off the wire very quickly.  At 25G and 50G this is a real problem and forces the line card end to have a big, power-consuming and costly chip to dig the signal out of the noise. Anything can be done, but at what cost?  25GBASE-T is being developed, but the reach is somewhere around 30 meters.  Cost, size and power consumption are all going up while reach is going down – all opposite to the trends in modern high-speed data centers.  BASE-T will always have a place for those applications that don’t need the faster rates.

    Q. What do you think of RCx standards and cables?

    A. So far, Amphenol, JAE and Volex are the suppliers who are members of the MSA. Very few companies have announced or discussed RCx.  In addition to a smaller connector, not having an EEPROM eliminates steps in cable assembly manufacturing, which helps lower the cost compared to traditional DAC cabling. The biggest advantage of RCx is that it can help eliminate bulky breakout cables within a rack, since a single RCx4 receptacle can accept a number of combinations of single-lane, 2-lane or 4-lane cables with the same connector on the host. RCx ports can be connected to existing QSFP/SFP infrastructure with appropriate cabling. It remains to be seen, however, whether it becomes a standard and popular product or remains a custom solution.

    Q. How far does an AOC normally reach, 3m or 30m?

    A. AOCs pick up where DACs drop off, at about 3m.  The most popular reaches are 3, 5, and 10m, and volume drops rapidly beyond 15, 20, 30, 50, and 100m. We are seeing Ethernet-connected HDDs at 2.5GbE x 2 ports, and Ceph touting this solution.  This seems to play well into the 25/50/100GbE standards with the massive parallelism possible.

    Q. How do we scale PCIe lanes to support NVMe drives at scale, and to replace the capacity we see with storage arrays populated completely with HDDs?

    A. With the advent of PCIe Gen 4, the per-lane rate of PCIe is going from 8 GT/s to 16 GT/s. Scaling of PCIe is already happening.

    Q. How many NVMe drives does it take to saturate 100GbE?

    A. 3 or 4 depending on individual drives.
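
    As a rough back-of-the-envelope check (the figures below are illustrative assumptions, not measurements from the webcast): 100GbE carries roughly 12.5 GB/s of data, while a typical PCIe Gen 3 x4 NVMe drive sustains on the order of 3–3.5 GB/s of reads, which is where the 3-or-4-drives answer comes from. A minimal sketch of the arithmetic in C:

    ```c
    #include <stdio.h>

    int main(void)
    {
        /* Illustrative assumptions only -- real drives and links vary.           */
        const double link_gbps      = 100.0;            /* 100GbE line rate        */
        const double link_gbytes_s  = link_gbps / 8.0;  /* ~12.5 GB/s, ignoring
                                                           protocol overhead       */
        const double drive_gbytes_s = 3.2;              /* assumed NVMe read b/w   */

        printf("Drives needed to saturate 100GbE: %.1f (i.e. 3-4 drives)\n",
               link_gbytes_s / drive_gbytes_s);
        return 0;
    }
    ```

    Since PCIe Gen 4 doubles the per-lane rate, as noted in the previous answer, per-drive bandwidth is likely to climb and the number of drives needed to fill a 100GbE pipe will shrink further.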

    Q. How about the reliability of Ethernet? A lot of people think Fibre Channel has better reliability than Ethernet.

    A. It’s true that Fibre Channel is a lossless protocol. Ethernet frames are sometimes dropped by the switch; however, networked storage using TCP has a built-in error detection and retransmission facility. TCP was designed at a time when networks were less robust than they are today. Ethernet networks these days are far more reliable.

    Q. Do the 2.5GbE and 5GbE refer to the client side Ethernet port or the server Ethernet port?

    A. It can exist on both the client side and the server side Ethernet port.

    Q. Are there any 25GbE or 50GbE NICs available on the market?

    A. Yes, there are many that are on the market from a number of vendors, including Dell, Mellanox, Intel, and a number of others.

    Q. Commonly used Ethernet speeds are either 10GbE or 40GbE. Do the new 25GbE and 50GbE require new switches?

    A. Yes, you need new switches to support 25GbE and 50GbE. This is, in part, because the SerDes rate per lane at 25 and 50GbE is 25Gb/s, which is not supported by the 10 and 40GbE switches with a maximum SerDes rate of 10Gb/s.

    Q. With a certain number of SerDes lanes coming off the switch ASIC, which would you prefer to use, 100G or 40G, assuming both are at the same cost?

    A. Certainly 100G. You get 2.5X the bandwidth for the same cost under the assumptions made in the question.

    Q. Are there any 100G/200G/400G switches and modulation available now?

    A. There are many 100G Ethernet switches available on the market today, including Dell’s Z9100 and S6100, Mellanox’s SN2700, and a number of others. The 200G and 400G IEEE standards are not yet complete. I’m sure all switch vendors will come out with switches supporting those rates in the future.

    Q. What does lambda mean?

    A. Lambda (λ) is the symbol for wavelength.
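
    For context, wavelength (λ) and frequency are tied together by the speed of light, which is why optical wavelengths such as 1310 nm and 1550 nm correspond to specific carrier frequencies:

    ```latex
    \lambda = \frac{c}{f},
    \qquad \text{e.g. } \lambda = 1550\ \mathrm{nm} \;\Rightarrow\;
    f = \frac{3 \times 10^{8}\ \mathrm{m/s}}{1550 \times 10^{-9}\ \mathrm{m}} \approx 193\ \mathrm{THz}
    ```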

    Q. Is the 50GbE standard ratified now?

    A. IEEE 802.3 just recently started development of a 50GbE standard based upon a single-lane 50 Gb/s physical layer interface. That standard is probably about 2 years away from ratification. The 25G Ethernet Consortium has a ratified specification for 50GbE based upon a dual-lane 25 Gb/s physical layer interface.

    Q. Are there any parallel options for using 2 or 4 lanes like in 128GFCp?

    A. Many Ethernet specifications are based upon parallel options. 10GBASE-T is based upon 4 twisted-pairs of copper cabling. 100GBASE-SR4 is based upon 4 lanes (8 fibers) of multimode fiber. Even the industry MSA for 100G over CWDM4 is based upon four wavelengths on a duplex single-mode fiber. In some instances, the parallel option is based upon the additional medium (extra wires or fibers) but with fiber optics, parallel can be created by using different wavelengths that don’t interfere with each other.

    Ethernet Networked Storage – FAQ

    December 8th, 2016

    At our SNIA Ethernet Storage Forum (ESF) webcast “Re-Introduction to Ethernet Networked Storage,” we provided a solid foundation on Ethernet networked storage, the move to higher speeds, challenges, use cases and benefits. Here are answers to the questions we received during the live event.

    Q. Within the iWARP protocol there is a layer called MPA (Marker PDU Aligned Framing for TCP) inserted for storage applications. What is the point of this protocol?

    A. MPA is an adaptation layer between the iWARP Direct Data Placement Protocol and TCP/IP. It provides framing and CRC protection for Protocol Data Units.  MPA enables packing of multiple small RDMA messages into a single Ethernet frame.  It also enables an iWARP NIC to place frames received out-of-order (instead of dropping them), which can be beneficial on best-effort networks. More detail can be found in IETF RFC 5044 and IETF RFC 5041.

    Q. What is the API for RDMA network IPC?

    A. The general API for RDMA is called verbs. The OpenFabrics Verbs Working Group oversees the development of the verbs definition and functionality in the OpenFabrics Software (OFS) code. You can find the training content from the OpenFabrics Alliance here. General information about RDMA over Converged Ethernet (RoCE) is available at the InfiniBand Trade Association website. Information about the Internet Wide Area RDMA Protocol (iWARP) can be found at IETF: RFC 5040, RFC 5041, RFC 5042, RFC 5043, RFC 5044.
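
    To give a feel for what the verbs API looks like, here is a minimal, hypothetical C sketch using libibverbs (the OpenFabrics userspace verbs library). It only opens the first RDMA device and registers a buffer for direct access by the NIC; queue pair creation, connection management and most error handling are omitted:

    ```c
    /* Minimal libibverbs sketch -- compile with: gcc verbs_demo.c -libverbs */
    #include <infiniband/verbs.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int num_devices = 0;
        struct ibv_device **devs = ibv_get_device_list(&num_devices);
        if (!devs || num_devices == 0) {
            fprintf(stderr, "no RDMA-capable devices found\n");
            return 1;
        }

        struct ibv_context *ctx = ibv_open_device(devs[0]);   /* first device      */
        struct ibv_pd *pd = ibv_alloc_pd(ctx);                /* protection domain */

        size_t len = 4096;
        void *buf = malloc(len);

        /* Register the buffer so the RNIC can read/write it directly (zero copy). */
        struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                       IBV_ACCESS_LOCAL_WRITE |
                                       IBV_ACCESS_REMOTE_WRITE);

        printf("registered %zu bytes, lkey=0x%x rkey=0x%x\n",
               len, mr->lkey, mr->rkey);

        ibv_dereg_mr(mr);
        ibv_dealloc_pd(pd);
        ibv_close_device(ctx);
        ibv_free_device_list(devs);
        free(buf);
        return 0;
    }
    ```

    A real storage initiator or target would go on to create completion queues and queue pairs, exchange the rkey with its peer, and then post RDMA read/write work requests against the registered memory.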

    Q. RDMA requires TCP/IP (iWARP), InfiniBand, or RoCE to operate on with respect to NVMe over Fabrics. Therefore, what are the advantages and disadvantages of iWARP vs. RoCE?

    A. Both RoCE and iWARP support RDMA over Ethernet. iWARP uses TCP/IP while RoCE uses UDP/IP. Debating which one is better is beyond the scope of this webcast, but you can learn more by watching the SNIA ESF webcast, “How Ethernet RDMA Protocols iWARP and RoCE Support NVMe over Fabrics.”

    Q. 100Gb Ethernet Optical Data Center solution?

    A. 100Gb Ethernet optical interconnect products were first available around 2011 or 2012 in a 10x10Gb/s design (100GBASE-CR10 for copper, 100GBASE-SR10 for optical), which required thick cables and CXP or CFP MSA housings. These were generally used only for switch-to-switch links. Starting in late 2015, the more compact 4x25Gb/s design (using the QSFP28 form factor) became available in copper (DAC), optical cabling (AOC), and transceivers (100GBASE-SR4, 100GBASE-LR4, 100GBASE-PSM4, etc.). The optical transceivers allow 100GbE connectivity up to 100m, 2km, or 10km distances, depending on the type of transceiver and fiber used.

    Q. Where is FCoE being used today?

    A. FCoE is primarily used in blade server deployments where there could be contention for PCI slots and only one built-in NIC. These NICs typically support FCoE at 10Gb/s speeds, passing both FC and Ethernet traffic via a connection to a Top-of-Rack FCoE switch, which parses traffic to the respective fabrics (FC and Ethernet). However, it has not gained much acceptance outside of the blade server use case.

    Q. Why did iSCSI start out mostly in lower-cost SAN markets?

    A. When it first debuted, iSCSI packets were processed by software initiators which consumed CPU cycles and showed higher latency than Fibre Channel. Achieving high performance with iSCSI required expensive NICs with iSCSI hardware acceleration, and iSCSI networks were typically limited to 100Mb/s or 1Gb/s while Fibre Channel was running at 4Gb/s. Fibre Channel is also a lossless protocol, while TCP/IP is lossy, which caused concerns for storage administrators. Now, however, iSCSI can run on 25, 40, 50 or 100Gb/s Ethernet with various types of TCP/IP acceleration or RDMA offloads available on the NICs.

    Q. What are some of the differences between iSCSI and FCoE?

    A. iSCSI runs SCSI protocol commands over TCP/IP (except iSER which is iSCSI over RDMA) while FCoE runs Fibre Channel protocol over Ethernet. iSCSI can run over layer 2 and 3 networks while FCoE is Layer 2 only. FCoE requires a lossless network, typically implemented using DCB (Data Center Bridging) Ethernet and specialized switches.

    Q. You pointed out at least twice that people incorrectly predicted the end of Fibre Channel, but it didn’t happen. What makes you say Fibre Channel is actually going to decline this time?

    A. Several things are different this time. First, Ethernet is now much faster than Fibre Channel instead of the other way around. Second, Ethernet networks now support lossless and RDMA options that were not previously available. Third, several new solutions–like big data, hyper-converged infrastructure, object storage, most scale-out storage, and most clustered file systems–do not support Fibre Channel. Fourth, none of the hyper-scale cloud implementations use Fibre Channel and most private and public cloud architects do not want a separate Fibre Channel network–they want one converged network, which is usually Ethernet.

    Q. Which storage protocols support RDMA over Ethernet?

    A. The Ethernet RDMA options for storage protocols are iSER (iSCSI Extensions for RDMA), SMB Direct, NVMe over Fabrics, and NFS over RDMA. There are also storage solutions that use proprietary protocols supporting RDMA over Ethernet.

    2017 Ethernet Roadmap for Networked Storage

    November 2nd, 2016

    When SNIA’s Ethernet Storage Forum (ESF) last looked at the Ethernet Roadmap for Networked Storage in 2015, we anticipated a world of rapid change. The list of advances in 2016 is nothing short of amazing:

    • New adapters, switches, and cables have been launched supporting 25, 50, and 100Gb Ethernet speeds including support from major server vendors and storage startups
    • Multiple vendors have added or updated support for RDMA over Ethernet
    • The growth of NVMe storage devices and release of the NVMe over Fabrics standard are driving demand for both faster speeds and lower latency in networking
    • The growth of cloud, virtualization, hyper-converged infrastructure, object storage, and containers is increasing the popularity of Ethernet as a storage fabric

    The world of Ethernet in 2017 promises more of the same. Now we revisit the topic with a look ahead at what’s in store for Ethernet in 2017.  Join us on December 1, 2016 for our live webcast, “2017 Ethernet Roadmap for Networked Storage.”

    With all the incredible advances and learning vectors, SNIA ESF has assembled a great team of experts to help you keep up. Here are some of the things to keep track of in the upcoming year:

    • Learn what is driving the adoption of faster Ethernet speeds and new Ethernet storage models
    • Understand the different copper and optical cabling choices available at different speeds and distances
    • Debate how other connectivity options will compete against Ethernet for the new cloud and software-defined storage networks
    • And finally look ahead with us at what Ethernet is planning for new connectivity options and faster speeds such as 200 and 400 Gigabit Ethernet

    The momentum is strong with Ethernet, and we’re here to help you stay informed of the lightning-fast changes. Come join us for this SNIA ESF webcast on December 1st to look at the future of Ethernet for storage. Register here.

    SNIA Storage Developer Conference-The Knowledge Continues

    October 13th, 2016

    SNIA’s 18th Storage Developer Conference is officially a success, with 124 general and breakout sessions; Cloud Interoperability, Kinetic Storage, and SMB3 plugfests; ten Birds-of-a-Feather sessions; and amazing networking among 450+ attendees.  Sessions on NVMe over Fabrics won the title of most attended, but Persistent Memory, Object Storage, and Performance were right behind.  Many thanks to the SDC 2016 Sponsors, who engaged attendees in exciting technology discussions.

    For those not familiar with SDC, this technical industry event is designed for a variety of storage technologists at various levels from developers to architects to product managers and more.  And, true to SNIA’s commitment to educating the industry on current and future disruptive technologies, SDC content is now available to all – whether you attended or not – for download and viewing.

    You’ll want to stream keynotes from Citigroup, Toshiba, DSSD, Los Alamos National Labs, Broadcom, Microsemi, and Intel – they’re available now on demand on SNIA’s YouTube channel, SNIAVideo.

    All SDC presentations are now available for download, and over the next few months you can continue to download SDC podcasts, which combine audio and slides. The first podcast from SDC 2016 – on hyperscalers – is available here, as are all the 2015 SDC podcasts, and more will be added in the coming weeks.

    SNIA thanks all its members and colleagues who contributed to make SDC a success! A special thanks goes out to the SNIA Technical Council, a select group of acknowledged industry experts who work to guide SNIA technical efforts. In addition to driving the agenda and content for SDC, the Technical Council oversees and manages SNIA Technical Work Groups, reviews architectures submitted by Work Groups, and is the SNIA’s technical liaison to standards organizations. Learn more about these visionary leaders at http://www.snia.org/about/organization/tech_council.

    And finally, don’t forget to mark your calendars now for SDC 2017 – September 11-14, 2017, again at the Hyatt Regency Santa Clara. Watch for the Call for Presentations to open in February 2017.


    It’s Time for a Re-Introduction to Ethernet Networked Storage

    July 7th, 2016

    Ethernet technology has been a proven standard for over 30 years, and there are many networked storage solutions based on Ethernet. While storage devices are evolving rapidly with new standards and specifications, Ethernet is moving towards higher speeds as well: 10Gbps, 25Gbps, 50Gbps and 100Gbps… making it time to re-introduce Ethernet networked storage.

    That’s exactly what Rob Davis and I plan to do on August 4th in a live SNIA Ethernet Storage Forum Webcast, “Re-Introducing Ethernet Networked Storage.” We will start by providing a solid foundation on Ethernet networked storage and move to the latest advancements, challenges, use cases and benefits. You’ll hear:

    • The evolution of storage devices – spinning media to NVM
    • New standards: NVMe and NVMe over Fabrics
    • A retrospective of traditional networked storage, including SAN and NAS
    • How new storage devices and new standards will impact Ethernet networked storage
    • Ethernet based software-defined storage and the hyper-converged model
    • A look ahead at new Ethernet technologies optimized for networked storage in the future

    I hope you will join us on August 4th at 10:00 a.m. PT. We’re confident you will learn some new things about Ethernet networked storage. Register today!


    Principles of Networked Solid State Storage – Q&A

    June 22nd, 2016

    At this month’s SNIA Ethernet Storage Forum Webcast, “Architectural Principles for Networked Solid State Storage Access,” Doug Voigt, Chair of the SNIA NVM Programming Technical Working Group, and a member of the SNIA Technical Council, outlined key architectural principles surrounding the application of networked solid state technologies. We had a flurry of questions near the end of the Webcast that we did not have enough time to answer. Here are Doug’s answers to all the questions we received during the event:

    Q. Are there wait cycles in accessing persistent memory?

    A. It depends entirely on which persistent memory (PM) technology is being accessed and how the memory interconnect is used.  Some technologies have write times that are quite different from read times.  When using tightly timed interconnects such as DDR with those technologies it may be difficult to avoid wait cycles.

    Q. How do Pmalloc and malloc share the virtual address space of the application?

    A. This is entirely up to the OS and other libraries operating within any constraints of the processor architecture-specific memory management units.  A good mental model would be fairly large regions of contiguous address space in both the physical and virtual domains, where each region will comprise a single type of memory. Capacity will be reserved for pmalloc and malloc in the appropriate regions.
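
    As a rough illustration of that mental model, the hypothetical C sketch below maps a file (which could live on a DAX-mounted persistent-memory filesystem; the path is made up) into the same virtual address space that also holds an ordinary malloc heap region. It uses plain POSIX calls rather than any particular pmalloc implementation, and omits error handling:

    ```c
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        /* Ordinary volatile allocation -- lands in the malloc heap region.      */
        void *heap_buf = malloc(4096);

        /* File that could, hypothetically, sit on a DAX-mounted PM filesystem.  */
        int fd = open("/mnt/pmem/example.dat", O_CREAT | O_RDWR, 0644);
        ftruncate(fd, 4096);

        /* Mapped into a different region of the same virtual address space.     */
        void *pm_buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);

        printf("malloc region:    %p\n", heap_buf);
        printf("mapped PM region: %p\n", pm_buf);

        munmap(pm_buf, 4096);
        close(fd);
        free(heap_buf);
        return 0;
    }
    ```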

    Q. Always flush after doing your memory-mapped IO.  Is that simply good hygiene?

    A. Not exactly. The term “Memory Mapped IO” is used to reference control plane (as opposed to data plane) access.  It is often reasonable to set up control plane memory as uncacheable. The need for strict order of access to physical control plane registers is so pervasive that caching is generally not useful. Uncacheable writes are always flushed by the processor, as opposed to the application.

    Generally, with memory-mapped IO devices, the data plane uses direct memory access (DMA).  With memory-mapped files (as opposed to memory-mapped IO), Load/Store (more commonly referred to as “Ld/St”), not DMA, is used in the data plane. Disabling caching in the data plane is generally a big performance sacrifice for small byte-range access.

    In the Ld/St datapath, strategically placed flushing is required to retain both performance and power failure recovery. The SNIA NVM Programming Model describes this type of functionality.
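
    To make “strategically placed flushing” concrete, here is a hedged sketch using PMDK’s libpmem, one library built around the SNIA NVM Programming Model (the file path and sizes are illustrative). After a plain store into a memory-mapped PM file, pmem_persist() flushes the affected cache lines and drains the write pipeline so the data survives power failure:

    ```c
    /* Sketch only -- compile with: gcc flush_demo.c -lpmem                     */
    #include <libpmem.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        size_t mapped_len;
        int is_pmem;

        /* Illustrative path on a DAX-mounted filesystem.                       */
        char *addr = pmem_map_file("/mnt/pmem/log.dat", 4096,
                                   PMEM_FILE_CREATE, 0644,
                                   &mapped_len, &is_pmem);
        if (addr == NULL)
            return 1;

        strcpy(addr, "record 1");              /* plain Ld/St access            */

        if (is_pmem)
            pmem_persist(addr, mapped_len);    /* CPU cache flush + drain       */
        else
            pmem_msync(addr, mapped_len);      /* fall back to msync            */

        printf("persisted %zu bytes (is_pmem=%d)\n", mapped_len, is_pmem);

        pmem_unmap(addr, mapped_len);
        return 0;
    }
    ```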

    Q. Once NVDIMM support becomes pervasive, with support from NVMe drives in the server box, should network storage be more focused on SAS Flash or just SAS HDDs?

    A. Not necessarily.  NVMe over Fabric, Fibre Channel and iSCSI are also types of networked storage that will likely retain significant market share relative to SAS.

    Q. Are the ‘Big Data’ Data Warehouse applications starting to use persistent memory and domain technologies in their applications?

    A. It is too early to see much of this yet. PM technologies might become a priority as a staging area for analytic applications with high ingest or checkpoint rates. NVDIMMs are likely to be too expensive to store anything “big” for quite a while.

    Q. Also, are persistent memory/domains being used in hyper-converged and converged hardware infrastructures?

    A. Persistent memory is quintessentially (Hyper-) converged.  It wouldn’t be unreasonable to expect some traction with hyper-converged solutions that experience high storage-performance demand.

    Q. What distance would you associate with 10’s of microseconds?

    A. In terms of transmission delay, 10s of µs align with a campus or small-city scale, but the distance itself is often not the primary factor.  Switching delays, transmission line properties and software overhead are generally bigger factors.
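
    As a quick sanity check on that scale (using the standard rule of thumb that light in fiber travels at roughly two-thirds of c, about 5 µs per kilometer, and ignoring the switching and software delays mentioned above):

    ```c
    #include <stdio.h>

    int main(void)
    {
        /* Light in fiber: ~200 km per millisecond, i.e. ~5 microseconds per km. */
        const double us_per_km = 5.0;

        const double delays_us[] = { 10.0, 25.0, 50.0 };
        for (int i = 0; i < 3; i++)
            printf("%5.1f us one-way ~ %5.1f km of fiber\n",
                   delays_us[i], delays_us[i] / us_per_km);

        /* 10-50 us of one-way transmission delay corresponds to roughly 2-10 km */
        /* of fiber -- campus to small-city distances.                           */
        return 0;
    }
    ```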

    Q. So latency would be the binding factor for distances…not a question, an observation.

    A. Yes, in effect, either through transmission or relay.  See above.

    Q. Aren’t there multi-threaded SSDs?

    A. Yes, but since the primary metric in this presentation is latency we ignore multi-threading.  It can enable more work to get done, but it generally increases latency rather than reducing it.

    Q. Is pmalloc in universal usage?

    A. The term is starting to be recognized among developers and has been used in research. Various similar names have been used in early research prototypes, such as pmalloc in Mnemosyne and nvmalloc in SCMFS.

    Q. So how would PM help in a (stock broking) requirement, where we currently prophesize an RDMA or iWARP solution?

    A. With PM the answer is always lower latency.  PM can be integrated like memory or like flash. RDMA network paths for both of these options were discussed in the presentation. In either case, PM is low-latency enough that networking and software overheads will completely determine performance, even when using RDMA. The performance boost from PM is greatest when it is accessed locally.  If remote access is a requirement, then the new work being done in the RDMA community should help.

    Q. If data stored in memory needs to be copied to a different host’s memory (for consistency), how does PM assist, or is there an extension to PM? Coherency between multiple hosts in a cluster, if you will?

    A. PM technology does not help with this; the methods of managing consistency across hosts remain unchanged by PM.  All PM offers is low latency persistence.

    Coordination across hosts or nodes in a cluster must use existing clustering techniques such as locking and quorums. In addition, the relative timescales of memory access and network communication suggest the application of asynchronous remote replication techniques used in today’s storage solutions.

    Regarding coherency, PM brings nothing new to the known techniques for managing coherency.  Classical cluster architecture must be applied outside of symmetric multi-processing coherency domains. Within coherency domains, all of the logic is above the PM level in a processor side memory controller or a software emulation of the same algorithms.