    Clearing Up Confusion on Common Storage Networking Terms

    January 12th, 2017

    Do you ever feel a bit confused about common storage networking terms? You’re not alone. At our recent SNIA Ethernet Storage Forum webcast “Everything You Wanted To Know About Storage But Were Too Proud To Ask – Part Mauve,” we had experts from Cisco, Mellanox and NetApp explain the differences between:

    • Channel vs. Busses
    • Control Plane vs. Data Plane
    • Fabric vs. Network

    If you missed the live webcast, you can watch it on-demand. As promised, we’re also providing answers to the questions we got during the webcast. Between these questions and the presentation itself, we hope it will help you decode these common, but sometimes confusing terms.

    And remember, the “Everything You Wanted To Know About Storage But Were Too Proud To Ask” is a webcast series with a “colorfully-named pod” for each topic we tackle. You can register now for our next webcast: Part Teal, The Buffering Pod, on Feb. 14th.

    Q. Why do we have Fibre and Fiber?

    A. “Fiber optics” is the term for the optical technology used by Fibre Channel fabrics. A common story holds that the “Fibre” spelling came about to accommodate the French (FC is, after all, an international standard). In actuality, it was a marketing idea to create a more distinctive name, and the British spelling, “Fibre,” was chosen.

    Q. Will OpenStack change all the rules of the game?

    A. Yes. OpenStack is all about centralizing the control plane of many different aspects of infrastructure.

    Q. The difference between control and data plane matters only when we discuss software defined storage and software defined networking, not in traditional switching and storage.

    A. It matters regardless. You need to understand how much each individual control plane can handle and how many control planes you have from an overall management perspective. In the case where you have too many control planes, SDN and SDS can be a benefit to you.

    Q. As I’ve heard that networks use stateless protocols, would FC do the same?

    A. Fibre Channel has several different Classes, which can be either stateful or stateless. Most applications of Fibre Channel are Class 3, as it is the preferred class for SCSI traffic. A connection between Fibre Channel endpoints is always stateful (as it involves a login process to the Fibre Channel fabric). The transport protocol is augmented by Fibre Channel exchanges, which are managed on a per-hop basis. Retransmissions are handled by devices when exchanges are incomplete or lost, meaning that each exchange is a stateful transmission, but the protocol itself is considered stateless in modern SCSI-transport Fibre Channel.

    iSCSI, as a connection-oriented protocol, creates a nexus between an initiator and a target, and is considered stateful. In addition, SMB, NFSv4, ftp, and TCP are stateful protocols, while NFSv2, NFSv3, http, and IP are stateless protocols.

    Q. Where do CIFS/SMB come into the picture?

    A. CIFS/SMB is part of a network stack. We need to have a separate talk about network stacks and their layers. In this presentation, we were talking primarily about the physical layer of the networks and fabrics. To oversimplify network stacks, there are multiple layers of protocols that run on top of the physical layer. In the case of FC, those protocols include the control plane protocols (such as FC-SW) and the data plane protocols. In FC, the most common data plane protocol is FCP (used by SCSI, FICON, and FC-NVMe). In the case of Ethernet, those protocols also include the control plane (such as TCP/IP) and data plane protocols. In Ethernet, there are many commonly used data plane protocols for storage (such as iSCSI, NFS, and CIFS/SMB).


    Ethernet Networked Storage – FAQ

    December 8th, 2016

    At our SNIA Ethernet Storage Forum (ESF) webcast “Re-Introduction to Ethernet Networked Storage,” we provided a solid foundation on Ethernet networked storage, the move to higher speeds, challenges, use cases and benefits. Here are answers to the questions we received during the live event.

    Q. Within the iWARP protocol there is a layer called MPA (Marker PDU Aligned Framing for TCP) inserted for storage applications. What is the point of this protocol?

    A. MPA is an adaptation layer between the iWARP Direct Data Placement Protocol and TCP/IP. It provides framing and CRC protection for Protocol Data Units.  MPA enables packing of multiple small RDMA messages into a single Ethernet frame.  It also enables an iWARP NIC to place frames received out-of-order (instead of dropping them), which can be beneficial on best-effort networks. More detail can be found in IETF RFC 5044 and IETF RFC 5041.
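
    To make the framing idea a little more concrete, here is a rough Python sketch of simplified MPA frames (a 2-byte length, the payload, padding to a 4-byte boundary, and a CRC32c), with several small messages packed back to back into one buffer. It reflects our reading of RFC 5044 and is not a conforming implementation: markers are left out, and the helper names (`build_fpdu`, `parse_fpdus`) and the byte order used for the CRC field are assumptions made for this example.

```python
import struct


def _make_crc32c_table():
    """Build the lookup table for CRC32c (reflected polynomial 0x82F63B78)."""
    table = []
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ 0x82F63B78 if c & 1 else c >> 1
        table.append(c)
    return table


_CRC32C_TABLE = _make_crc32c_table()


def crc32c(data: bytes) -> int:
    """Compute CRC32c over data, one byte at a time."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _CRC32C_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def build_fpdu(ulpdu: bytes) -> bytes:
    """Hypothetical helper: length (2 bytes) + payload + pad + CRC32c (4 bytes)."""
    header = struct.pack("!H", len(ulpdu))
    pad = b"\x00" * (-(2 + len(ulpdu)) % 4)  # pad up to a 4-byte boundary
    body = header + ulpdu + pad
    # The byte order of the CRC field is an assumption for illustration.
    return body + struct.pack("!I", crc32c(body))


def parse_fpdus(segment: bytes) -> list:
    """Walk a buffer of back-to-back frames, check each CRC, return the payloads."""
    payloads = []
    offset = 0
    while offset < len(segment):
        (length,) = struct.unpack_from("!H", segment, offset)
        padded = 2 + length + (-(2 + length) % 4)
        body = segment[offset:offset + padded]
        (crc,) = struct.unpack_from("!I", segment, offset + padded)
        if crc32c(body) != crc:
            raise ValueError("CRC mismatch at offset %d" % offset)
        payloads.append(body[2:2 + length])
        offset += padded + 4
    return payloads


# Three small messages packed into a single buffer.
messages = [b"small", b"rdma", b"messages!"]
segment = b"".join(build_fpdu(m) for m in messages)
```

    Because every frame carries its own length and CRC, a receiver that knows where a frame starts can validate and place it without waiting for the data that precedes it, which is the property the answer above refers to.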

    Q. What is the API for RDMA network IPC?

    A. The general API for RDMA is called verbs. The OpenFabrics Verbs Working Group oversees the development of verbs definition and functionality in the OpenFabrics Software (OFS) code. You can find the training content from OpenFabrics Alliance here. General information about RDMA over Converged Ethernet (RoCE) is available at the InfiniBand Trade Association website. Information about Internet Wide Area RDMA Protocol (iWARP) can be found at IETF: RFC 5040, RFC 5041, RFC 5042, RFC 5043, RFC 5044.

    Q. RDMA requires TCP/IP (iWARP), InfiniBand, or RoCE to operate on with respect to NVMe over Fabrics. Therefore, what are the advantages or disadvantages of iWARP vs. RoCE?

    A. Both RoCE and iWARP support RDMA over Ethernet. iWARP uses TCP/IP, while RoCE in its routable form (RoCEv2) uses UDP/IP. Debating which one is better is beyond the scope of this webcast, but you can learn more by watching the SNIA ESF webcast, “How Ethernet RDMA Protocols iWARP and RoCE Support NVMe over Fabrics.”

    Q. 100Gb Ethernet Optical Data Center solution?

    A. 100Gb Ethernet interconnect products were first available around 2011 or 2012 in a 10x10Gb/s design (100GBASE-CR10 for copper, 100GBASE-SR10 for optical), which required thick cables and a CXP or CFP MSA housing. These were generally used only for switch-to-switch links. Starting in late 2015, the more compact 4x25Gb/s design (using the QSFP28 form factor) became available in copper (DAC), optical cabling (AOC), and transceivers (100GBASE-SR4, 100GBASE-LR4, 100GBASE-PSM4, etc.). The optical transceivers allow 100GbE connectivity up to 100m, 2km, or 10km, depending on the type of transceiver and fiber used.

    Q. Where is FCoE being used today?

    A. FCoE is primarily used in blade server deployments where there could be contention for PCI slots and only one built-in NIC. These NICs typically support FCoE at 10Gb/s speeds, passing both FC and Ethernet traffic via a connection to a Top-of-Rack FCoE switch, which parses traffic to the respective fabrics (FC and Ethernet). However, it has not gained much acceptance outside of the blade server use case.

    Q. Why did iSCSI start out mostly in lower-cost SAN markets?

    A. When it first debuted, iSCSI packets were processed by software initiators, which consumed CPU cycles and showed higher latency than Fibre Channel. Achieving high performance with iSCSI required expensive NICs with iSCSI hardware acceleration, and iSCSI networks were typically limited to 100Mb/s or 1Gb/s while Fibre Channel was running at 4Gb/s. Fibre Channel is also a lossless protocol, while TCP/IP is lossy, which caused concerns for storage administrators. Now, however, iSCSI can run on 25, 40, 50 or 100Gb/s Ethernet with various types of TCP/IP acceleration or RDMA offloads available on the NICs.

    Q. What are some of the differences between iSCSI and FCoE?

    A. iSCSI runs SCSI protocol commands over TCP/IP (except iSER, which is iSCSI over RDMA), while FCoE runs the Fibre Channel protocol over Ethernet. iSCSI can run over Layer 2 and Layer 3 networks, while FCoE is Layer 2 only. FCoE requires a lossless network, typically implemented using DCB (Data Center Bridging) Ethernet and specialized switches.

    Q. You pointed out that at least twice people incorrectly predicted the end of Fibre Channel, but it didn’t happen. What makes you say Fibre Channel is actually going to decline this time?

    A. Several things are different this time. First, Ethernet is now much faster than Fibre Channel instead of the other way around. Second, Ethernet networks now support lossless and RDMA options that were not previously available. Third, several new solutions–like big data, hyper-converged infrastructure, object storage, most scale-out storage, and most clustered file systems–do not support Fibre Channel. Fourth, none of the hyper-scale cloud implementations use Fibre Channel and most private and public cloud architects do not want a separate Fibre Channel network–they want one converged network, which is usually Ethernet.

    Q. Which storage protocols support RDMA over Ethernet?

    A. The Ethernet RDMA options for storage protocols are iSER (iSCSI Extensions for RDMA), SMB Direct, NVMe over Fabrics, and NFS over RDMA. There are also storage solutions that use proprietary protocols supporting RDMA over Ethernet.


    Benefits of RDMA in Accelerating Ethernet Storage Q&A

    March 9th, 2015

    At our recent live Webcast “Benefits of RDMA in Accelerating Ethernet Storage Connectivity,” experts from Emulex, Intel and Microsoft had an insightful discussion on the ways RDMA is having an impact on Ethernet storage. The live event was attended by nearly 200 people, and feedback was overwhelmingly positive, with several attendees thanking us for our vendor-neutral presentation and one attendee commenting that it was “Probably the most clearly comprehensible yet comprehensive webinar I’ve attended in some time.” If you missed the Webcast, it’s now available on demand. We did not have time to get to everyone’s questions, so as promised, below are answers to all of them. If you have additional questions, please ask them in the comments section in this blog and we’ll get back to you as soon as possible.

    Q. Is RDMA over RoCEv2 in production?

    A. The IBTA released the RoCEv2 Specification in September 2014. In order to support that specification, changes may be required across the RDMA stack, including firmware, drivers and operating systems. Schedules for implementation of that specification will vary by operating system. For example, the OpenFabrics Alliance (OFA) has not yet released an OpenFabrics Enterprise Distribution (OFED) version that implements that standard, although it is in process now. Once OFA completes its OFED stack implementation, the Linux distribution vendors will then incorporate and support the updated OFED stack. Implementations provided prior to full OFA and distro vendor support would be preliminary, potentially incompatible with the OFED release, and would require confirmation by the distro vendor with regard to the nature and level of support they would be providing.

    Q. I would have liked a list of Windows applications that take advantage of SMB Direct – both in a Hyper-V host or bare metal.

    A. In Windows, any file-based application can make use of SMB3 and SMB Direct due to the native file-based programming interface support. No application changes are required. For certain enterprise applications such as Hyper-V and SQL Server, SMB3 is officially supported, and more information can be found in the product catalog at www.microsoft.com.

    Q. Are there any particular benefits in using one network protocol over another for SMB Direct/RDMA (iWARP vs. RoCE vs. IB)?

    A. There are no hard and fast rules; any adapter or protocol can be suitable for many scenarios. Of the Ethernet-based protocols we considered in today’s webcast:

    • iWARP offers the benefit of operation over TCP with its reliability and routability, well-suited to a broad range of installed infrastructure.
    • RoCE offers a lightweight, efficient protocol when a DCB-enabled switched fabric is available. RoCE, however, is not routable.
    • RoCEv2 offers similar properties to RoCE, with the possibility to scale to larger routed and DCB-enabled fabrics.

    Q. Who are the vendors offering iWARP capable RNICs?

    A. Chelsio Communications has production iWARP adapters today, and both Intel and QLogic have publicly committed to future iWARP controllers.

    Q. How much testing has been done with SMB3, and in particular SMB direct, over WAN connections?

    A. The SMB2 protocol was originally designed to adapt to WAN scenarios, and supports a credit-based management of large amounts of data to be outstanding, to make best use of WAN-type long pipes. The SMB3 protocol retains these design attributes, and the SMB Direct protocol also supports similar deep pipelining. The iWARP protocol, being layered on standard TCP, is well suited to such deployments, and RoCE WAN adapters are potentially available. Please contact the respective technology vendors for information on any available testing results.
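
    The point about keeping large amounts of data outstanding on “long pipes” comes down to the bandwidth-delay product. As a back-of-the-envelope sketch (not a description of any particular SMB implementation), the Python below estimates how many bytes must be in flight to keep a link busy and how many credits that would take. The link speeds, round-trip times, and the 64 KiB-per-credit figure are illustrative assumptions, and the function names are made up for this example.

```python
import math


def bytes_in_flight(bandwidth_gbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes that must be outstanding to fill the pipe."""
    bytes_per_second = bandwidth_gbps * 1e9 / 8
    return bytes_per_second * (rtt_ms / 1e3)


def credits_needed(bandwidth_gbps: float, rtt_ms: float,
                   bytes_per_credit: int = 64 * 1024) -> int:
    """Credits required to cover the bandwidth-delay product.

    bytes_per_credit is an assumed value chosen for illustration.
    """
    return math.ceil(bytes_in_flight(bandwidth_gbps, rtt_ms) / bytes_per_credit)


# A 10 Gb/s link inside a data center (0.1 ms RTT) versus across a WAN (50 ms RTT).
lan_credits = credits_needed(10, 0.1)
wan_credits = credits_needed(10, 50)
```

    Under these assumptions the WAN case needs several hundred times more outstanding data than the LAN case, which is why deep pipelining matters on long links.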

    Q. I’d love a future webcast for RDMA-enabled distributed filesystems.

    A. Thanks for the suggestion! We’re always looking for ideas for future webcasts and SNIA-ESF will consider this as a potential follow-on.

    Q. Is Live Migration the scenario where “packet size” is 1MB?

    A. All SMB Direct scenarios have workloads that range anywhere up to 8MB. For large file copies, most SMB3 clients request from 1MB to 8MB per operation. For Hyper-V live migration, transfers are typically similar during the bulk transfer phase.

    Q. SMB3 is being compared to FC for enterprise. If Ethernet based protocols are of interest, wouldn’t FCoE give the same performance as FC (same stack) vs. SMB3?

    A. SMB3 with SMB Direct enables many workloads not possible with Fibre Channel over Ethernet, and performance comparisons are therefore difficult. Perhaps another SNIA webcast could investigate this!

    Q. Regarding your SMB direct example with lots of small operations, how do you deal with the overhead of registering and unregistering buffers for the RDMA operations?

    A. As answered later in the session, the registration and unregistration is not a protocol matter, but in the case of the Windows implementation, it is strictly performed for the specific buffers of each operation, which is critical for security, data integrity, and system protection. The standard “Fast Register Work Request” method is used, and careful implementation has shown that the overhead does not negatively impact performance, even for small I/O (4KB/operation). Check out Jose Barreto’s blog, which contains many benchmark results.

    Q. But isn’t Live Migration done in 1MB “chunks”? So not “small” I/Os?

    A. As answered later in the session, Hyper-V Live Migration is done in several phases. The first phase is the initial bulk copy of memory, done in large chunks. Immediately after it, a second phase copies the individual pages that were dirtied by the live-running VM; these operations are typically 4KB. Note: the faster the initial phase goes, the less work there is in the second phase. In both phases, faster is better, and RDMA accelerates both.
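
    To see why a faster first phase leaves less work for the second, consider the toy model below. It is only a sketch and makes no claim about how Hyper-V actually schedules its copies: it assumes a constant rate at which the VM dirties memory and a constant link speed, and every name and number in it is hypothetical.

```python
def precopy_rounds(memory_gb, dirty_rate_gbps, link_gbps,
                   stop_gb=0.01, max_rounds=30):
    """Return the amount of memory (GB) copied in each round of a toy pre-copy model.

    Round 1 copies all memory; each later round copies what was dirtied
    while the previous round was being sent.
    """
    if dirty_rate_gbps >= link_gbps:
        raise ValueError("the VM dirties memory faster than the link can copy it")
    rounds = []
    remaining = memory_gb
    while remaining > stop_gb and len(rounds) < max_rounds:
        rounds.append(remaining)
        seconds = remaining * 8 / link_gbps         # time to send this round
        remaining = dirty_rate_gbps / 8 * seconds   # GB dirtied in the meantime
    return rounds


# Same VM (16 GB, dirtying 1 Gb/s of memory), two hypothetical link speeds.
slow = precopy_rounds(16, dirty_rate_gbps=1, link_gbps=10)
fast = precopy_rounds(16, dirty_rate_gbps=1, link_gbps=100)
```

    In this model the memory left to re-copy after the bulk phase shrinks in proportion to the link speed, so the faster link both needs fewer rounds and moves less data overall.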

    Q. Are iSER and iWARP alternatives to one another?

    A.  iWARP is an RDMA protocol, and iSER is a mapping of iSCSI to iWARP, as well as RoCE/InfiniBand.

    Q. What’s Intel’s roadmap for RoCE and/or iWARP?

    A. Intel is committed to iWARP and plans to incorporate it in future server chipsets and SOCs. See http://www.intel.com/content/www/us/en/ethernet-products/accelerating-ethernet-iwarp-video.html for more information.

    Q. Is there any other transport being used other than IB to create a reliable transport for RoCEv2? From a purist’s point of view, is it possible?

    A. RoCE was developed to leverage InfiniBand as much as possible. For that reason, the InfiniBand transport was chosen when the RoCE standard was developed. As the RoCEv2 standard was developed, the underlying InfiniBand network protocol was replaced with IPv4/IPv6 in order to provide Layer 3 routability, and with UDP to provide stateless encapsulation (and indication) of the InfiniBand transport header that was retained. While it may be possible to develop a reliable transport to replace InfiniBand, the RoCE standards body has elected not to go that route as of this writing.
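
    One way to picture the change described above is to list what wraps an RDMA payload in each version. The Python sketch below does that using the header sizes as we understand them from the published specifications; it assumes IPv4 without options, no VLAN tag, and no InfiniBand extended transport headers, and the variable and function names are invented for this example.

```python
# (layer name, size in bytes) for the headers and trailers around the payload.
ROCE_V1 = [
    ("Ethernet header (EtherType 0x8915)", 14),
    ("InfiniBand Global Route Header", 40),
    ("InfiniBand Base Transport Header", 12),
    ("InfiniBand ICRC", 4),
    ("Ethernet FCS", 4),
]

ROCE_V2 = [
    ("Ethernet header (EtherType for IPv4)", 14),
    ("IPv4 header", 20),
    ("UDP header (destination port 4791)", 8),
    ("InfiniBand Base Transport Header", 12),
    ("InfiniBand ICRC", 4),
    ("Ethernet FCS", 4),
]


def overhead(layers):
    """Total bytes of headers and trailers around the payload."""
    return sum(size for _, size in layers)


def retained(old, new):
    """Names of the InfiniBand layers that appear in both encapsulations."""
    new_names = {name for name, _ in new}
    return [name for name, _ in old
            if name in new_names and name.startswith("InfiniBand")]


kept = retained(ROCE_V1, ROCE_V2)
```

    The `kept` list shows that the InfiniBand transport header and ICRC carry over unchanged; only the network layer beneath them is swapped for IP and UDP.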


    Real-World FCoE Best Practices Q&A

    December 19th, 2014

    At our recent live Webcast “Real-World FCoE Designs and Best Practices,” IT leaders from Thermo Fisher Scientific and Gannett Co. shared their experiences from their FCoE deployments – one single-hop, one multi-hop. It was a candid discussion on the lessons they learned. If you missed the Webcast, it’s now available on demand. We polled the audience to see what stage of FCoE deployment they’re in (see the poll results at the end of this blog). Just over half said they’re still in learning mode. To that end, here are answers to the questions we got during the Webcast. As you will see, many of these questions were directed to our guest end users regarding their experiences. I hope that it will help you in your journey. If you have additional questions, please ask them in the comments section in this blog and we’ll get back to you as soon as possible.

    Q. Have any issues come up where the storage team needed to upgrade SAN switch firmware to solve a problem, but the network team objected to upgrading the FCFs?  This assumes a shared firmware release on both network and SAN switch products (i.e. Cisco NX-OS)

    A. No. We need to work together as a team, so as long as it is planned out in advance, this has not been an issue.

    Q. Is there any overhead at the host CPU level when using FCoE/CNA vs. using FC/HBA? Has anyone done any benchmarking on this?

    A. To the host OS, the CNA presents an HBA and a 10G Ethernet adapter, so the host OS sees no difference from what is normally presented for Ethernet and FC adapters. In a software FCoE implementation there might be additional overhead, but you should check with the OS vendor for information on the particular implementation.

    Q. Are there any high-level performance considerations when compared to typical FC SAN? Any obvious impact to IO latency as hosts are moved to FCoE compared to FC?

    A. There is a performance increase in comparison to 8Gb Fibre Channel, since 10Gb FCoE uses Ethernet with 64b/66b encoding, whereas native 8Gb FC uses 8b/10b encoding. On dedicated links, 10Gb FCoE could deliver around a 50% increase in performance over 8Gb FC.
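
    The rough arithmetic behind that figure can be sketched in a few lines of Python. The nominal line rates used here (8.5 GBaud for 8Gb Fibre Channel and 10.3125 GBaud for 10Gb Ethernet) are not stated in the answer above and are supplied as assumptions; the calculation also ignores frame headers and other protocol overhead, so treat the result as an estimate.

```python
def payload_rate_gbps(line_rate_gbaud, data_bits, coded_bits):
    """Usable bit rate after line encoding (e.g. 8 data bits per 10 coded bits)."""
    return line_rate_gbaud * data_bits / coded_bits


# Assumed nominal line rates; see the note above.
fc_8g = payload_rate_gbps(8.5, 8, 10)          # 8b/10b encoding
fcoe_10g = payload_rate_gbps(10.3125, 64, 66)  # 64b/66b encoding

gain = fcoe_10g / fc_8g - 1
print(f"8Gb FC: {fc_8g:.1f} Gb/s, 10Gb FCoE: {fcoe_10g:.1f} Gb/s, gain: {gain:.0%}")
```

    With these inputs the gain comes out a little under 50%, in line with the estimate given above.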

    Q. Have you planned to use 40G FCoE in your edge-core design?

    A. We have purchased the hardware to go 40G if we choose to.

    Q. Was DCB used to isolate the network traffic from FC traffic at the CNA?

    A. DCB is a set of technologies that includes DCBX, PFC, ETS that are used with FCoE.

    Q. Was FCoE implemented on existing hosts or just on new ones being added to the SAN?

    A. Only on new hosts.

    Q. Can you expand on Domain_ID sprawl?

    A. In FC or FCoE fabrics, each storage vendor supports only a certain number of switches per fabric. Each full FC or FCoE switch will consume a Domain ID, so it is important to consider how many switches or Domain IDs are allowed in a supported fabric based on the storage vendor’s fabric recommendations. There are technologies such as NPIV, as well as vendor-specific technologies, that can help limit Domain ID sprawl in your fabrics.
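
    For readers wondering where the Domain ID lives, a Fibre Channel address is 24 bits wide, and the top 8 bits identify the switch (the domain) that the device is logged in to. The Python sketch below splits a few made-up addresses into their fields and counts the domains in use. The sample addresses and helper names are hypothetical, and the number of domains a vendor will actually support in one fabric is typically well below the 239 the address format allows.

```python
from typing import NamedTuple


class FcAddress(NamedTuple):
    domain: int  # identifies the switch
    area: int
    port: int


def parse_fcid(fcid: int) -> FcAddress:
    """Split a 24-bit Fibre Channel address into domain, area and port."""
    if not 0 <= fcid <= 0xFFFFFF:
        raise ValueError("a Fibre Channel address is 24 bits")
    return FcAddress((fcid >> 16) & 0xFF, (fcid >> 8) & 0xFF, fcid & 0xFF)


def domains_in_use(fcids):
    """Sorted list of distinct Domain IDs seen in a set of addresses."""
    return sorted({parse_fcid(fcid).domain for fcid in fcids})


# Made-up addresses: two devices on one switch, one each on two others.
sample = [0x010200, 0x010300, 0x0A0100, 0xEF0001]
used = domains_in_use(sample)
```

    Each additional full switch adds a new value to that list, which is the sprawl being described; techniques like NPIV let an edge device attach without consuming a Domain ID of its own.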

    Remember the poll I mentioned during the Webcast? Here are the results. Let us know where you are in your FCoE deployment plans.

    [Figure: poll results showing attendees’ stage of FCoE deployment]


    Webcast Preview: End Users Share their FCoE Stories

    December 9th, 2014

    Fibre Channel over Ethernet (FCoE) has been growing in popularity year after year. From access layer, to multi-hop and beyond, FCoE has established itself as a true solution in the data center.

    Are you interested in learning how customers are using FCoE? Join us on December 10th, at 3:00 pm ET, 1:00 pm PT for our live Webcast, “Real World FCoE Designs and Best Practices.” This live SNIA Webcast examines the most used FCoE designs and looks at how they are being used in real-world customer implementations. You will hear from two IT leaders who have implemented FCoE and why they did so. We will cover:

    • Real-world Use Cases and Customer Implementations of:
      • Single-Hop FCoE
      • Multi-Hop FCoE
      • Use of FCoE for Inter-Switch Links (ISLs)

    This will be a vendor-neutral live presentation. Please join us on December 10th and bring your questions for our panel.


    Expanding Your Data Center with FCoE – Q&A

    September 3rd, 2014

    At our recent live ESF Webcast, “Expert Insights: Expanding the Data Center with FCoE,” we examined the current state of FCoE and looked at how this protocol can expand the agility of the data center. If you missed it, it’s now available on-demand. We did not have time to address all the questions, so here are answers to them all. If you think of additional questions, please feel free to comment on this blog.

    Q. You mentioned using 40 and 100G for inter-switch links.  Are there use cases for end point (FCoE target and initiator) 40 and 100G connectivity?

    A. Today most end points are only supporting 10G, but we are starting to see 40G server offerings enter the market, and activity among the storage vendors designing these 40G products into their arrays.

    Q. What about interoperability between FCoE switch vendors?

    A. Each switch vendor has its own support matrix, and each would need to be examined independently.

    Q. Is FCoE supported on copper cable?

    A. Yes, FCoE supports “Twin Ax” copper and is widely used for server to top of rack switch connections to seven meters.  In fact, Converged Network Adapters are now available that support 10GBASE-T copper cables with the familiar RJ-45 jack.  At least one major switch vendor has qualified FCoE running over 10GBASE-T to 30 meters.

    Q. What distance does FCoE support?

    A. Distance limits are dependent on the hardware in use and the buffering available for Priority Flow Control. The lengths can vary from 3m up to over 80km. Top of rack switches would fall into the 3m range while larger class switch/directors would support longer lengths.
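
    To give a feel for why buffering limits distance, remember that a pause request takes time to reach the sender, and data already on the wire keeps arriving in the meantime. The Python sketch below estimates the data in flight over one round trip. It is a simplification that ignores the sender’s reaction time and any frame already being transmitted, and the 5 microseconds per kilometer propagation figure and the function name are assumptions for illustration.

```python
def round_trip_bytes(distance_km, rate_gbps=10.0, us_per_km=5.0):
    """Bytes that can be in flight during one round trip on the link.

    us_per_km is an assumed propagation delay for optical fiber.
    """
    round_trip_seconds = 2 * distance_km * us_per_km * 1e-6
    return rate_gbps * 1e9 / 8 * round_trip_seconds


top_of_rack = round_trip_bytes(0.003)  # 3 m cable
long_haul = round_trip_bytes(80)       # 80 km link
```

    At 10Gb/s, the 3m case works out to a few dozen bytes while the 80km case is on the order of a megabyte per lossless priority, which helps explain why only the larger switches and directors support the longer distances.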

    Q. Can FCoE take part in management/orchestration by OpenStack Neutron?

    A. As of this writing there are no OpenStack extensions in Neutron for FCoE-specific plugins.

    Q. So how is this FC-BB-6 different than FIP snooping?

    A. FIP Snooping is a part of FC-BB-5 (Appendix D), which allows switch devices to identify an FCoE Frame format and create a forwarding ACL to a known FCF. FC-BB-6 creates additional architectural elements for deployments, including a “switch-less” environment (VN2VN), and a distributed switch architecture with a controlling FCF. Each of these cases is independent from the other, and you would choose one instead of the others. You can learn more about VN2VN from our SNIA-ESF Webcast, “How VN2VN Will Help Accelerate Adoption of FCoE.”

    Q. You mentioned DCB at the beginning of the presentation. Are there other purposes for DCB? Seems like a lot of change in the network to create a DCB environment for just FCoE. What are some of the other technologies that can take advantage of DCB?

    A. First, DCB is becoming ubiquitous. Unlike the early days of the standard, when only a few switches supported it, today most enterprise switches support DCB protocols. As for other use cases, iSCSI benefits from DCB, since it eliminates dropped packets and the TCP/IP backoff algorithm they trigger, smoothing out response time for iSCSI traffic. There is also a protocol known as RoCE, or RDMA over Converged Ethernet. RoCE requires the lossless fabric DCB creates to achieve consistent low latency and high bandwidth. This is basically the InfiniBand API running over Ethernet. Microsoft’s latest version of its file serving protocol, SMB Direct, and Hyper-V Live Migration can utilize RoCE. There is also an extension to iSCSI known as iSER, which replaces TCP/IP with RDMA for the iSCSI datamover, enabling all iSCSI reads and writes to be done as RDMA operations using RoCE.

    Q. Great point about RoCE. iSCSI RDMA (iSER) requires DCB if the adapters support RoCE, right?

    A. Agreed. Please see the answer above to the DCB question.

    Q. Did that Boeing Aerospace diagram still have traditional FC links, and if yes, where?

    A. There was no Fibre Channel storage attached in that environment. Having the green line in the legend was simply to show that Fibre Channel would have its own color should there be any links.

    Q. What is the price of a 10Gbps CNA compared to a 10Gbps NIC?

    A. Price is dependent on vendor and economics. But, there are several approaches to delivering the value of FCoE which can influence pricing:

    • Purpose-built silicon that offloads the FC and Ethernet protocol functions offers a number of advantages, including high performance, low CPU overhead, and advanced features, though even this depends on the vendor’s implementation. These added features come with the expectation of additional cost. However, the protocols have to be processed somewhere, and if you need your server CPUs to process applications instead of network protocols, then the value is justified.
    • With the introduction of Open FCoE drivers with DCB-capable NICs, new options are available for customers to deploy FCoE at the host. Open FCoE places the FC processing on the host CPU, and standard 10GbE NICs with DCB support handle the Ethernet transport functions. Where you have excess CPU capacity on your server, you might be in a position to reduce costs by deploying a software driver with a 10GbE or faster NIC enhanced with the limited set of hardware offloads necessary to achieve full performance with Open FCoE. However, Open FCoE isn’t available with every OS or every NIC, so you need to consider OS support and availability.
    • A third consideration is that most enterprise servers include some form of advanced 10GbE networking on the motherboard that either supports purpose built silicon or DCB enabled silicon. So, depending upon which server and OS you deploy, you may have several options via embedded silicon.


    Upcoming Webcast: Is FCoE the Answer to Data Center Agility?

    August 4th, 2014

    Fibre Channel over Ethernet (FCoE) has been growing in popularity year after year. From access layer, to multi-hop and beyond, FCoE has established itself as a true solution in the data center.

    Interested in learning how the Data Center is expanding with FCoE? Join us on August 20th, at 4:00 pm ET, 1:00 pm PT for our live Webcast, “Expanding the Data Center with FCoE.” Continuing the conversation from our February Webcast, “Use Cases for iSCSI and FCoE” (now available on demand), this live SNIA Webcast examines the current state of FCoE and looks at how this protocol can expand the agility of the data center.

    We’ll take an unbiased look at the data center using FCoE, covering:

    • The history and evolution of convergence
    • Using FCoE as a storage overlay
    • Single-hop, multi-hop and beyond
    • 40G/100G  – Where does it fit
    • Futures:
      • OpenStack
      • Defining Network Functions Virtualization (NFV)
      • Mapping NFV to FCoE
    • Real-world Use Cases

    This will be a vendor-neutral live presentation. Please join us on August 20th and bring your questions for our expert panel. Register now.


    Ethernet Meets Enterprise Storage – Finally

    May 27th, 2014

    Presumptuous, yes, because Ethernet has been a mainstay in enterprises since its early days over 40 years ago.  It initially grew to prominence as the local area network (LAN) connection in the enterprise. More recent advances have enabled Ethernet to become a standard for mission critical storage connectivity for block, file and object storage in many enterprises.

    Block storage in large enterprises has long been focused on Fibre Channel due to its performance capabilities. In order to bring the same performance benefits to Ethernet, the IEEE 802.1 Data Center Bridging Task Group proposed a number of new standards to enhance Ethernet reliability. For example, 802.1Qbb Priority-based Flow Control (PFC) provides a link-level flow control mechanism to ensure lossless transmission under congestion, 802.1Qaz Enhanced Transmission Selection (ETS) provides a management framework for prioritized bandwidth, and Data Center Bridging Exchange Protocol (DCBX) lets neighbors negotiate these features to ensure consistency on the network. Collectively, these and other enhancements have brought enterprise-class storage networking features to the Ethernet platform.

    In addition, the International Committee for Information Technology Services (INCITS) T11 Fibre Channel committee developed a specification for Fibre Channel over Ethernet (FCoE) in its FC-BB-5 standard in 2009, which allows the Fibre Channel protocol to run directly on top of Ethernet, eliminating the TCP/IP stack and allowing for efficient performance of the Fibre Channel protocol.  FCoE also depends on the Data Center Bridging standards from IEEE 802.1 in order to ensure the “losslessness” and flow control needed by Fibre Channel.

    An alternative to FCoE, iSCSI, was designed to run over standard Ethernet with TCP/IP and was designed to tolerate the “lossy” aspects of Ethernet.  Its architecture and the additional layers of encapsulation involved can impact latency and performance. However, more recent innovations in iSCSI have enabled it to run over a DCB Ethernet network, which enables iSCSI to inherit some of the enterprise storage features which have always been inherent in Fibre Channel.  For more on this, read last year’s blog “How DCB Makes iSCSI Better ” from Allen Ordoubadian.

    In 2013, INCITS submitted the FC-BB-6 standard for review which introduced, among other things, the VN2VN standard.  The VN2VN proposal will allow FCoE to work in a standard DCB switching environment without the presence of a Fibre Channel Forwarder (FCF).  An FCF allows for bridging between servers which are communicating with FCoE and storage devices which are communicating with traditional Fibre Channel.  As DCB switches and FCoE storage become more prevalent, the FC-BB-6 standard will allow for end-to-end FCoE connectivity in either a point to point (P2P) or DCB mesh environment. This will result in lower cost for FCoE environments. Products are beginning to appear which support VN2VN and over the next 18 months it is likely that all major vendors will support it. Check out our ESF Webcast “How VN2VN Will Help Accelerate Adoption of FCoE” for more details.

    The availability of CNAs with processing capability allows for offloading storage protocol processing from the host processor, though some CNAs use host-based storage protocol initiators in system software and do selective stateless offloads in the data path.  Both FCoE and iSCSI require the storage protocol to be encapsulated in a frame to be sent across the Ethernet network.  In an enterprise environment, especially a virtual server environment, CPU utilization is tracked closely and target CPU thresholds are often set.  Anything which can minimize spikes in CPU utilization can allow for more workloads to be placed on servers and allows for predictable energy consumption.

    For file storage, Ethernet has traditionally been the connectivity option of choice for file servers used as “shares” for centralized employee document storage. In the 21st century, usage of network attached storage (NAS) with the Network File System (NFS) has increased for enterprise databases and Hadoop clusters, especially with the availability of 10Gb Ethernet.  New features in NFS 4 and later introduced security and stateful protocol support after development of NFS was taken over by the Internet Engineering Task Force (IETF).

    Object storage has been around for nearly 20 years as a repository for storing data as objects, which include not only the original file but also a globally unique identifier and metadata that describes the object and its various parameters.  It has been used to store many forms of unstructured data, but found niches in certain areas, such as legal documents with retention policies and archiving photos and videos.  More recently, there seems to be a resurgence in object storage as the amount of unstructured data generated by enterprises continues to skyrocket.  Open source object storage implementations in Ceph and OpenStack are also helping to drive adoption. SNIA ESF is hosting a live Webcast on object storage on June 11, 2014, called “Object Storage 101.” I encourage you to register for this presentation for an unbiased look at the what, how and why of object storage technologies.

    When combined with the advances in link speed, throughput capabilities, latency and input/output operations per second (IOPS) in modern 10Gb/s and 40Gb/s Ethernet, these existing and emerging Ethernet standards and storage architectures are having a profound effect on the viability of Ethernet as an enterprise-class storage networking platform.  Vendors and customers are seeing the advantage in one wire, the Ethernet cable, carrying all LAN, WAN and storage traffic.


    Use Cases for iSCSI and FCoE – Your Questions Answered

    March 11th, 2014

    We had a tremendous response to our recent Webcast “Use Cases for iSCSI and FCoE – Where Each Makes Sense.” We had a lot of questions that we didn’t have time to address, so here are answers to them all. If you think of additional questions, please feel free to comment on this blog.

    Q. You stated that FCoE requires End to End DCB connectivity.  That is not entirely true if you have native Fibre Channel storage. 

    Once native FC is added, it is a hybrid FCoE/native FC network, not a simple FCoE network.  To be clearer I could’ve stated that for FCoE all Ethernet links traversed must be DCB enabled.

    Q. Any impact on the protocol choice if you bring SDN solutions with overlay networks using VXLAN or NVGRE within virtual switching in hypervisors into the picture?

    An excellent question, but complicated enough that it probably deserves a discussion on its own.  Overlay networks encapsulate Ethernet frames into routable packets.  Under strict adherence to OSI layering, that means L2 constructs like Data Center Bridging become “invisible” until decapsulation.  You lose the “lossless,” low-latency behavior that FCoE expects and that iSCSI may be taking advantage of, depending on your implementation.  That doesn’t really favor one protocol over the other, but FCoE may lose advantages it has over iSCSI when confined to a single L2 subnet.  Unfortunately, the real answer to your question requires that you investigate in detail how the system software you are using handles encapsulated storage packets for both block storage protocols.  Microsoft’s Hyper-V is different from VMware’s vSphere, and each flavor of SDN could be different as well.  Proceed with caution.
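
    To make the “invisible until decap” point concrete, here is a minimal sketch of VXLAN encapsulation. The header layout follows RFC 7348; the inner frame, the priority value 3, VLAN 100 and VNI 5000 are illustrative assumptions, not values from the webcast.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag defined in RFC 7348


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VXLAN Network Identifier."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits. Word 2: VNI (24 bits) + 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Return the UDP payload of a VXLAN packet: VXLAN header + original L2 frame."""
    return vxlan_header(vni) + inner_frame


# Hypothetical inner frame: dst MAC, src MAC, 802.1Q tag, FCoE EtherType, payload.
PRIORITY = 3                                # illustrative 802.1p priority
vlan_tci = (PRIORITY << 13) | 100           # PCP bits plus VLAN ID 100
inner = bytes(6) + bytes(6) + struct.pack("!HHH", 0x8100, vlan_tci, 0x8906) + b"payload"

packet = encapsulate(inner, vni=5000)

# Outer headers added on an IPv4 underlay: Ethernet (14) + IP (20) + UDP (8) + VXLAN (8).
OVERHEAD = 14 + 20 + 8 + 8
print(f"inner frame: {len(inner)} bytes, encapsulation overhead: {OVERHEAD} bytes")
```

    The priority bits that Priority Flow Control acts on sit inside the UDP payload here, so a device in the underlay only sees them if the encapsulating endpoint copies them into the outer headers. Whether it does is exactly the implementation detail you need to check for your hypervisor and SDN stack.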

    Q. Have you heard of any enterprise customers who are interested in NIC Partitioning to separate iSCSI, FCoE, and typical network traffic?  If so, can you provide information about those customers’ use cases?

    We have not come across many customers that are interested in large-scale deployments yet.

    Q. What are the use cases for using standalone FCoE switches in SAN keeping aside Cisco UCS and Blade Servers?

    There are two ways to look at this:

    1) To use FCoE as an end-to-end (initiating server to target storage array) solution instead of, or to replace, Fibre Channel. Although not very prevalent to date, this option is chosen to create a single converged LAN/SAN network that essentially retains the native FC constructs. The potential benefit is a reduction in the amount of equipment required and in the resources needed to deploy and administer two separate networks. This can be done in a phased approach that uses multiprotocol switches, able to be used as Ethernet, FC or both on every port.  This provides future proofing, reduced qualification costs, and lower OPEX by no longer requiring the purchase of multiple switches of different protocols.

    2) To continue the use of FC for connectivity from the Top of Rack switch to the storage arrays, but use FCoE connectivity for server access. This is much more prevalent, and even when deployed outside of the Cisco UCS blade servers, is used to increase flexibility in highly virtualized server environments or multi-tenancy, where workloads/VMs from the same physical servers need to connect to different storage types.

    Q. How do iSCSI and FCoE switches handle redundancy?  With FC, it is a best practice to implement dual fabrics with each storage system and server with paths down each.

    Physical topology can be identical.  A storage system has one set of targets (either IP addresses or FCoE targets) on one switch and other targets on the other switch.  The initiators are configured to see any targets available on that leg.

    To prevent Ethernet broadcast storms, technologies like per VLAN Spanning Tree and link aggregation are used.  TRILL can also be used.  For more details, I recommend reading this blog post by J Metz of Cisco.  http://blogs.cisco.com/datacenter/understanding-fcoe-and-trill-the-easy-way/
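
    The dual-leg layout described above can be sanity-checked with a few lines of code. This is only a sketch: the inventory format, host names and fabric labels below are hypothetical, and a real check would pull paths from your multipath software.

```python
from collections import defaultdict

# Hypothetical inventory: one (initiator, fabric, target) tuple per configured path.
PATHS = [
    ("server1", "A", "array1-portA"),
    ("server1", "B", "array1-portB"),
    ("server2", "A", "array1-portA"),
]


def missing_fabrics(paths, fabrics=("A", "B")):
    """Return {initiator: [fabrics with no path]} for initiators lacking redundancy."""
    seen = defaultdict(set)
    for initiator, fabric, _target in paths:
        seen[initiator].add(fabric)
    required = set(fabrics)
    return {
        initiator: sorted(required - found)
        for initiator, found in seen.items()
        if required - found
    }


print(missing_fabrics(PATHS))  # {'server2': ['B']}
```

    The same check applies whether the legs are FC fabrics, FCoE VLANs or iSCSI subnets, which is the point of the answer: the physical redundancy model does not change with the protocol.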

    Q. Doesn’t increasing CPU mean software processing for FCoE and iSCSI at both endpoints can reduce costs considerably (i.e. no full HBA functionality needed at the endpoints)?

    Absolutely.  If you have CPU cycles to spare at both endpoints, there is no reason to take on the extra cost of offload.  However, remember the principle behind Moore’s law also works on things like network adapters and HBAs.  It isn’t unreasonable to think that full offload capabilities will be included by default in a few years as technology progresses.  And even if they aren’t, the actual application of Moore’s law will push the difference in CPU utilization to be trivial.

    Q. How do large data centers configure and manage iSCSI?  Is it by configuring the initiators and targets? My understanding is that most installations don’t use iSNS.  Is this true?

    It is true that most implementations of iSCSI don’t use iSNS.  iSCSI initiators are simply configured with the target address by the administrator.  In the FC world, SNS is simply there, but the iSCSI equivalent, iSNS, has always been optional.  (SNS stands for Simple Name Server.  It is a service that helps initiators find targets.)
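
    With statically configured targets, a quick reachability check of each portal is often the first troubleshooting step. The sketch below assumes the targets listen on TCP port 3260, the IANA-registered iSCSI port; the portal addresses are placeholders from the documentation range, and the helper names are my own.

```python
import socket

ISCSI_PORT = 3260  # IANA-registered iSCSI target port


def portal_reachable(host: str, port: int = ISCSI_PORT, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the portal can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_portals(portals):
    """Map each statically configured portal address to its reachability."""
    return {host: portal_reachable(host) for host in portals}


# Hypothetical static configuration (RFC 5737 documentation addresses):
# check_portals(["192.0.2.10", "192.0.2.11"])
```

    A successful TCP connection only shows that the portal is listening. It says nothing about login, CHAP or LUN masking, so treat it as a first filter rather than a full health check.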

    Q. I have been doing a lot of testing to compare iSCSI to FC and noticed that as we move from traditional storage to SSD-based storage the IOPS increase faster for FCoE. For example, 18K+ for FCoE vs. 12K for iSCSI. Have you seen similar results?

    I have seen some similar results. However, I’ve also seen some that don’t necessarily line up with that.  I haven’t had the time to research this topic.  Sounds like a good topic for a future post.

    Q. Do you have any information about the number of customers who use FCoE Boot and iSCSI Boot?

    Unfortunately, I don’t.  I do have anecdotal evidence suggesting that customers using full offload are more likely to boot from SAN.  Since more full-offload FCoE adapters are in use than full-offload iSCSI adapters today, it makes sense that more are booting over FCoE than iSCSI, but again, I don’t have any hard evidence to support that.

    Q. What about iSCSI over RoCE?

    There are three network/fabric technologies that use RDMA: InfiniBand, iWARP, and RoCE.  You can run iSCSI over any of these using the open-source iSER code supported by the Open Fabrics Alliance (https://www.openfabrics.org ).  iSER has been written to OFA’s “verbs” for RDMA (rather than to the more familiar “sockets”).  However, note that of these three underlying transports, only iWARP is truly routable in general.  So technically you could implement iSER on InfiniBand or RoCE, but it may not do for you what you expect iSCSI to do for you, i.e., go anywhere the internet goes.

    Q. How does FCIP compare with iSCSI for long distance requirements?

    FC networks rely on guaranteed packet delivery to deliver low-latency, predictable performance. IP networks are best-effort networks that allow for dropped packets with transmission retries. Given the possibility of packet loss and the latency that retries add, FCIP has experienced limited adoption. It is useful where required, but it is typically not a core part of the infrastructure. If cost is a concern and long distance is required as part of the solution, then iSCSI is the better choice, as it is designed to allow for lossy networks.

    Q. Slide 22 – Was that hardware based iSCSI or software based iSCSI?

    What was shown in the chart was software-based iSCSI; however, you would see similar results with hardware-based iSCSI.

    Q. What about FC vs FCoE performance? Any numbers?

    Both Fibre Channel and FCoE can achieve line rate.  Here’s an example of testing Yahoo! did on an 8Gb FC HBA and a 10 GbE CNA that showed exactly that result: http://www.intel.com/content/www/us/en/network-adapters/10-gigabit-network-adapters/10-gbe-ethernet-yahoo-case-study.html .  So as Fibre Channel moves to 16 Gbps, it will outperform a 10GbE CNA, at least for peak performance.  However, the tables turn with a 40 GbE CNA, several of which are in production now.
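
    The ordering in this answer falls out of simple arithmetic on signalling rate and line encoding. The sketch below uses the commonly quoted nominal figures (8.5 GBaud with 8b/10b for 8GFC, 14.025 GBaud with 64b/66b for 16GFC, 10.3125 GBaud per lane with 64b/66b for Ethernet); treat them as assumptions to verify against the FC-PI and IEEE 802.3 specifications, and note that framing and protocol overhead are ignored.

```python
# name: (signalling rate in GBaud, payload bits, coded bits) -- nominal figures
LINKS = {
    "8GFC": (8.5, 8, 10),
    "10GbE": (10.3125, 64, 66),
    "16GFC": (14.025, 64, 66),
    "40GbE": (4 * 10.3125, 64, 66),  # four 10.3125 GBaud lanes
}


def data_rate_gbps(baud: float, payload_bits: int, coded_bits: int) -> float:
    """Usable bit rate after removing line-encoding overhead."""
    return baud * payload_bits / coded_bits


for name, params in LINKS.items():
    print(f"{name:>6}: {data_rate_gbps(*params):.1f} Gb/s after encoding")
```

    Under these assumptions the links come out at roughly 6.8, 10, 13.6 and 40 Gb/s, which matches the progression described: 10GbE ahead of 8GFC, 16GFC ahead of 10GbE, and 40GbE ahead of both.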

    Q. Do you see SR-IOV used currently or in the future to separate FCoE or iSCSI from standard LAN traffic?

    So far we have seen that with the exception of a few operating systems (e.g., AIX), SR-IOV support today is network only.  Additionally, most customers want guaranteed bandwidth for storage and they wouldn’t be willing to run it on the same port as heavy NIC traffic.

    Q. Are you aware of any FCoE targets for Windows?

    I’m not aware of any right now.

    Q. What is the max IOPS (at 4K) you can push thru 10G FCoE and iSCSI? Max latency (at 512 bytes)?

    Latency is not determined by the pipe.
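
    As a rough illustration of why, the sketch below computes two bounds for a 10 Gb/s link: the IOPS ceiling if every bit on the wire were 4K payload, and the time needed just to serialize 512 bytes. Both ignore all protocol and framing overhead, so real numbers will be lower for IOPS and higher for latency; the function names are my own.

```python
def iops_ceiling(link_gbps: float, io_bytes: int) -> float:
    """Upper bound on IOPS if the link carried nothing but I/O payload."""
    return link_gbps * 1e9 / 8 / io_bytes


def serialization_us(link_gbps: float, io_bytes: int) -> float:
    """Microseconds needed to clock io_bytes onto the wire."""
    return io_bytes * 8 / (link_gbps * 1e9) * 1e6


print(f"4K IOPS ceiling at 10 Gb/s: {iops_ceiling(10, 4096):,.0f}")
print(f"512-byte serialization time at 10 Gb/s: {serialization_us(10, 512):.2f} us")
```

    The wire time for a 512-byte transfer is well under a microsecond, so the latency an application sees is dominated by the initiator stack, the switches and above all the storage system, not by the pipe itself.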

    Q. Does FCoE really require a CNA? What about software only FCoE drivers?

    Open FCoE does exist, but most FCoE implementations today use CNAs.  I do expect the adoption of FCoE software solutions to increase fairly substantially.  A lot of it comes down to the choice of booting via FCoE or another method.

    Q. Do you think that the difference in FCoE/iSCSI usage for different App tiers can be related to the performance of the protocols?

    Objectively, no.  Either protocol implemented can be configured to hit or exceed a performance number.  In my opinion, market perception of the protocols has more to do with the tier assignment than anything technical.

    Q. Doesn’t 32 GbFC make it competitive with 40GbE FCoE?

    From a purely technical perspective it helps, but FCoE is often deployed to reduce costs by simplifying cabling and switching by converging IP and storage onto the same fabric.  32Gb FC is slower than 40Gb and does nothing to reduce costs.  Unless 32Gb FC is significantly less expensive than 40 Gb Ethernet on a per port basis, market forces are going to push towards Ethernet.  There are still plenty of cases where organizations may deploy 32Gb FC instead of FCoE, but again, those criteria will mostly be non-technical.

    Thanks to all my SNIA-ESF colleagues and Dell’Oro Group for helping me with these answers. If you missed the original Webcast, you can watch it on-demand here. You can also download a copy of the slides.


    Why the FCoE – iSCSI Debate Continues

    February 11th, 2014

    This is my first blog post for SNIA-ESF.  As a Principal Storage Architect, I have been doing extensive research on the factors that are driving FCoE vs. iSCSI choices over the last several years. The more I dive into the topic, the more intriguing the debate becomes. In fact, this blog is a preview of an upcoming white paper I’m writing and a Webcast SNIA is hosting on February 18th. If you agree this debate is interesting, I encourage you to attend. Details on the Webcast are at the end of this post.

    A Look Back at FCoE and iSCSI History

    There are two entrenched standards for block storage protocols over Ethernet networks.  FCoE was ratified in 2009, while iSCSI was ratified in 2004.  Of course, various vendors and early adopters supported these protocols before ratification, so the history of each protocol is a couple of years longer than its ratification date suggests.  While iSCSI simply encapsulates the SCSI protocol in IP, FCoE operates lower in the network stack and required many enhancements to Ethernet to do so.  While iSCSI runs on any IP network (mostly Ethernet these days), FCoE requires Data Center Bridging and Converged Network Adapters all running at 10 Gbps or faster.
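
    The difference in stack position shows up in how each protocol is identified on the wire: FCoE by its own EtherType directly in the Ethernet header, iSCSI by a TCP port inside an IP packet. Here is a minimal sketch of a classifier, assuming the registered values (EtherType 0x8906 for FCoE, 0x8914 for FIP, TCP port 3260 for iSCSI). It handles at most one 802.1Q tag and IPv4 only, and the function name is hypothetical.

```python
import struct

ETHERTYPE_VLAN = 0x8100
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_FCOE = 0x8906
ETHERTYPE_FIP = 0x8914   # FCoE Initialization Protocol
IPPROTO_TCP = 6
ISCSI_PORT = 3260


def classify(frame: bytes) -> str:
    """Label an Ethernet frame as FCoE, FIP, iSCSI, other IP, or other."""
    offset = 12  # skip destination and source MAC addresses
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    if ethertype == ETHERTYPE_VLAN:
        offset += 4  # skip the 802.1Q tag
        (ethertype,) = struct.unpack_from("!H", frame, offset)
    if ethertype == ETHERTYPE_FCOE:
        return "FCoE"          # storage traffic identified at layer 2
    if ethertype == ETHERTYPE_FIP:
        return "FIP"
    if ethertype == ETHERTYPE_IPV4:
        ip = offset + 2
        header_len = (frame[ip] & 0x0F) * 4
        if frame[ip + 9] == IPPROTO_TCP:
            sport, dport = struct.unpack_from("!HH", frame, ip + header_len)
            if ISCSI_PORT in (sport, dport):
                return "iSCSI"  # storage traffic identified at layer 4
        return "other IP"
    return "other"


fcoe_frame = bytes(12) + struct.pack("!H", ETHERTYPE_FCOE) + bytes(46)
print(classify(fcoe_frame))  # FCoE
```

    Because iSCSI is only recognizable once you parse IP and TCP, it can cross routers like any other TCP flow, while an FCoE frame has no IP header to route on and stays within its L2 domain.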

    All of the Data Center Bridging enhancements that make FCoE possible, like lossless Ethernet, benefit all of the protocols using Ethernet as the transport protocol.  DCB doesn’t just make FCoE possible, but it improves iSCSI at the same time  (see the SNIA-ESF blog, How DCB Makes iSCSI Better). So given that modern servers, networks, and storage may all be connected by hardware capable of running FCoE, that same network is also able to run iSCSI, as well as other network traffic.  Nothing precludes them from running simultaneously on the same network either.  The leading storage vendors that offer both FCoE and iSCSI target systems allow administrators to present the same LUN over either protocol with little effort, so a transition from one protocol to the other is not difficult.

    Strengths and Weaknesses

    So which network protocol is the right choice?

    Each protocol has strengths and weaknesses when judged relative to the other.  FCoE has higher throughput at lower host CPU utilization than iSCSI, and FCoE doesn’t have to process the TCP/IP stack as iSCSI does. iSCSI is relatively simple to set up and troubleshoot when compared to FCoE because zoning is not a factor and IP connectivity (although not optimized for storage traffic) is likely in place already.  Also, while FCoE has a comprehensive set of existing tools available to ease troubleshooting, there aren’t as many qualified people to use them in most enterprises.  Ease of use, plus the ability to use low cost NICs and switches, gives iSCSI a cost advantage.  (However, if you check out our SNIA-ESF webcast, “How VN2VN Will Help Accelerate Adoption of FCoE,” you’ll hear about new technologies that reduce the costs of deploying FCoE.) FC, and by extension FCoE, are perceived to be enterprise-grade, suitable for all workloads; and while iSCSI is being widely adopted at the enterprise level, it is still perceived by some not to be ready for Tier-1 applications.

    The graph below is excerpted from the report “Intel 10GbE Adapter Performance Evaluation” prepared by Demartek for Intel in September 2010.  This data is consistent with the rest of the report findings and is only intended to be representative of the results from comparative iSCSI and FCoE testing.  The report is interesting reading and I recommend looking at it for more information. This graph shows IOPS and CPU utilization for JetStress tests running against NetApp storage over multi-path iSCSI and FCoE.  Note that latencies were all similar and running the tests against EMC storage showed similar results.

    [Figure: FCoE-iSCSI_Data, showing IOPS and CPU utilization for JetStress tests over multi-path iSCSI and FCoE]

    Many other factors must be considered, but according to industry pundits, as well as my own personal experience, in the majority of cases either protocol is adequate for the task at hand, and that is to effectively transfer block data across an Ethernet network.

    Maximizing Throughput

    The reality is that most servers, applications, and storage arrays simply won’t take advantage of FCoE’s superior performance, or of any storage protocol running over 10GbE.  iSCSI and NAS protocols are very fast and are typically sufficient to meet most application requirements.  But this is not meant to be a SAN vs. NAS post.  Besides, years of history, thousands of happy end users, and billions in continued investment show that both work well enough to meet most business needs.  The commonly deployed storage systems and hosts are simply not configured with enough hardware to saturate multiple 10 gigabit network links.  While saturation is rare today, it is going to become more common to see systems capable of saturating 10GbE pipes in the near future, especially as flash memory, either in all-flash arrays or tiered storage systems, finds more application.  (Hear more on the impact of flash in our SNIA-ESF webcast, “Flash – Plan for the Disruption”.) At least as it relates to spinning-media disk systems, network bandwidth is increasing faster than storage system throughput.  So consider the storage system to be the bottleneck or limiting factor when evaluating storage network performance.  After all, in most data center environments, the ratio of servers and applications to storage systems is high, so it’s reasonable to expect the storage system to be the bottleneck.  The absolute throughput of FCoE and iSCSI, when pushing a storage system to its limits, is not sufficient as the sole basis for the decision between the two protocols, except in a few edge cases.  Bottom line: whether the storage system or the network is the bottleneck, the performance relationship between FCoE and iSCSI does not change.

    These edge cases tend to be extremely IO-intensive database workloads and big data applications, such as Hadoop.  Citing the graph above, FCoE is about 15-20% faster than iSCSI on identical hardware.  Granted, this is a single graph of a single test, but the data is consistent across tests performed by IBM using Emulex network interfaces.  If absolute throughput and efficiency (both network and CPU) are the only criteria when deciding between block protocols, FCoE looks like the choice.  Since these cases are rare (complexity, supportability, and even politics are almost always considered), the decision is not so obvious.  Again, although it is beyond the scope of this article, NAS protocols should also be considered when determining the proper protocol for an application.

    Is There a Clear Winner?

    While FCoE can claim technical superiority, iSCSI has the edge in cost and supportability.  The number and range of systems supporting iSCSI connectivity is greater, particularly at the entry level.  What’s more, the availability of people that can troubleshoot end-to-end connectivity for iSCSI is also much greater.  (The “ping” command diagnoses most iSCSI connectivity problems.)  Also, do a resume search on Monster or LinkedIn and the number of people that can configure VLANs dwarfs the number that can properly zone a Fibre Channel network.  Greater familiarity reduces the support and operating cost of iSCSI.

    IDC predicts that FCoE revenue will ramp very quickly through 2016. (If available to you, see the IDC Worldwide Enterprise Storage Systems 2012-2016 Forecast Update.)  As customers decide to transition existing Fibre Channel networks to an Ethernet infrastructure, deploying FCoE would be a comfortable choice due to existing IT expertise and functional expectations of the Fibre Channel protocol.

    Both iSCSI and FCoE are capable storage protocols, and choosing one over the other will likely depend on budget, IT skill set, and application requirements.

    Don’t forget to join us on Feb. 18th

    Again, I encourage you to attend our February 18th Webcast, “Use Cases for iSCSI and FCoE – Where Each Makes Sense.”  Analysts from Dell’Oro Group will share their latest market research on this topic and I’ll dive into use cases for both iSCSI and FCoE. It’s a live event, so please come with your toughest questions. I hope you’ll join us!