
    SMB3 – These Questions Rock!

    April 24th, 2017
    Earlier this month, the SNIA Ethernet Storage Forum hosted a live webcast on Server Message Block (SMB), “Rockin’ and Rollin’ with SMB3.” Presenting was Ned Pyle, Microsoft SMB Program Manager. If you missed the live event, I encourage you to watch it on-demand. We had a lot of questions from the big audience this event drew, so as promised, here are answers to them all. Q. Other than that audit setup, is there a way to determine, via the OS, which SMB version is in use?  Continue Reading...

    Buffers, Queues and Caches Explained

    April 19th, 2017
    Finely tuning buffers, queues and caches can make your storage system hum. And that’s exactly what we discussed in our recent SNIA Ethernet Storage Forum webcast, “Everything You Wanted to Know About Storage But Were Too Proud To Ask – Part Teal: The Buffering Pod.” If you missed it, it’s now available on-demand. In this blog, you’ll find detailed answers from our panel of experts to all the great questions we received during the live event. I also encourage you to check out the other on-demand webcasts in this “Too Proud To Ask” series here and stay informed on upcoming events in this series by following us on Twitter @SNIAESF.  Continue Reading...

    Storage Expert Takes on Hyperconverged Questions

    April 17th, 2017
    Last month, we were fortunate enough to have Greg Schulz, analyst and founder of Server StorageIO, as a guest speaker at our SNIA Ethernet Storage Forum webcast, “What Does Hyperconverged Mean to Storage.” If you missed it, it’s now available on-demand. Greg fielded many great questions during the live event, but we didn’t have time to get to them all. So here they are:  Continue Reading...

    Q&A on All Things iSCSI

    April 7th, 2017
    In the recent SNIA Ethernet Storage Forum iSCSI pod webcast, from our “Everything You Wanted To Know About Storage But Were Too Proud To Ask” series, we discussed all things iSCSI. If you missed the live event, it’s now available on-demand. As promised, we’ve compiled all the webcast questions with answers from our panel of experts. If you have additional questions, please feel free to ask them in the comment field of this blog. I also encourage you to check out the other on-demand webcasts in this “Too Proud To Ask” series here and stay informed on upcoming events in this series by following us on Twitter @SNIAESF.  Continue Reading...

    Would You Like Some Rosé with Your iSCSI?

    February 3rd, 2017

    Would you like some rosé with your iSCSI? I’m guessing that no one has ever asked you that before. But we at the SNIA Ethernet Storage Forum like to get pretty colorful in our “Everything You Wanted To Know about Storage But Were Too Proud To Ask” webcast series as we group common storage terms together by color rather than by number.

    In our next live webcast, Part Rosé – The iSCSI Pod, we will focus entirely on iSCSI, one of the most used technologies in data centers today. With the increasing speeds for Ethernet, the technology is more and more appealing because of its relatively low cost to implement. However, like any other storage technology, there is more here than meets the eye.

    We’ve convened a great group of experts from Cisco, Mellanox and NetApp who will start by covering the basic elements to make your life easier if you are considering using iSCSI in your architecture, diving into:

    • iSCSI definition
    • iSCSI offload
    • Host-based iSCSI
    • TCP offload

    Like nearly everything else in storage, there is more here than just a protocol. I hope you’ll register today to join us on March 2nd and learn how to make the most of your iSCSI solution. And while we won’t be able to provide the rosé wine, our panel of experts will be on-hand to answer your questions.


    We’ve Been Thinking…What Does Hyperconverged Mean to Storage?

    February 1st, 2017

    Here at the SNIA Ethernet Storage Forum (ESF), we’ve been discussing how hyperconverged adoption will impact storage. Converged Infrastructure (CI), Hyperconverged Infrastructure (HCI), along with Cluster or Cloud In a Box (CIB) are popular trend topics that have gained both industry and customer adoption. As part of data infrastructures, CI, HCI, and CIB enable simplified deployment of resources (servers, storage, I/O networking, hypervisor, application software) across different environments.

    But what do these approaches mean for the storage environment? What are the key concerns and considerations related specifically to storage? How will the storage be connected to (or included in) the platform? Who will protect and back up the data? And most importantly, how do you know that you’re asking the right questions in order to get to the right answers?

    Find out on March 15th in a live SNIA-ESF webcast, “What Does Hyperconverged Mean to Storage.” We’ve invited expert Greg Schulz, founder and analyst of Server StorageIO, to answer the questions we’ve been debating. Join us, as Greg will move beyond the hype (pun intended) to discuss:

    • What are the storage considerations for CI, CIB and HCI
    • Why fast applications and fast servers need fast I/O
    • Networking and server-storage I/O considerations
    • How to avoid aggravation-causing aggregation (bottlenecks)
    • Aggregated vs. disaggregated vs. hybrid converged
    • Planning, comparing, benchmarking and decision-making
    • Data protection, management and east-west I/O traffic
    • Application and server north-south I/O traffic

    Register today and please bring your questions. We’ll be on-hand to answer them during this event. We hope to see you there!


    Buffers, Queues, and Caches, Oh My!

    January 18th, 2017

    Buffers and Queues are part of every data center architecture, and a critical part of performance – both in improving it as well as hindering it. A well-implemented buffer can mean the difference between a finely run system and a confusing nightmare of troubleshooting. Knowing how buffers and queues work in storage can help make your storage system shine.

    However, there is something of a mystique surrounding these different data center components, as many people don’t realize just how they’re used and why. Join our team of carefully-selected experts on February 14th in the next live webcast in our “Too Proud to Ask” series, “Everything You Wanted to Know About Storage But Were Too Proud To Ask – Part Teal: The Buffering Pod” where we’ll demystify this very important aspect of data center storage. You’ll learn:

    • What are buffers, caches, and queues, and why you should care about the differences?
    • What’s the difference between a read cache and a write cache?
    • What does “queue depth” mean?
    • What’s a buffer, a ring buffer, and host memory buffer, and why does it matter?
    • What happens when things go wrong?

    These are just some of the topics we’ll be covering, and while it won’t be an exhaustive look at buffers, caches and queues, you can be sure that you’ll get insight into this very important, and yet often overlooked, part of storage design.

    Register today and spend Valentine’s Day with our experts who will be on-hand to answer your questions on the spot!


    Clearing Up Confusion on Common Storage Networking Terms

    January 12th, 2017

    Do you ever feel a bit confused about common storage networking terms? You’re not alone. At our recent SNIA Ethernet Storage Forum webcast “Everything You Wanted To Know About Storage But Were Too Proud To Ask – Part Mauve,” we had experts from Cisco, Mellanox and NetApp explain the differences between:

    • Channel vs. Busses
    • Control Plane vs. Data Plane
    • Fabric vs. Network

    If you missed the live webcast, you can watch it on-demand. As promised, we’re also providing answers to the questions we got during the webcast. Between these questions and the presentation itself, we hope it will help you decode these common, but sometimes confusing terms.

    And remember, the “Everything You Wanted To Know About Storage But Were Too Proud To Ask” is a webcast series with a “colorfully-named pod” for each topic we tackle. You can register now for our next webcast: Part Teal, The Buffering Pod, on Feb. 14th.

    Q. Why do we have Fibre and Fiber?

    A. Fiber Optics is the term used for the optical technology used by Fibre Channel Fabrics.  While a common story is that the “Fibre” spelling came about to accommodate the French (FC is after all, an international standard), in actuality, it was a marketing idea to create a more unique name, and in fact, it was decided to use the British spelling – “Fibre”.

    Q. Will OpenStack change all the rules of the game?

    A. Yes. OpenStack is all about centralizing the control plane of many different aspects of infrastructure.

    Q. The difference between control and data plane matters only when we discuss software defined storage and software defined networking, not in traditional switching and storage.

    A. It matters regardless. You need to understand how much each individual control plane can handle and how many control planes you have from an overall management perspective. In the case where you have too many control planes, SDN and SDS can be a benefit to you.

    Q. As I’ve heard that networks use stateless protocols, would FC do the same?

    A. Fibre Channel has several different Classes, which can be either stateful or stateless. Most applications of Fibre Channel are Class 3, as it is the preferred class for SCSI traffic. A connection between Fibre Channel endpoints is always stateful (as it involves a login process to the Fibre Channel fabric). The transport protocol is augmented by Fibre Channel exchanges, which are managed on a per-hop basis. Retransmissions are handled by devices when exchanges are incomplete or lost, meaning that each exchange is a stateful transmission, but the protocol itself is considered stateless in modern SCSI-transport Fibre Channel.

    iSCSI, as a connection-oriented protocol, creates a nexus between an initiator and a target, and is considered stateful. In addition, SMB, NFSv4, FTP, and TCP are stateful protocols, while NFSv2, NFSv3, HTTP, and IP are stateless protocols.

    Q. Where do CIFS/SMB come into the picture?

    A. CIFS/SMB is part of a network stack. We need to have a separate talk about network stacks and their layers. In this presentation, we were talking primarily about the physical layer of the networks and fabrics. To oversimplify network stacks, there are multiple layers of protocols that run on top of the physical layer. In the case of FC, those protocols include the control plane protocols (such as FC-SW), and the data plane protocols. In FC, the most common data plane protocol is FCP (used by SCSI, FICON, and FC-NVMe). In the case of Ethernet, those protocols also include the control plane (such as TCP/IP), and data plane protocols. In Ethernet, there are many commonly used data plane protocols for storage (such as iSCSI, NFS, and CIFS/SMB).


    Questions on the 2017 Ethernet Roadmap for Networked Storage

    January 9th, 2017

    Last month, experts from Dell EMC, Intel, Mellanox and Microsoft convened to take a look ahead at what’s in store for Ethernet Networked Storage this year. It was a fascinating discussion of anticipated updates. If you missed the webcast, “2017 Ethernet Roadmap for Networked Storage,” it’s now available on-demand. We had a lot of great questions during the live event and we ran out of time to address them all, so here are answers from our speakers.

    Q. What’s the future of twisted pair cable? What is the new speed being developed with twisted pair cable?

    A. By twisted pair I assume you mean UTP CAT5, 6, 7, etc. The problem going forward with high-speed signaling is that the U in UTP stands for unshielded, and the signal radiates off the wire very quickly. At 25G and 50G this is a real problem and forces the line card end to have a big, power-consuming and costly chip to dig the signal out of the noise. Anything can be done, but at what cost? 25GBASE-T is being developed but the reach is somewhere around 30 meters. Cost, size, and power consumption are all going up and reach is going down, all opposite to the trends in modern high-speed data centers. BASE-T will always have a place for those applications that don’t need the faster rates.

    Q. What do you think of RCx standards and cables?

    A. So far, Amphenol, JAE and Volex are the suppliers who are members of the MSA. Very few companies have announced or discussed RCx. In addition to a smaller connector, not having an EEPROM eliminates steps in the cable assembly manufacture, which helps lower the cost compared to traditional DAC cabling. The biggest advantage of RCx is that it can help eliminate bulky breakout cables within a rack, since a single RCx4 receptacle can accept a number of combinations of single-lane, 2-lane or 4-lane cable with the same connector on the host. RCx ports can be connected to existing QSFP/SFP infrastructure with appropriate cabling. It remains to be seen, however, whether it becomes a standard and popular product or remains a custom solution.

    Q. How long does AOC normally reach, 3m or 30m?  

    A. AOCs pick up where DAC drops off, at about 3m. The most popular reaches are 3, 5, and 10m, and volume drops rapidly at 15, 20, 30, 50, and 100m. We are seeing Ethernet-connected HDDs at 2.5GbE x 2 ports, and Ceph touting this solution. This seems to play well into the 25/50/100GbE standards with the massive parallelism possible.

    Q. How do we scale PCIe lanes to support NVMe drives to scale, and to replace the capacity we see with storage arrays populated completely with HDDs?

    A. With the advent of PCIe Gen 4, the per-lane rate of PCIe is going from 8 GT/s to 16 GT/s. Scaling of PCIe is already happening.
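    As a rough worked example of what that doubling means, the sketch below converts per-lane transfer rates into approximate link bandwidth. It assumes the 128b/130b line encoding used by PCIe Gen 3 and Gen 4 and an x4 link per NVMe drive (a common configuration, but an assumption here). Protocol overhead is ignored, so real throughput will be lower.

```python
# Approximate PCIe link bandwidth from the per-lane transfer rate.
# Assumption: 128b/130b line encoding, as used by PCIe Gen 3 and Gen 4.
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gbytes(gt_per_s: float, lanes: int) -> float:
    """Return approximate one-direction link bandwidth in GB/s."""
    usable_gbits = gt_per_s * ENCODING_EFFICIENCY * lanes
    return usable_gbits / 8  # bits to bytes


# Assumption: an x4 link, a common width for NVMe drives.
gen3_x4 = link_bandwidth_gbytes(8, 4)
gen4_x4 = link_bandwidth_gbytes(16, 4)
print(f"Gen 3 x4: {gen3_x4:.2f} GB/s")
print(f"Gen 4 x4: {gen4_x4:.2f} GB/s")
```

    With the same number of lanes, the Gen 4 link carries twice the data, which is one way the same lane budget can serve faster or more drives.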

    Q. How many NVMe drives does it take to saturate 100GbE?

    A. 3 or 4 depending on individual drives.
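    The arithmetic behind that estimate can be sketched as follows. The per-drive throughput figures are illustrative assumptions, not measurements from the webcast, and network protocol overhead is ignored.

```python
import math

LINK_GBITS = 100  # 100GbE line rate in Gb/s


def drives_to_saturate(drive_gbytes_per_s: float, link_gbits: float = LINK_GBITS) -> int:
    """Smallest number of drives whose combined throughput meets the link rate."""
    drive_gbits = drive_gbytes_per_s * 8  # bytes to bits
    return math.ceil(link_gbits / drive_gbits)


# Hypothetical sequential read rates in GB/s for two different drive models.
for rate in (3.2, 4.5):
    print(f"{rate} GB/s per drive: {drives_to_saturate(rate)} drives")
```

    Under these assumed rates the answer lands at three or four drives, and it shifts as soon as the per-drive figure changes, which is why the answer depends on the individual drives.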

    Q. How about the reliability of Ethernet? A lot of people think Fibre Channel has better reliability than Ethernet.

    A. It’s true that Fibre Channel is a lossless protocol. Ethernet frames are sometimes dropped by the switch; however, network storage using TCP has a built-in error-correction facility. TCP was designed at a time when networks were less robust than today. Ethernet networks these days are far more reliable.

    Q. Do the 2.5GbE and 5GbE refer to the client side Ethernet port or the server Ethernet port?

    A. It can exist on both the client side and the server side Ethernet port.

    Q. Are there any 25GbE or 50GbE NICs available on the market?

    A. Yes, there are many that are on the market from a number of vendors, including Dell, Mellanox, Intel, and a number of others.

    Q. Commonly used Ethernet speeds are either 10GbE or 40GbE. Do the new 25GbE and 50GbE require new switches?

    A. Yes, you need new switches to support 25GbE and 50GbE. This is, in part, because the SerDes rate per lane at 25 and 50GbE is 25Gb/s, which is not supported by the 10 and 40GbE switches with a maximum SerDes rate of 10Gb/s.

    Q. With a certain number of SerDes coming off the switch ASIC, which would you prefer to use 100G or 40G if assuming both are at the same cost?

    A. Certainly 100G. You get 2.5X the bandwidth for the same cost under the assumptions made in the question.
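    The lane arithmetic behind the last two answers can be sketched as below. The per-lane rates for 10/40GbE and 25/50GbE come from the answers in this post; the four-lane layouts for 40GbE and 100GbE are the common configurations and are stated here as assumptions. The `supported_by` check is a deliberate simplification that looks only at SerDes lane rate.

```python
# (number of lanes, Gb/s per SerDes lane) for each Ethernet speed.
# Assumption: 40GbE and 100GbE use their common four-lane layouts.
LANE_CONFIG = {
    "10GbE": (1, 10),
    "40GbE": (4, 10),
    "25GbE": (1, 25),
    "50GbE": (2, 25),
    "100GbE": (4, 25),
}


def port_speed(name: str) -> int:
    """Aggregate port speed in Gb/s: lanes times per-lane rate."""
    lanes, rate = LANE_CONFIG[name]
    return lanes * rate


def supported_by(name: str, max_serdes_rate: int) -> bool:
    """Simplified check: the switch SerDes must reach the port's lane rate."""
    return LANE_CONFIG[name][1] <= max_serdes_rate


# Same four SerDes lanes, different lane rate.
ratio = port_speed("100GbE") / port_speed("40GbE")
print(ratio)
print(supported_by("25GbE", 10))
```

    Both 40GbE and 100GbE occupy four lanes in this model, so with a fixed number of SerDes on the ASIC the 100G option delivers 2.5 times the bandwidth per port.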

    Q. Are there any 100G/200G/400G switches and modulation available now?

    A. There are many 100G Ethernet switches available on the market today, including Dell’s Z9100 and S6100, Mellanox’s SN2700, and a number of others. The 200G and 400G IEEE standards are not complete as of yet. I’m sure all switch vendors will come out with switches supporting those rates in the future.

    Q. What does lambda mean?

    A. Lambda is the symbol for wavelength.

    Q. Is the 50GbE standard ratified now?

    A. IEEE 802.3 just recently started development of a 50GbE standard based upon a single-lane 50 Gb/s physical layer interface. That standard is probably about 2 years away from ratification. The 25G Ethernet Consortium has a ratified specification for 50GbE based upon a dual-lane 25 Gb/s physical layer interface.

    Q. Are there any parallel options for using 2 or 4 lanes like in 128GFCp?

    A. Many Ethernet specifications are based upon parallel options. 10GBASE-T is based upon 4 twisted-pairs of copper cabling. 100GBASE-SR4 is based upon 4 lanes (8 fibers) of multimode fiber. Even the industry MSA for 100G over CWDM4 is based upon four wavelengths on a duplex single-mode fiber. In some instances, the parallel option is based upon the additional medium (extra wires or fibers) but with fiber optics, parallel can be created by using different wavelengths that don’t interfere with each other.

     

     


    Common Questions on Clustered File Systems

    November 18th, 2016

    More than 350 people have already seen our SNIA Ethernet Storage Forum (ESF) webcast “Clustered File Systems: No Limits.” Our presenters, James Coomer and Jerry Lotto, did a great job explaining what clustered file systems are, key considerations, choices and performance. As we expected, there were plenty of questions, so as promised, here are answers to them all.

    Q: Parallel NFS (pNFS) has been in development and standardization for a long time, and I believe pNFS is not in the Linux kernel. It appears pNFS is not yet ready for prime time.

    A: pNFS has been in Linux for over a decade! Clients and servers are widely available, and you should look at the SNIA White Paper “An Updated Overview of NFSv4; NFSv4.0, NFSv4.1, pNFS, and NFSv4.2” for more information on the current state of play.

    Q: Why the emphasis on parallel I/O? Any single storage server can feed results at link capacity, so you do not need multiple storage servers to feed a client at full speed. Isn’t the more critical issue the bottleneck on access to metadata for a single directory or file? Federated NAS bottlenecks updates for each directory behind a single master server?

    A: Any one storage server can usually saturate one client, but often there are multiple hungry clients making requests simultaneously. Using parallel I/O allows multiple servers to feed multiple high-bandwidth clients across a narrow or wide set of data. This smooths out the I/O load on the servers in a near-perfect manner regardless of the number of clients performing I/O. It is absolutely true that metadata serving can become a bottleneck, so parallel file systems use cached and/or distributed metadata to overcome this. Again, every client takes part in that interaction and shares some responsibility for managing and communicating metadata updates.

    Q: Can any application access parallel file system (i.e. through an agent in the driver level)? Or does it require specific code within the application?

    A: Native access to a parallel file system requires a specific client or agent in the host, but many parallel file systems allow any client to access the data through a NAS protocol gateway. No changes are needed to applications to use a parallel file system. These parallel file systems are mounted as a POSIX-compliant file system and therefore adhere to basically the same standards as an NFS mount, for example.

    Q: Are parallel file system clients compatible with scale-out NAS servers?

    A: Nearly all scale-out NAS servers speak a standard NAS protocol like NFS or SMB. Clients running a parallel file system client can also access NAS via these standard protocols. Exceptions (though we know of none) might occur for scale-out NAS servers that support a modified NFS/SMB protocol or a custom NAS client, which might conceivably conflict with the parallel file system client when installed on an OS.

    Q: Of course I am biased, but I am fond of the AFS (Andrew File System) family of file systems. There is OpenAFS, but there is also what we are doing at AuriStor, extending beyond the core AFS global namespace model (security, functionality, and performance).

    A: AFS is another distributed file system which supports large-scale deployments, native clients for many platforms, and strong security features. It also uses local caching of files to improve performance. It uses a weakly consistent file locking system, so multiple clients can access the same file simultaneously but they cannot both update the same file at the same time. OpenAFS is an open-source implementation of AFS. AuriStor (formerly Your File System, Inc.) is a startup providing a commercial parallel file system that is compatible with AFS.

    Q: I am more familiar with Veritas Cluster File System, could you please do a quick compare with Lustre or GPFS?

    A: The Veritas Cluster File System (formerly VxCFS, now part of Veritas InfoScale) is a distributed file system that runs on Linux and popular flavors of Unix. It supports up to 64 nodes and allows multiple nodes to share the same back-end storage hardware. Comparing it to Lustre and GPFS is beyond the scope of this webinar, but in basic terms, parallel file systems can offer far greater scalability and bandwidth, for example through the use of optimized RDMA clients for high-performance networks.

    Q: Why do file apps need shared access to data, but block apps do not?

    A: Traditionally block storage did not offer shared access to data (except when used as shared back-end storage for a clustered file system), while apps that needed shared access to data usually chose to use a NAS protocol such as SMB or NFS. So in many cases file-based apps use file sharing protocols because they need shared access to data from multiple clients. (In other cases file-based applications do not require sharing but the storage administrators believe it’s easier to manage or less expensive than networked block storage.)

    Q: Do Lustre and GPFS have SMB Direct support?

    A: Not today. SMB Direct is an option to use RDMA and multi-channel with the SMB 3 protocol. Both Lustre and GPFS support the ability to export a file system via NFS or SMB, but generally they do not support SMB Direct yet. Both Lustre and GPFS support RDMA access through their clients.

    Q: How do the clients avoid doing simultaneous writes to the same file?

    A: Some parallel file systems allow this by letting different clients write to different parts of the same file. Others do not allow this. In either case, distributed file locking is used to prevent two clients from writing simultaneously to the same part of a file (or to the same file if it’s not allowed).
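    From the application side, writing to one part of a shared file under a lock can be sketched with POSIX advisory byte-range locks, as below. This is a minimal single-host illustration, not the locking protocol of any particular parallel file system: whether such locks are enforced cluster-wide depends on the file system and how it is mounted, the demo file is a local temporary file, and `write_region` is a hypothetical helper name.

```python
import fcntl
import os
import tempfile


def write_region(path: str, offset: int, data: bytes) -> None:
    """Write data at offset while holding an exclusive lock on only that byte range."""
    fd = os.open(path, os.O_RDWR)
    try:
        # Blocks until no other process holds a conflicting lock on this range.
        fcntl.lockf(fd, fcntl.LOCK_EX, len(data), offset, os.SEEK_SET)
        os.pwrite(fd, data, offset)
        fcntl.lockf(fd, fcntl.LOCK_UN, len(data), offset, os.SEEK_SET)
    finally:
        os.close(fd)


# Demo on a local temporary file; on a cluster the path would be on the shared mount.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 8)
    demo_path = tmp.name

write_region(demo_path, 0, b"AAAA")  # one writer owns bytes 0-3
write_region(demo_path, 4, b"BBBB")  # another writer owns bytes 4-7
with open(demo_path, "rb") as f:
    result = f.read()
os.remove(demo_path)
print(result)
```

    Because each writer locks only its own range, writers to different parts of the file do not block each other, while two writers targeting the same range are serialized.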

    Q: How can you say that the application “does not have to worry about” how the clustered file system serializes writes? Doesn’t this require continuous end-to-end connectivity?

    A: When the application writes data it generally writes to a POSIX-compliant file system and does not need to worry about how the parallel file system serializes, distributes, or protects the data because this is virtualized (managed) by the file system. It usually does require continuous end-to-end connectivity from the clients to the servers, though in some cases caching could allow for brief gaps in connectivity and in some systems not every client needs to have network connectivity to every server. There are multiple mechanisms within parallel file systems to manage the various cases of clients/servers disappearing from the network, temporarily or permanently (whilst for example holding a lock).

    Q: How does a parallel file system handle a sequence of writes to the same file? Does it just append them one by one? What if a client modifies a line?

    A: This is the biggest challenge for and reason to use a parallel file system.  Beneath the covers, coherency is maintained by Spectrum Scale using a token management server process which issues locks for object requests.  Similar functionality is implemented in Lustre using a distributed lock manager.  These objects are most commonly blocks within files rather than entire files, but this is application controlled.  The end result is a POSIX-compliant interface that scales to thousands of clients.
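    To make the idea of block-level locks concrete, here is a toy, in-memory lock manager that grants exclusive locks on block ranges. It is only a sketch of the concept: it is not how Spectrum Scale's token manager or Lustre's distributed lock manager is implemented, and the class and client names are hypothetical.

```python
class RangeLockManager:
    """Toy lock manager granting exclusive locks on block ranges of one file."""

    def __init__(self):
        # Maps (start, end) block ranges to the owning client; end is exclusive.
        self._held = {}

    def acquire(self, client: str, start: int, end: int) -> bool:
        """Grant the range unless it overlaps a range held by another client."""
        for (s, e), owner in self._held.items():
            overlaps = start < e and s < end
            if overlaps and owner != client:
                return False
        self._held[(start, end)] = client
        return True

    def release(self, client: str, start: int, end: int) -> None:
        """Release a range previously granted to this client."""
        if self._held.get((start, end)) == client:
            del self._held[(start, end)]


mgr = RangeLockManager()
print(mgr.acquire("client-a", 0, 4))  # granted: nothing held yet
print(mgr.acquire("client-b", 4, 8))  # granted: different blocks of the same file
print(mgr.acquire("client-b", 2, 6))  # refused: overlaps client-a's blocks
```

    Two clients can hold locks on disjoint blocks of the same file at once, and only overlapping requests are refused, which is what lets many clients update one large file concurrently.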

    Q: What does FPO stand for?

    A: File Placement Optimizer – a shared-nothing architecture and licensing model for IBM Spectrum Scale (aka GPFS). Learn more here.

    Q: Is there a concept in parallel file systems for “auto-tuning” yet? Seems like the early days of SAN management and tuning…

    A: Default tuning values are optimized for general-purpose workloads, but the whole purpose of tuning parameters is to adjust away from those defaults to optimize the file system for a particular application workload or file system architecture. Both IBM and OpenSFS, with the support of Intel, have published extensive documentation on best practices for optimization and tuning for either file system. We are not aware of any work on “automating” that process, but there has been recent work (e.g. in Spectrum Scale) to simplify the tuning process.

    Q: Which is better as interconnect between disk and servers, shared access or share-nothing?

    A: The use of shared access in the interconnect between disks and servers is limited to providing HA functionality in Lustre or Spectrum Scale, the ability to service I/O requests to a storage device if the server which has primary responsibility for that device is not available.  This usually involves multiple server-attached external storage which can add cost to building the file system.  The alternative approach to HA is to replicate blocks of data to different disks on different servers, cutting back on the usable capacity of the file system.  If HA is not a requirement, a share-nothing architecture will generally involve less hardware and therefore be less expensive to build.
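    The capacity cost of the replication approach is simple arithmetic, sketched below. The raw capacity figure is hypothetical, and the model assumes every block is stored the same number of times with no other overhead.

```python
def usable_capacity_tb(raw_tb: float, replicas: int) -> float:
    """Usable capacity when every block is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return raw_tb / replicas


raw = 480.0  # hypothetical raw capacity in TB across all servers
for copies in (1, 2, 3):
    print(f"{copies} copies: {usable_capacity_tb(raw, copies):.0f} TB usable")
```

    Keeping two copies halves the usable capacity and three copies cuts it to a third, which is the trade-off to weigh against the cost of shared external storage.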

    If you have more questions, please comment on this blog. And I encourage you to check out the SNIA ESF webcast library for educational, vendor-neutral content on Ethernet networked storage topics.