Cloud File Services: SMB/CIFS and NFS…in the Cloud – Q&A

At our recent live ESF Webcast, “Cloud File Services: SMB/CIFS and NFS…in the Cloud” we talked about evaporating your existing file server into the cloud. Over 300 people have viewed the Webcast. If you missed it, it’s now available on-demand. It was an interactive session with a lot of great questions from attendees. We did not have time to address them all – so here is a complete Q&A from the Webcast. If you think of additional questions, please feel free to comment on this blog.

Q. Can your Storage OS take advantage of born-in-the-cloud File Storage like Zadara Storage at AWS and Azure?

A. The concept presented is generic in nature.  Whichever storage OS the customer chooses to use in the cloud will have its own requirements for the underlying storage beneath it.  Most Storage OSs used for Cloud File Services will likely use block or object backends rather than a file backend.

Q. Regarding Cloud File Services for “Client file services”: traditional file services require the client and server to be connected and on the same network, and they are tied to identities available in that network. How can the SMB/NFS protocols be used to serve data from the cloud to clients coming from different networks (4G/corporate)? Isn’t REST the appropriate interface for that model?

A. The answer depends on the use case.  There are numerous examples of SMB over the WAN, for example, so it’s not far-fetched to imagine someone using Cloud File Services as an alternative to a “Sync & Share” solution for client file services.  REST (or similar) may be appropriate for some, while file-based protocols will work better for others.  Cloud File Services provides the capability to extend file services into environments they couldn’t reach before.

Q. Is Manila like VMware VSAN or VASA?

A. Please take a look at the Manila project on OpenStack’s website https://wiki.openstack.org/wiki/Manila

Q. How do you take care of data security while moving data from on-premises to cloud (Storage OS)?

A. The answer depends on the Storage OS you are using for your Cloud File Services platform.  If your Storage OS supports encryption, for example, in its storage-to-storage in-flight data transport, then data security in-flight would be taken care of.  There are many facets to security that need to be thought through, including security at rest, some of which may depend on the environment (private/on-premises, service provider, hyperscalar) the Storage OS is sitting in.

Q. How do you get the data out of the cloud?  I think that’s been a traditional concern with cloud storage.

A. That’s the beauty of Cloud File Services!  With data movement and migration provided at the storage-level by the same Storage OS across all locations, you can simply move the data between on-premises and off-premises and expect similar behavior on both ends.  If you choose to put data into a native environment specific to a hyperscalar or service provider, you run the risk of lock-in.

Q. 1. How does one address the issue of “chatty” applications over the cloud?  2. File services have “poor” performance for small files. How does one address that issue? Block and object storage do address that issue.  3. Why not expose SMB, NFS, and object interfaces on the compute node?

A. 1. We should take this opportunity to make the applications less chatty! :)  One possible solution here is to operate the application and Storage OS in the same environment, in much the same way you would have on-premises.  If you choose a hyperscalar or service provider, for chatty use cases, it may be best to keep the application and storage pieces “closer” together.

2. Newer file protocols are getting much better at this.  SMB 3.02 for instance, was optimized for 8K transactions.  With a modern Storage OS, you will be able to take advantage of new developments.

3. That is precisely the idea: the Storage OS operating in the “compute nodes,” serving out their interfaces, while taking advantage of different backend offerings for cost and scalability.

Q. Most storage arrays (NetApp, EMC, etc.) can provide five nines (99.999%) of resilience, while cloud VMs typically offer three nines.  How do you get to five nines with CFS?

A. Cloud File Services (CFS) as a platform can span across all of your environments, and as such, the availability guarantees will depend upon each environment in which CFS is operating.
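
For context, the gap between three and five nines is easy to quantify. Here is a minimal Python sketch of the standard availability arithmetic (the figures are generic, not any particular vendor’s SLA numbers):

    import datetime

    def downtime_per_year(availability):
        """Allowed downtime per year for a given availability fraction."""
        seconds = 365 * 24 * 3600 * (1 - availability)
        return datetime.timedelta(seconds=seconds)

    for label, availability in [("three nines", 0.999), ("five nines", 0.99999)]:
        print(f"{label} ({availability:.3%}): about {downtime_per_year(availability)} of downtime per year")

    # three nines -> roughly 8 hours 45 minutes per year
    # five nines  -> roughly 5 minutes 15 seconds per year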

Q. Why are we “adding” another layer? Why can’t we just use powerful “NAS” devices that can have different media like NVMe, Flash SSD or HDDs?

A. Traditional applications may not want to change, and this architecture should suit them well.  It’s worth examining the “cloud-ready” model, though.  Is the goal simply to be “cloud-ready,” or is the goal to support the scaling, failover, and on-demand capabilities that the cloud is able to provide?  Shared-nothing is a popular way of accomplishing some of this, but it may not be the only way.

The existing interfaces provided by hyperscalars do provide abstraction, but if you are building an application, you run a strong risk of lock-in on any particular abstraction.  What is your exit strategy then?  How do you move your data (and applications) out?

By leveraging a common Storage OS across your entire infrastructure (on-premises, service providers, and hyperscalars), you have a very simple exit strategy, and your exit and mobility strategy become very similar, if not the same, with the ability to scale or move across any environment you choose.

Q. How do you virtualize storage OS? What happens to native storage OS hardware/storage?

A. A Storage OS can be virtualized similar to a PC or traditional server OS.  Some pieces may have to be switched or removed, but it is still an operating system.

Q. Why is your Storage displayed as part of your Compute layer?

A. In the hyperscalar model, the Storage OS is sitting in the compute layer because it is, in effect, running as a virtual machine the same as any other.  It can then take advantage of different tiers of storage offered to it.

Q. My concern is that it would be slower as a VM than as a storage controller.  There’s really no guarantee of storage performance in the cloud; in fact, most hyperscalers won’t give me a good SLA without boatloads of money.  How might you respond to this?

A. Of course, with on-premises infrastructure, a company or service provider will have more of a guarantee in the sense that they control the hardware behind it.  However, as we’ve seen, SLAs continue to improve over time, and costs continue to come down for the Public Cloud.

Q. Does FreeNAS qualify as a Storage OS?

A. I recommend checking with their team.

Q. Isn’t this similar to Hybrid cloud?

A. Cloud File Services (CFS) is one way of looking at Hybrid Cloud.  Savvy readers and listeners will pick up that having the same Storage OS everywhere doesn’t necessarily limit you to only File Services. iSCSI or RESTful interfaces could work exactly the same.

Q. What do you mean by Storage OS? Can you give some examples?

A. As I work for NetApp, one example is Data ONTAP.  EMC has several as well, such as one for the VNX platform.  Most major storage vendors will have their own OS.

Q. I think one of the key questions is data access latency over the WAN: how can I move my data to the cloud, and how can I move it back when needed – for example, when the service is terminated?

A. Latency is a common concern, and connectivity is always important.  Moving your data into and out of the cloud is the beauty of the Cloud File Services platform, as I mentioned in other answers.  If one of your environments goes down (for example, your on-premises datacenter) then you would feasibly be able to shift your workloads over to one of your other environments, similar to a DR situation.  That is one example of where storage replication and application awareness across sites is important.

Q. Running applications like Oracle and Exchange over SMB/NFS (NAS), don’t you think it will be slow compared to FC (block storage)?

A. Oracle has had great success running over NFS, and it is extremely popular.  While Exchange doesn’t currently support running directly over SMB, it’s not ludicrous to think that it may at some point in the future, in the same way that SQL Server has.

Q. What about REST and S3 API or are they just for object storage?  What about CINDER?

A. The focus of this presentation was only File Services, but as I mentioned in another answer, if your Storage OS supports these services (like REST or S3), it’s feasible to imagine that you could span them in the same way that we discussed CFS.

Q. Why are SAN-based applications moving to NAS?

A. This was discussed in one of the early slides in the presentation (slide 10, I believe).  Data mobility and granular management are the key reasons: it’s easier to move, delete, and otherwise manage files than LUNs, an admin can operate at a more granular level, and it’s easier to operate and maintain – no HBAs, etc.  File protocols are generally considered “easier” to use.

New Webcast: Cloud File Services: SMB/CIFS and NFS…in the Cloud

Imagine evaporating your existing file server into the cloud with the same experience and level of control that you currently have on-premises. On October 1st, ESF will host a live Webcast that introduces the concept of Cloud File Services and discusses the pros and cons you need to consider.

There are numerous companies and startups offering “sync & share” solutions across multiple devices and locations, but for the enterprise, caveats are everywhere. Register now for this Webcast to learn:

  • Key drivers for file storage
  • Split administration with sync & share solutions and on-premises file services
  • Applications over File Services on-premises (SMB 3, NFS 4.1)
  • Moving to the cloud: your storage OS in a hyperscalar or service provider
  • Accommodating existing File Services workloads with Cloud File Services
  • Accommodating cloud-hosted applications over Cloud File Services

This Webcast will be a vendor-neutral and informative discussion on this hot topic. And since it’s live, we encourage you to bring your questions for our experts. I hope you’ll register today, and we look forward to having you attend on October 1st.

Ethernet Meets Enterprise Storage – Finally

Presumptuous, yes, because Ethernet has been a mainstay in enterprises since its early days over 40 years ago.  It initially grew to prominence as the local area network (LAN) connection in the enterprise. More recent advances have enabled Ethernet to become a standard for mission critical storage connectivity for block, file and object storage in many enterprises.

Block storage in large enterprises has long been focused on Fibre Channel due to its performance capabilities.  In order to bring the same performance benefits to Ethernet, the IEEE 802.1 Data Center Bridging Task Group proposed a number of new standards to enhance Ethernet reliability.  For example, 802.1Qbb Priority-based Flow Control (PFC) provides a link-level flow control mechanism to ensure lossless transmission under congestion, 802.1Qaz Enhanced Transmission Selection (ETS) provides a management framework for prioritized bandwidth, and the Data Center Bridging Exchange Protocol (DCBX) enables these features to be negotiated between neighbors to ensure consistency on the network. Collectively, these and other enhancements have brought enterprise-class storage networking features to the Ethernet platform.

In addition, the International Committee for Information Technology Services (INCITS) T11 Fibre Channel committee developed a specification for Fibre Channel over Ethernet (FCoE) in its FC-BB-5 standard in 2009, which allows the Fibre Channel protocol to run directly on top of Ethernet, eliminating the TCP/IP stack and allowing for efficient performance of the Fibre Channel protocol.  FCoE also depends on the Data Center Bridging standards from IEEE 802.1 in order to ensure the “losslessness” and flow control needed by Fibre Channel.

An alternative to FCoE, iSCSI, was designed to run over standard Ethernet with TCP/IP and was designed to tolerate the “lossy” aspects of Ethernet.  Its architecture and the additional layers of encapsulation involved can impact latency and performance. However, more recent innovations in iSCSI have enabled it to run over a DCB Ethernet network, which enables iSCSI to inherit some of the enterprise storage features that have always been inherent in Fibre Channel.  For more on this, read last year’s blog “How DCB Makes iSCSI Better” from Allen Ordoubadian.

In 2013, INCITS submitted the FC-BB-6 standard for review which introduced, among other things, the VN2VN standard.  The VN2VN proposal will allow FCoE to work in a standard DCB switching environment without the presence of a Fibre Channel Forwarder (FCF).  An FCF allows for bridging between servers which are communicating with FCoE and storage devices which are communicating with traditional Fibre Channel.  As DCB switches and FCoE storage become more prevalent, the FC-BB-6 standard will allow for end-to-end FCoE connectivity in either a point to point (P2P) or DCB mesh environment. This will result in lower cost for FCoE environments. Products are beginning to appear which support VN2VN and over the next 18 months it is likely that all major vendors will support it. Check out our ESF Webcast “How VN2VN Will Help Accelerate Adoption of FCoE” for more details.

The availability of CNAs with processing capability allows for offloading storage protocol processing from the host processor, though some CNAs use host-based storage protocol initiators in system software and do selective stateless offloads in the data path.  Both FCoE and iSCSI require the storage protocol to be encapsulated in a frame to be sent across the Ethernet network.  In an enterprise environment, especially a virtual server environment, CPU utilization is tracked closely and target CPU thresholds are often set.  Anything which can minimize spikes in CPU utilization can allow for more workloads to be placed on servers and allows for predictable energy consumption.

For file storage, Ethernet has traditionally been the connectivity option of choice for file servers used as “shares” for centralized employee document storage. In the 21st century, usage of network attached storage (NAS) with the Network File System (NFS) has increased for enterprise databases and Hadoop clusters, especially with the availability of 10Gb Ethernet.  New features in NFS 4 and later introduced security and stateful protocol support after development of NFS was taken over by the Internet Engineering Task Force (IETF).

Object storage has been around for nearly 20 years as a repository for storing data as objects, which include not only the original file, but also a globally unique identifier and metadata that describes the object and various parameters about it.  It has been used to store many forms of unstructured data, but it found niches in certain areas, such as legal documents with retention policies and archiving photos and videos.  More recently, there seems to be a resurgence in object storage as the amount of unstructured data generated by enterprises continues to skyrocket.  Open source object storage in Ceph and OpenStack is also helping to drive adoption. SNIA ESF is hosting a live Webcast on object storage on June 11, 2014, called “Object Storage 101.” I encourage you to register for this presentation for an unbiased look at the what, how and why of object storage technologies.
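
As a rough illustration of what an “object” carries beyond the raw file, here is a minimal Python sketch; the field names are illustrative and not any particular object store’s schema:

    import uuid
    from dataclasses import dataclass, field

    @dataclass
    class StoredObject:
        data: bytes                                   # the original file content
        object_id: str = field(default_factory=lambda: str(uuid.uuid4()))  # globally unique identifier
        metadata: dict = field(default_factory=dict)  # descriptive attributes, e.g. a retention policy

    photo = StoredObject(data=b"...jpeg bytes...",
                         metadata={"content-type": "image/jpeg", "retention-days": 2555})
    print(photo.object_id, photo.metadata)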

When combined with the advances in link speed, throughput, latency and input/output operations per second (IOPS) in modern 10Gb/s and 40Gb/s Ethernet, these existing and emerging Ethernet standards and storage architectures are having a profound effect on the viability of Ethernet as an enterprise-class storage networking platform.  Vendors and customers are seeing the advantage of one wire, the Ethernet cable, carrying all LAN, WAN and storage traffic.

SNIA’s Events Strategy Today and Tomorrow

David Dale, SNIA Chairman

Last month Computerworld/IDG and the SNIA posted a notice to the SNW website stating that they have decided to conclude the production of SNW.  The contract was expiring and both parties declined to renew.  The IT industry has changed significantly in the 15 years since SNW was first launched, and both parties felt that their individual interests would be best served by pursuing separate events strategies.

For the SNIA, events are a strategically important vehicle for fulfilling its mission of developing standards, maintaining an active ecosystem of storage industry experts, and providing vendor-neutral educational materials to enable IT professionals to deal with and derive value from constant technology change.  To address the first two categories, SNIA has a strong track record of producing Technical Symposia throughout the year, and the successful Storage Developer Conference in September.

To address the third category, IT professionals, SNIA has announced a new event, to be held in Santa Clara, CA, from April 22-24 – the Data Storage Innovation Conference. This event is targeted at IT decision-makers, technology implementers, and those expected to influence, implement and support data storage innovation as actual production solutions.  See the press release and call for presentations for more information.  We are excited to embark on developing this contemporary new event into an industry mainstay in the coming years.

Outside of the USA, events are also critically important vehicles for the autonomous SNIA Regional Affiliates to fulfill their mission.  The audience there is typically more biased towards business executives and IT managers, and over the years their events have evolved to incorporate adjacent technology areas, new developments and regional requirements.

As an example of this evolution, SNIA Europe’s events partner, Angel Business Communications, recently announced that its very successful October event, SNW Europe/Datacenter Technologies/Virtualization World, will be simply known as Powering the Cloud starting in 2014, in order to unite the conference program and to be more clearly relevant to today’s IT industry. See the press release for more details.

Other Regional Affiliates have followed a similar path with events such as Implementing Information Infrastructure Summit and Information Infrastructure Conference – both tailored to meet regional needs.

The bottom line on this is that the SNIA is absolutely committed to a global events strategy to enable it to carry out its mission.  We are excited about the evolution of our various events to meet the changing needs of the market and continue to deliver unique vendor-neutral content. IT professionals, partners, vendors and their customers around the globe can continue to rely on SNIA events to inform them about new technologies and developments and help them navigate the rapidly changing world of IT.

SUSE Announces NFSv4.1 and pNFS Support

SUSE, founded in 1992, provides an enterprise-ready Linux distribution in the form of SLES, the SUSE Linux Enterprise Server. Late last month (October 22, 2013), SUSE announced that SLES 11 with Service Pack 3 now includes the Linux client for NFSv4.1 and pNFS. This major distribution joins RedHat’s RHEL (RedHat Enterprise Linux) 6.4 in providing an enterprise-quality Linux distribution with support for files-based NFSv4.1 and pNFS.

For the adventurous, block and object pNFS support is available in the upstream kernel. Most regularly maintained distributions based on a Linux 3.1 or better kernel (if not all distributions now – check with the supplier of the distribution if you’re unsure) should provide the files, block and object compliant client directly in the download.

The future of pNFS looks very exciting. We now have a fully pNFS-compliant Linux client, and a number of commercial files, block and object based servers. Remember that although pNFS block and object support is available upstream, these distributions currently support only the pNFS files layout. For users who don’t need pNFS with block or object support and who require enterprise-quality support, SUSE and RedHat are an excellent solution.

pNFS and Future NFSv4.2 Features

In this third and final blog post on NFS (see previous blog posts Why NFSv4.1 and pNFS are Better than NFSv3 Could Ever Be and The Advantages of NFSv4.1) I’ll cover pNFS (parallel NFS), an optional feature of NFSv4.1 that improves the bandwidth available for NFS protocol access, and some of the proposed features of NFSv4.2 – some of which are already implemented in commercially available servers, but will be standardized with the ratification of NFSv4.2 (for details, see the IETF NFSv4.2 draft documents).

Finally, I’ll point out where you can get NFSv4.1 clients with support for pNFS today.

Parallel NFS (pNFS) and Layouts

Parallel NFS (pNFS) represents a major step forward in the development of NFS. Ratified in January 2010 and described in RFC-5661, pNFS depends on the NFS client understanding how a clustered filesystem stripes and manages data. It’s not an attribute of the data, but an arrangement between the server and the client, so data can still be accessed via non-pNFS and other file access protocols.  pNFS benefits workloads with many small files, or very large files, especially those run on compute clusters requiring simultaneous, parallel access to data.

Clients request information about data layout from a Metadata Server (MDS), and are returned layouts that describe the location of the data. (Although often shown as separate, the MDS may or may not be a standalone node in the storage system, depending on a particular storage vendor’s hardware architecture.) The data may be on many data servers, and is accessed directly by the client over multiple paths. Layouts can be recalled by the server, as is the case for delegations, if there are multiple conflicting client requests.

By allowing the aggregation of bandwidth, pNFS relieves performance issues that are associated with point-to-point connections. With pNFS, clients access data servers directly and in parallel, ensuring that no single storage node is a bottleneck. pNFS also ensures that data can be better load balanced to meet the needs of the client.
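
As a conceptual sketch only (the real fan-out happens inside the kernel NFS client, not in application code), the parallel access pattern looks something like the following Python; fetch_stripe, stripes_for and the layout object are hypothetical stand-ins for what the MDS layout actually provides:

    from concurrent.futures import ThreadPoolExecutor

    def fetch_stripe(server, offset, length):
        """Hypothetical direct read of one stripe from one pNFS data server."""
        raise NotImplementedError("stand-in for a direct data-server read")

    def parallel_read(layout, offset, length):
        # The layout returned by the MDS maps byte ranges of the file to data servers.
        stripes = layout.stripes_for(offset, length)      # hypothetical helper
        with ThreadPoolExecutor(max_workers=len(stripes)) as pool:
            futures = [pool.submit(fetch_stripe, s.server, s.offset, s.length)
                       for s in stripes]
            return b"".join(f.result() for f in futures)  # reassemble in stripe order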

The pNFS specification also accommodates support for multiple layouts, defining the protocol used between clients and data servers. Currently, three layouts are specified: files as supported by NFSv4, objects based on the Object-based Storage Device Commands (OSD) standard (INCITS T10) approved in 2004, and block layouts (either FC or iSCSI access). The layout choice in any given architecture is expected to make a difference in performance and functionality. For example, pNFS object-based implementations may perform RAID parity calculations in software on the client, to allow RAID performance to scale with the number of clients and to ensure end-to-end data integrity across the network to the data servers.

So although pNFS is new to the NFS standard, the experience of users with proprietary precursor protocols to pNFS shows that high bandwidth access to data with pNFS will be of considerable benefit.

Potential performance of pNFS is definitely superior to that of NFSv3 for similar configurations of storage, network and server. The management is definitely easier, as NFSv3 automounter maps and hand-created load balancing schemes are eliminated; and by providing a standardized interface, pNFS ensures fewer issues in supporting multi-vendor NFS server environments.

Some Proposed NFSv4.2 features

NFSv4.2 promises many features that end-users have been requesting, and that make NFS more relevant not only as an “everyday” protocol, but as one that has applications beyond the data center. As the requirements document for NFSv4.2 puts it, there are requirements for:

  • High efficiency and utilization of resources such as capacity, network bandwidth, and processors.
  • Solid-state flash storage, which promises faster throughput and lower latency than magnetic disk drives and lower cost than dynamic random access memory.

Server Side Copy

Server-Side Copy (SSC) removes one leg of a copy operation. Instead of reading entire files or even directories of files from one server through the client, and then writing them out to another, SSC permits the destination server to communicate directly with the source server without client involvement, and removes the limitations on server-to-client bandwidth and the possible congestion it may cause.
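
On Linux, applications can already express this intent through the generic copy-offload path: Python’s os.copy_file_range (Python 3.8+, Linux) wraps the copy_file_range(2) system call, which a client and server that implement NFSv4.2 Server-Side Copy can satisfy without funneling the data through the client; where offload isn’t available, the kernel generally falls back to an ordinary in-kernel copy (older kernels may return EXDEV for cross-filesystem copies). A minimal sketch, with placeholder paths:

    import os

    def offloaded_copy(src_path, dst_path):
        """Copy src to dst, letting the kernel/filesystem offload the data movement when it can."""
        with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
            remaining = os.fstat(src.fileno()).st_size
            while remaining > 0:
                # copy_file_range may move fewer bytes than requested; loop until done.
                copied = os.copy_file_range(src.fileno(), dst.fileno(), remaining)
                if copied == 0:
                    break
                remaining -= copied

    offloaded_copy("/mnt/nfs/vm-image.qcow2", "/mnt/nfs/vm-image-clone.qcow2")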

Application Data Blocks (ADB)

ADB allows definition of the format of a file; for example, a VM image or a database. This feature will allow initialization of data stores; a single operation from the client can create a 300GB database or a VM image on the server.

Guaranteed Space Reservation & Hole Punching

As storage demands continue to increase, various efficiency techniques can be employed to give the appearance of a large virtual pool of storage on a much smaller storage system.  Thin provisioning (where space appears available and reserved, but is not committed) is commonplace, but often problematic to manage in fast-growing environments. The guaranteed space reservation feature in NFSv4.2 will ensure that, regardless of the thin provisioning policies, individual files will always have space available for their maximum extent.

While such guarantees are a reassurance for the end-user, they don’t help storage administrators in their desire to fully utilize all the available storage. In support of better storage efficiency, NFSv4.2 will introduce support for sparse files. With this feature, commonly called “hole punching,” deleted and unused parts of files are returned to the storage system’s free space pool.
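
From user space on Linux, both behaviors map onto fallocate(2): os.posix_fallocate asks for a guaranteed reservation, while punching a hole uses the FALLOC_FL_PUNCH_HOLE flag, which Python does not wrap directly, so the sketch below calls libc via ctypes. Whether either request is honored end-to-end depends on the local filesystem or, over NFSv4.2, on server support for the corresponding ALLOCATE/DEALLOCATE operations; the path is a placeholder.

    import ctypes, ctypes.util, os

    FALLOC_FL_KEEP_SIZE = 0x01
    FALLOC_FL_PUNCH_HOLE = 0x02

    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_longlong, ctypes.c_longlong]

    def reserve(fd, length):
        """Ask the filesystem to guarantee space for the first 'length' bytes of the file."""
        os.posix_fallocate(fd, 0, length)

    def punch_hole(fd, offset, length):
        """Return an unused byte range to the free-space pool (Linux-only)."""
        if libc.fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, offset, length) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))

    fd = os.open("/mnt/nfs/sparse.dat", os.O_RDWR | os.O_CREAT, 0o644)
    reserve(fd, 10 * 1024**3)       # guarantee 10 GB up front
    punch_hole(fd, 0, 1024**3)      # later, hand the first 1 GB back to the pool
    os.close(fd)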

Obtaining Servers and Clients

With this background on the features of NFS, there is considerable interest in the end-user community for NFSv4.1 support from both servers and clients. Many Network Attached Storage (NAS) vendors now support NFSv4, and in the last 12 months, there has been a flurry of activity and many developments in server support of NFSv4.1 and pNFS.

For NFS server vendors, there are NFSv4.1 and files based, block based and object based implementations of pNFS available; refer to the vendor websites, where you will get the latest up-to-date information.

On the client side, there is RedHat Enterprise Linux 6.4 that includes full support for NFSv4.1 and pNFS (see www.redhat.com), Novell SUSE Linux Enterprise Server 11 SP2 with NFSv4.1 and pNFS based on the 3.0 Linux kernel (see www.suse.com), and Fedora available at fedoraproject.org.

Conclusion     

NFSv4.1 includes features intended to enable its use in global wide area networks (WANs).  These advantages include:

  • Firewall-friendly single port operations
  • Advanced and aggressive cache management features
  • Internationalization support
  • Replication and migration facilities
  • Optional cryptography quality security, with access control facilities that are compatible across UNIX® and Windows®
  • Support for parallelism and data striping

The goal for NFSv4.1 and beyond is to define how you get to storage, not what your storage looks like. That has meant inevitable changes. Unlike earlier versions of NFS, the NFSv4 protocol integrates file locking, strong security, operation coalescing, and delegation capabilities to enhance client performance for data sharing applications on high-bandwidth networks.

NFSv4.1 servers and clients provide even more functionality, such as wide striping of data to enhance performance.  NFSv4.2 and beyond promise further enhancements to the standard that increase its applicability to today’s application requirements. It is due to be ratified in August 2012, and we can expect to see server and client implementations that provide NFSv4.2 features soon after this; in some cases, the features are already being shipped now as vendor-specific enhancements.

With careful planning, migration to NFSv4.1 (and NFSv4.2 when it becomes generally available) from prior versions can be accomplished without modification to applications or the supporting operational infrastructure, for a wide range of applications: home directories, HPC storage servers, backup jobs and a variety of other applications.

FOOTNOTE: Parts of this blog were originally published in Usenix ;login: February 2012 under the title The Background to NFSv4.1. Used with permission.

The Advantages of NFSv4.1

In a previous blog post, Why NFSv4.1 and pNFS are Better than NFSv3 Could Ever Be, some of the issues with NFSv3 that made it difficult to implement as a WAN-based or data center-wide protocol were discussed. The question then becomes: why not move to NFSv4 instead of NFSv4.1? Isn’t that a bigger leap from NFSv3?

Well, practical experience and some issues with NFSv4 made NFSv4.1 a necessity; for one, it introduces the key concept of sessions, and provides a foundation for pNFS (parallel NFS), which we’ll discuss in a later blog post. And all the features of NFSv4 were carried over into NFSv4.1, since it was a minor version update; there’s little more to do to take advantage of NFSv4.1, so that’s where your evaluation and implementation focus should be.

TCP for Transport

NFSv3 supports both TCP (Transmission Control Protocol) and UDP (User Datagram Protocol), and UDP is sometimes employed (for those applications that support it) because it is perceived to be lightweight and faster in comparison to TCP.

The downside of UDP is that it’s connectionless (that is, stateless) and an unreliable protocol. There is no guarantee that the datagrams will be delivered in any given order to the destination host — or even delivered at all — so applications must be specifically designed to handle missing, duplicate or incorrectly ordered data. UDP is also not a good network citizen; there is no concept of congestion or flow control, and no ability to apply quality of service (QoS) criteria.

The NFSv4 specification requires that any transport used provides congestion control. The easiest way to do this is via TCP. By using TCP, NFSv4 clients and servers are able to adapt to known frequent spikes in unreliability on the Internet; and retransmission is managed in the transport layer instead of in the application layer, greatly simplifying applications and their management on a shared network.

NFSv4 also introduces strict rules about retries over TCP in contrast to the complete lack of rules in NFSv3 for retries over TCP. As a result, if NFSv3 clients have timeouts that are too short, NFSv3 servers may drop requests. NFSv4 uses the timers that are built into the connection-oriented transport.

Network Ports

To access an NFS server, an NFSv3 client must contact the server’s portmapper to find the port of the mountd server. It then contacts the mount server to get an initial file handle, and again contacts the portmapper to get the port of the NFS server. Finally, the client can access the NFS server.

This creates problems for using NFS through firewalls, because firewalls typically filter traffic based on well-known port numbers. If the client is inside a firewalled network, and the server is outside the network, the firewall needs to know what ports the portmapper, mountd and nfsd servers are listening on. The mount server can listen on any port, so telling the firewall what port to permit is not practical. While the NFS server usually listens on port 2049, sometimes it does not. While the portmapper always listens on the same port (111), many firewall administrators, out of excessive caution, block requests to port 111 from inside the firewalled network to servers outside the network. As a result, NFSv3 is not practical to use through firewalls. (Aside from which, without security, it’s risky too.)

NFSv4 uses a single port number by mandating the server will listen on port 2049. There are no “auxiliary” protocols like statd, lockd and mountd required as the mounting and locking protocols have been incorporated into the NFSv4 protocol. This means that NFSv4 clients do not need to contact the portmapper, and do not need to access services on floating ports.

As NFSv4 uses a single TCP connection with a well-defined destination TCP port, it traverses firewalls and network address translation (NAT) devices with ease, and makes firewall configuration as simple as configuration for HTTP servers.
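
That single well-known port makes a basic reachability check (and the matching firewall rule) trivial. A quick Python sketch, with a placeholder hostname:

    import socket

    def nfs4_reachable(host, port=2049, timeout=3.0):
        """True if a TCP connection to the NFSv4 port can be established."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    print(nfs4_reachable("nfs-server.example.com"))   # placeholder server name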

Mounts and Automounter

The automounter daemons and the utilities on different flavors of UNIX and Linux are capable of identifying different NFS versions. However, using the automounter will require at least port 111 to be permitted through any firewall between server and client, as it uses the portmapper.

This is undesirable if you are extending the use of NFSv4 beyond traditional NFSv3 environments, so the widely available “mirror mount” facility is preferable. It enhances the behavior of the NFSv4 client by creating a new mountpoint whenever it detects that a directory’s fsid differs from that of its parent, and automatically mounts filesystems when they are encountered at the NFSv4 server.

This enhancement does not require the use of the automounter and therefore does not rely on the content or propagation of automounter maps, the availability of NFSv3 services such as mountd, or opening firewall ports beyond the single port 2049 required for NFSv4.
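
The trigger for a mirror mount is simply the client noticing that a directory belongs to a different filesystem than its parent. From user space, the analogous boundary shows up as a change in st_dev, as in this small Python sketch (the path is a placeholder):

    import os

    def crosses_filesystem(path):
        """True if 'path' is on a different filesystem (different st_dev) than its parent directory."""
        path = os.path.abspath(path)
        parent = os.path.dirname(path)
        return os.stat(path).st_dev != os.stat(parent).st_dev

    # On an NFSv4 client with mirror mounts, a nested export shows up as its own filesystem:
    print(crosses_filesystem("/mnt/nfs/projects"))    # placeholder path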

Internationalization Support; UTF-8

Yes, those funny characters outside of US-ASCII are supported. In a welcome recognition that the US-ASCII character set no longer provides the descriptive capabilities demanded by languages with larger alphabets, or those that use an extensive range of non-Roman glyphs, NFSv4 uses UTF-8 for file names, directories, symlinks and user and group identifiers. As UTF-8 is backwards compatible with 7-bit encoded ASCII, any names that are 7-bit ASCII will continue to work.

Compound RPCs

Latency in a wide area network (WAN) is a perennial issue, and is very often measured in tenths of a second to seconds. NFS uses Remote Procedure Calls (RPCs) to undertake all its communication with the server, and although the payload is normally small, meta-data operations are largely synchronous and serialized. Operations such as file lookup (LOOKUP), the fetching of attributes (GETATTR) and so on, make up the largest percentage by count of the average traffic load on NFS.

This mix of a typical set of NFS RPC calls in versions prior to NFSv4 requires that each RPC call is a separate transaction over the wire. NFSv4 avoids the expense of many individual RPC requests, and the attendant latency issues, by allowing these calls to be bundled together. For instance, a lookup, open, read and close can be sent once over the wire, and the server can execute the entire compound call as a single entity. The effect is to reduce latency considerably for multiple operations.
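
A rough back-of-the-envelope sketch of why compounding matters over a WAN, treating each operation as one round trip and assuming an arbitrary 50 ms RTT:

    rtt_seconds = 0.050                         # assumed WAN round-trip time
    ops = ["LOOKUP", "OPEN", "READ", "CLOSE"]   # operations bundled into one COMPOUND

    one_rpc_each = len(ops) * rtt_seconds       # one round trip per operation
    one_compound = 1 * rtt_seconds              # all four bundled into a single COMPOUND

    print(f"separate RPCs: {one_rpc_each * 1000:.0f} ms, compound: {one_compound * 1000:.0f} ms")
    # separate RPCs: 200 ms, compound: 50 ms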

Delegations

Servers are employing ever greater quantities of RAM and flash technologies, and very large caches on the order of terabytes are not uncommon. Applications running over NFSv3 can’t take advantage of these caches unless they have specific application support. With increasing WAN latencies, doing every IO over the wire introduces significant delay.

NFSv4 allows the server to delegate certain responsibilities to the client, a feature that allows caching locally where the data is being accessed. Once delegated, the client can act on the file locally with the guarantee that no other client has a conflicting need for the file; it allows the application to have locking, reading and writing requests serviced on the application server without any further communication with the NFS server. To prevent deadlocking conditions, the server can recall the delegation via an asynchronous callback to the client should there be a conflicting request for access to the file from a different client.
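
A deliberately simplified toy model of the grant-and-recall behavior described above (real delegations live in the NFS client and server, not in application code):

    class DelegationDemo:
        """Toy model: a server grants a delegation and recalls it when a conflict appears."""

        def __init__(self):
            self.holder = None                        # client currently holding the delegation

        def open_file(self, client):
            if self.holder is None:
                self.holder = client                  # grant: client may cache, lock and serve IO locally
                return f"delegation granted to {client}"
            if self.holder != client:
                self.recall(self.holder)              # conflicting access triggers an async callback
                self.holder = None
                return f"{client} granted access without a delegation"
            return f"{client} already holds the delegation"

        def recall(self, client):
            print(f"CB_RECALL to {client}: flush cached state back to the server")

    server = DelegationDemo()
    print(server.open_file("clientA"))   # clientA now works on the file locally
    print(server.open_file("clientB"))   # forces a recall of clientA's delegation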

Migration, Replicas and Referrals

For broader use within a datacenter, and in support of high availability applications such as databases and virtual environments, copying data for backup and disaster recovery purposes, or the ability to migrate it to provide VM location independence are essential. NFSv4 provides facilities for both transparent replication and migration of data, and the client is responsible for ensuring that the application is unaware of these activities. An NFSv4 referral allows servers to redirect clients from this server’s namespace to another server; it allows the building of a global namespace while maintaining the data on discrete and separate servers.

Sessions

Perhaps one of the most significant features of NFSv4.1 is the introduction of stateful sessions. Sessions bring the advantages of correctness and simplicity to NFS semantics. In order to improve on the correctness of NFSv4, NFSv4.1 sessions introduce “exactly-once” semantics.
Servers maintain one or more session states in agreement with the client; they maintain the server’s state relative to the connections belonging to a client. Clients can be assured that their requests to the server have been executed, and that they will never be executed more than once.

Sessions extend the idea of NFSv4 delegations, which introduced server-initiated asynchronous callbacks; clients can initiate session requests for connections to the server. For WAN based systems, this simplifies operations through firewalls.
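
Exactly-once behavior comes from per-slot sequence numbers paired with a reply cache on the server. The following toy Python sketch illustrates the idea only, not the actual wire protocol:

    class SessionSlot:
        """Toy model of one NFSv4.1 session slot with a reply cache."""

        def __init__(self):
            self.last_seq = 0
            self.cached_reply = None

        def execute(self, seq, operation):
            if seq == self.last_seq:                  # retransmission of the previous request
                return self.cached_reply              # replay the cached reply; never re-execute
            if seq != self.last_seq + 1:
                raise ValueError("misordered sequence id")   # protocol error
            self.cached_reply = operation()           # execute exactly once
            self.last_seq = seq
            return self.cached_reply

    slot = SessionSlot()
    print(slot.execute(1, lambda: "WRITE applied"))   # first attempt executes the operation
    print(slot.execute(1, lambda: "WRITE applied"))   # retry returns the cached reply instead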

Security

This is an area of great confusion: many believe that NFSv4 requires the use of strong security. The NFSv4 specification simply states that implementation of strong RPC security by servers and clients is mandatory, not the use of strong RPC security. This misunderstanding may explain the reluctance of users to migrate to NFSv4, given the additional work in implementing or modifying their existing Kerberos security.

Security is increasingly important as NFSv4 makes data more easily available over the WAN. This feature was considered so important by the IETF NFS working group that the security specification using Kerberos v5 was “retrofitted” to the NFSv2 and NFSv3 specifications.

Although access to an NFS filesystem without strong security such as provided by Kerberos is possible, across a WAN it should really be considered only as a temporary measure. In that spirit, it should be noted that NFSv4 can be used without implementing Kerberos security. The fact that it is possible does not make it desirable! A fuller description of the issues and some migration considerations can be found in the SNIA White Paper “Migrating from NFSv3 to NFSv4”.

Many of the practical issues faced in implementing robust Kerberos security in a UNIX environment can be eased by using a Windows Active Directory (AD) system. Windows uses the standard Kerberos protocol as specified in RFC 1510; AD user accounts are represented to Kerberos in the same way as accounts in UNIX realms. This can be a very attractive solution in mixed-mode environments.

In the next post, we’ll discuss one of the primary features of NFSv4.1: pNFS, or parallelized NFS, and some of the new work being done in support of NFSv4.2.
FOOTNOTE: Parts of this blog were originally published in Usenix ;login: February 2012 under the title The Background to NFSv4.1. Used with permission.

Ethernet Storage Forum – 2012 Year in Review and What to Expect in 2013

As we come to a close of the year 2012, I want to share some of our successes and briefly highlight some new changes for 2013. Calendar year 2012 has been eventful and the SNIA-ESF has been busy. Here are some of our accomplishments:

  • 10GbE – With virtualization and network convergence, as well as the general availability of LOM and 10GBASE-T cabling, this was a “breakout year” for 10GbE. In July, we published a comprehensive white paper titled “10GbE Comes of Age.” We then followed up with a Webcast, “10GbE – Key Trends, Predictions and Drivers.” We ran it live once in the U.S. and once in the U.K., and combined, the Webcast has been viewed by over 400 people!
  • NFS – has also been a hot topic. In June we published a white paper “An Overview of NFSv4” highlighting the many improved features NFSv4 has over NFSv3. A Webcast to help users upgrade, “NFSv4 – Plan for a Smooth Migration,” has also been well received with over 150 viewers to date.  A 4-part Webcast series on NFS is now planned. We kicked the series off last month with “Reasons to Start Working with NFSv4 Now” and will continue on this topic during the early part of 2013. Our next NFS Webcast will be “Advances in NFS – NFSv4.1 and pNFS.” You can register for that here.
  • Flash – The availability of solid state devices based on NAND flash is changing the performance efficiencies of storage. Our September Webcast “Flash – Plan for the Disruption” discusses how Flash is driving the need for 10GbE and has already been viewed by more than 150 people.

We have also expanded our membership, welcoming Tonian and LSI to the ESF. We expect with this new charter to see an increase in membership participation as we drive incremental value and establish ourselves as a leadership voice for Ethernet Storage.

As we move into 2013, we expect two hot trends to continue – the broader use of file protocols in datacenter applications, and the continued push toward datacenter consolidation with the use of Ethernet as a storage network. In order to better address these two trends, we have modified our charter for 2013. Our NFS SIG will be renamed the File Protocol SIG and will focus on promoting not only NFS, but also SMB / CIFS solutions and protocols. The iSCSI SIG will be renamed to the Storage over Ethernet SIG and will focus on promoting data center convergence topics with Ethernet networks, including the use of block and file protocols, such as NFS, SMB, FCoE, and iSCSI, over the same wire. This modified charter will allow us to have a richer conversation around storage trends relevant to your IT environment.

So, here is to a successful 2012, and excitement for the coming year.

Why NFSv4.1 and pNFS are Better than NFSv3 Could Ever Be

NFSv4 has been a standard file sharing protocol since 2003, but has not been widely adopted, partly because NFSv3 was “just good enough.” Yet NFSv4 improves on NFSv3 in many important ways, and NFSv4.1 is a further improvement on that. In this post, I explain how NFSv4.1 is better suited to a wide range of datacenter and HPC uses than its predecessors NFSv3 and NFSv4, and provide resources for migrating from NFSv3 to NFSv4.1. And, most importantly, I make the argument that users should, at the very least, be evaluating and deploying NFSv4.1 for use in new projects; and ideally, should be using it wholesale in their existing environments.

The background to NFSv4.1
NFSv2 and its popular successor NFSv3 (specified in RFC-1813, but never an Internet standard) were developed by Sun, with NFSv3 first released in 1995. NFSv3 has proved a popular and robust protocol over the 15 years it has been in use, and with wide adoption it soon eclipsed some of the early competitive UNIX-based filesystem protocols such as DFS and AFS. NFSv3 was extensively adopted by storage vendors and OS implementers beyond Sun’s Solaris; it was available on an extensive list of systems, including IBM’s AIX, HP’s HP-UX, Linux and FreeBSD. Even non-UNIX systems adopted NFSv3: Mac OS, OpenVMS, Microsoft Windows, Novell NetWare, and IBM’s AS/400 systems. In recognition of the advantages of interoperability and standardization, Sun relinquished control of future NFS standards work; the work leading to NFSv4 was by agreement between Sun and the Internet Society (ISOC), undertaken under the auspices of the Internet Engineering Task Force (IETF).

In April 2003, the Network File System (NFS) version 4 Protocol was ratified as an Internet standard, described in RFC-3530, which superseded NFSv3. This was the first open filesystem and networking protocol from the IETF. NFSv4 introduces the concept of state to ameliorate some of the less desirable features of NFSv3, along with other enhancements to improve usability, management and performance.

But shortly following its release, an Internet draft written by Garth Gibson and Peter Corbett outlined several problems with NFSv4; specifically, that of limited bandwidth and scalability, since NFSv4, like NFSv3, requires that access be to a single server. NFSv4.1 (as described in RFC-5661, ratified in January 2010) was developed to overcome these limitations, and new features such as parallel NFS (pNFS) were standardized to address these issues.

NFSv4.2 is now moving towards ratification. In a change to the original IETF NFSv4 development work, where each revision took a significant amount of time to develop and ratify, the workgroup charter was modified to ensure that there would be no large standards documents that took years to develop, such as RFC-5661, and that additions to the standard would be an ongoing yearly process. With these changes to the processes leading to standardization, features that will be ratified in NFSv4.2 (expected in early 2013) are available from many vendors and suppliers now.

Adoption of NFSv4.1
Every so often, I and others in the industry run Birds-of-a-Feather (BoFs) on the availability of NFSv4.1 clients and servers, and on the adoption of NFSv4.1 and pNFS. At our latest BoF at LISA ’12 in San Diego in December 2012, many of the attendees agreed; it’s time to move to NFSv4.1.

While there have been many advances and improvements to NFS, many users have elected to continue with NFSv3. NFSv4.1 is a mature and stable protocol with many advantages in its own right over its predecessors NFSv3 and NFSv2, yet adoption remains slow. Adequate for some purposes, NFSv3 is a familiar and well understood protocol; but with the demands being placed on storage by exponentially increasing data and compute growth, NFSv3 has become increasingly difficult to deploy and manage.

In essence, NFSv3 suffers from problems associated with statelessness. While some protocols such as HTTP and other RESTful APIs see benefit from not associating state with transactions – it considerably simplifies application development if no transaction from client to server depends on another transaction – in the NFS case, statelessness has led, amongst other downsides, to performance and lock management issues.

NFSv4.1 and parallel NFS (pNFS) address well-known NFSv3 “workarounds” that are used to obtain high bandwidth access; users that employ (usually very complicated) NFSv3 automounter maps and modify them to manage load balancing should find pNFS provides comparable performance that is significantly easier to manage.

So what’s the problem with NFSv3?
Extending the use of NFS across the WAN is difficult with NFSv3. Firewalls typically filter traffic based on well-known port numbers, but if the NFSv3 client is inside a firewalled network, and the server is outside the network, the firewall needs to know what ports the portmapper, mountd and nfsd servers are listening on. As a result of this promiscuous use of ports, the multiplicity of “moving parts” and a justifiable wariness on the part of network administrators to punch random holes through firewalls, NFSv3 is not practical to use in a WAN environment. By contrast, NFSv4 integrates many of these functions, and mandates that all traffic (now exclusively TCP) uses the single well-known port 2049.


Plus, NFSv3 is very chatty for WAN usage; many messages are sent between the client and the server to undertake simple activities, such as finding, opening, reading and closing a file. NFSv4 can compound these operations into a single RPC (Remote Procedure Call) and considerably reduce the back-and-forth traffic across the network. The end result is reduced latency.

One of the most annoying NFSv3 “features” has been its handling of locks. Although NFSv3 is stateless, the essential addition of lock management (NLM) to prevent file corruption by competing clients means NFSv3 application recovery is slowed considerably. Very often stale locks have to be manually released, and the lock management is handled external to the protocol. NFSv4’s built-in lock leasing, lock timeouts, and client-server negotiation on recovery simplifies management considerably.

In a change from NFSv3, these locking and delegation features make NFSv4 stateful, but the simplicity of the original design is retained through well-defined recovery semantics in the face of client and server failures and network partitions. These are just some of the benefits that make NFSv4.1 desirable as a modern datacenter protocol, and for use in HPC, database and highly virtualized applications.

NFSv3 is extremely difficult to parallelize, and often takes some vendor-specific “pixie dust” to accomplish. In contrast, pNFS with NFSv4.1 brings parallelization directly into the protocol; it allows many streams of data to flow to multiple servers simultaneously, and it supports files as usual, along with block and object support through an extensible layout mechanism. The management is definitely easier, as NFSv3 automounter maps and hand-created load-balancing schemes are eliminated and, by providing a standardized interface, pNFS ensures fewer issues in supporting multi-vendor NFS server environments.

Next post: The Advantages of NFSv4.1

FOOTNOTE: Parts of this blog were originally published in Usenix ;login: February 2012 under the title The Background to NFSv4.1. Used with permission.