Ethernet in the Age of AI Q&A

AI is having a transformative impact on networking. It’s a topic that the SNIA Data, Storage & Networking Community covered in our live webinar, “Ethernet in the Age of AI: Adapting to New Networking Challenges.” The presentation explored various use cases of AI, the nature of traffic for different workloads, the network impact of these workloads, and how Ethernet is evolving to meet these demands. The webinar audience was highly engaged and asked many interesting questions. Here are the answers to them all.

Q. What is the biggest challenge when designing and operating an AI scale-out fabric?

A. The biggest challenge in designing and operating an AI scale-out fabric is achieving low latency and high bandwidth at scale. AI workloads, like training large neural networks, demand rapid, synchronized data transfers between thousands of GPUs or accelerators. This requires specialized interconnects and transports, such as RDMA, InfiniBand, or NVLink, and optimized topologies like fat-tree or dragonfly to minimize communication delays and bottlenecks.

Balancing scalability with performance is critical; as the system grows, maintaining consistent throughput and minimizing congestion becomes increasingly complex. Additionally, ensuring fault tolerance, power efficiency, and compatibility with rapidly evolving AI workloads adds to the operational challenges.

Unlike standard data center networks, AI fabrics handle intensive east-west traffic patterns that require purpose-built infrastructure. Effective software integration for scheduling and load balancing is equally essential. The need to align performance, cost, and reliability makes designing and managing an AI scale-out fabric a multifaceted and demanding task.
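
To make the "low latency and high bandwidth at scale" point in the answer above more concrete, here is a rough sketch of how the time for one synchronized gradient exchange might be estimated. It assumes a ring all-reduce and the simple latency-plus-bandwidth cost model; neither is specified in the webinar answer, and the function name, parameters, and numbers are all hypothetical illustrations rather than measurements.

```python
def ring_allreduce_seconds(num_gpus, message_bytes, link_bytes_per_s, hop_latency_s):
    """Estimate ring all-reduce time with a latency-plus-bandwidth model.

    Assumption (not from the source): a ring all-reduce runs 2 * (N - 1)
    steps, and in each step every GPU sends one chunk of size S / N.
    """
    if num_gpus < 2:
        return 0.0  # nothing to exchange with a single GPU
    steps = 2 * (num_gpus - 1)              # reduce-scatter + all-gather
    chunk_bytes = message_bytes / num_gpus  # data sent per GPU per step
    return steps * (hop_latency_s + chunk_bytes / link_bytes_per_s)


if __name__ == "__main__":
    # Hypothetical inputs: 1 GiB of gradients, 50 GB/s links, 5 microsecond hops.
    for n in (8, 64, 512, 4096):
        t = ring_allreduce_seconds(n, 2**30, 50e9, 5e-6)
        print(f"{n:5d} GPUs: {t * 1e3:.2f} ms")
```

In this model the bandwidth term levels off at roughly twice the message size divided by link speed, while the latency term keeps growing with the number of GPUs. That is one way to see why per-hop latency becomes harder to ignore as a fabric scales.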

Q. What are the most common misconceptions about AI scale-out fabrics?

Hidden Costs of AI Q&A

At our recent SNIA Networking Storage Forum webinar, “Addressing the Hidden Costs of AI,” our expert team explored the impacts of AI, including sustainability and areas where there are potentially hidden technical and infrastructure costs. If you missed the live event, you can watch it on-demand in the SNIA Educational Library. Questions from the audience ranged from training Large Language Models to fundamental infrastructure changes from AI and more. Here are answers to the audience’s questions from our presenters.

Q: Do you have an idea of where the best tradeoff is between the cost of high I/O speed and the cost of GPU working time? Is it always best to spend the maximum and get the highest I/O speed possible?

A: It depends on what you are trying to do. If you are training a Large Language Model (LLM), then you’ll have a large collection of GPUs communicating with one another regularly (e.g., All-reduce) and doing so at throughput rates of up to 900 GB/s per GPU. For this kind of use case, it makes sense to use the fastest network option available. Any money saved by using a cheaper, slightly less performant transport will be more than offset by the cost of GPUs that are idle while waiting for data.

If you are more interested in fine-tuning an existing model or using Retrieval Augmented Generation (RAG), then you won’t need quite as much network bandwidth and can choose a more economical connectivity option. It’s worth noting…
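
The tradeoff described in the answer above, network savings versus the cost of idle GPUs, can be sketched with a small back-of-the-envelope calculation. This is a minimal illustration under stated assumptions, not the presenters' method: it assumes communication time does not overlap with compute, that a faster network shortens only the communication time, and that GPUs are billed for the whole wall-clock time. All function names, parameters, and figures are hypothetical.

```python
def job_cost(num_gpus, gpu_cost_per_hour, compute_hours, comm_hours, network_cost):
    """Total cost of one job: GPU time (busy or idle) plus the network."""
    # Assumption: no compute/communication overlap, so GPUs sit idle
    # for the full communication time.
    wall_clock_hours = compute_hours + comm_hours
    return num_gpus * gpu_cost_per_hour * wall_clock_hours + network_cost


def compare_networks(num_gpus, gpu_cost_per_hour, compute_hours,
                     slow_comm_hours, slow_network_cost,
                     speedup, fast_network_cost):
    """Return (slow_total, fast_total) for a cheaper and a faster network."""
    slow = job_cost(num_gpus, gpu_cost_per_hour, compute_hours,
                    slow_comm_hours, slow_network_cost)
    # Assumption: the faster network divides communication time by `speedup`.
    fast = job_cost(num_gpus, gpu_cost_per_hour, compute_hours,
                    slow_comm_hours / speedup, fast_network_cost)
    return slow, fast


if __name__ == "__main__":
    # Hypothetical numbers chosen only to illustrate the shape of the tradeoff.
    slow, fast = compare_networks(
        num_gpus=1024, gpu_cost_per_hour=2.0, compute_hours=500,
        slow_comm_hours=200, slow_network_cost=100_000,
        speedup=2.0, fast_network_cost=250_000,
    )
    print(f"cheaper network total: {slow:,.0f}")
    print(f"faster network total:  {fast:,.0f}")
```

With a large GPU count, even a modest reduction in waiting time can outweigh a higher network price. With few GPUs or little communication, as in the fine-tuning and RAG cases mentioned above, the cheaper option can come out ahead.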

Storage for Automotive Q&A

At our recent SNIA Networking Storage Forum (NSF) webcast, “Revving up Storage for Automotive,” our expert presenters, Ryan Suzuki and John Kim, discussed storage implications as vehicles turn into data centers on wheels. If you missed the live event, it is available on-demand together with the presentation slides. Our audience asked several interesting questions on this quickly evolving industry. Here are John and Ryan’s answers to them.

Q: What do you think the current storage landscape is missing to support the future of IoV [Internet of Vehicles]? Are there any identified cases of missing features from storage (edge/cloud) which are preventing certain ideas from being implemented and deployed?

Revving Up Storage for Automotive

Each year cars become smarter and more automated. In fact, the automotive industry is effectively transforming the vehicle into a data center on wheels. Connectedness, autonomous driving, and media & entertainment all bring more and more storage onboard and into networked data centers. But all the storage in (and for) a car is not created equal. There are tens if not hundreds of different processors on a car today. Some are attached to storage, some are not, and each application demands different characteristics from the storage device. The SNIA Networking Storage Forum (NSF) is exploring this fascinating topic on December 7, 2021, at our live webcast “Revving Up Storage for Automotive,” where industry experts from both the storage and automotive worlds will discuss:

Storage for AI Q&A

What types of storage are needed for different aspects of AI? That was one of the many topics covered in our SNIA Networking Storage Forum (NSF) webcast “Storage for AI Applications.” It was a fascinating discussion and I encourage you to check it out on-demand. Our panel of experts answered many questions during the live roundtable Q&A. Here are answers to those questions, as well as the ones we didn’t have time to address.

Q. What are the different data set sizes and workloads in AI/ML in terms of data set size, sequential/random, and write/read mix?

A. Data sets vary enormously from use case to use case. They may range from GBs to possibly 100s of PB. In general, the workloads are very read-heavy, perhaps 95% reads or more. While it would be better to have sequential reads, in general the patterns tend to be closer to random. In addition, different use cases will have very different data sizes. Some may be GBs large, while others may be <1 KB. The different sizes have a direct impact on storage performance and may change how you decide to store the data.

Storage for Applications Webcast Series

Everyone enjoys having storage that is fast, reliable, scalable, and affordable. But it turns out different applications have different storage needs in terms of I/O requirements, capacity, data sharing, and security. Some need local storage, some need a centralized storage array, and others need distributed storage, which itself could be local or networked. One application might excel with block storage while another with file or object storage. For example, an OLTP database might require small amounts of very fast flash storage; a media or streaming application might need vast quantities of inexpensive disk storage with extra security safeguards; while a third application might require a mix of different storage tiers with multiple servers sharing the same data.

This SNIA Networking Storage Forum “Storage for Applications” webcast series will cover the storage requirements for specific uses such as artificial intelligence (AI), database, cloud, media & entertainment, automotive, edge, and more. With limited resources, it’s important to understand the storage intent of the applications in order to choose the right storage and storage networking strategy, rather than discovering the hard way that you’ve chosen the wrong solution for your application.

We kick off this series on October 5, 2020, with “Storage for AI Applications.” AI is a technology which itself encompasses a broad range of use cases, largely divided into training and inference.

Q&A on the Ethics of AI

Earlier this month, the SNIA Cloud Storage Technologies Initiative (CSTI) hosted an intriguing discussion on the Ethics of Artificial Intelligence (AI). Our experts, Rob Enderle, Founder of The Enderle Group, and Eric Hibbard, Chair of the SNIA Security Technical Work Group, shared their experiences and insights on what it takes to keep AI ethical. If you missed the live event, it is available on-demand along with the presentation slides at the SNIA Educational Library. As promised during the live event, our experts have provided written answers to the questions from this session, many of which we did not have time to get to.

Q. The webcast cited a few areas where AI as an attacker could make a potential cyber breach worse. Are there also some areas where AI as a defender could make cybersecurity or general welfare more dangerous for humans?

The Effort to Keep Artificial Intelligence Ethical

Artificial Intelligence (AI) technologies are possibly the most substantive and meaningful change to modern business. The ability to process large amounts of data with varying degrees of structure and form enables giant leaps in insight to drive revenue and profit. Likewise, governments and society have significant opportunity to improve the lives of the populace through AI. However, with the power that AI brings come the risks of any technology innovation. The SNIA Cloud Storage Technologies Initiative (CSTI) will explore some of the ethical issues that can arise from AI at our live webcast on March 16, 2021, “The Ethics of Artificial Intelligence.” Our expert speakers, Rob Enderle, President and Principal Analyst at The Enderle Group, and Eric Hibbard, Chair of the SNIA Security Technical Work Group, will join me for an interactive discussion on:

Keeping Up with 5G, IoT and Edge Computing

The broad adoption of 5G, Internet of Things (IoT), and edge computing will reshape the nature and role of enterprise and cloud storage over the next several years. What building blocks, capabilities, and integration methods are needed to make this happen? That will be the topic of discussion at our live SNIA Cloud Storage Technologies webcast on October 21, 2020, “Storage Implications at the Velocity of 5G Streaming.” Join my SNIA expert colleagues, Steve Adams and Chip Maurer, for a discussion on common questions surrounding this topic, including:

AIOps Q&A

Last month, the SNIA Cloud Storage Technologies Initiative was fortunate to have artificial intelligence (AI) expert Parviz Peiravi explore the topic of AI Operations (AIOps) at our live webcast, “IT Modernization with AIOps: The Journey.” Parviz explained why the journey to cloud native and microservices, and the complexity that comes along with it, requires a rethinking of enterprise architecture. If you missed the live presentation, it’s now available on demand together with the webcast slides. We had some interesting questions from our live audience. As promised, here are answers to them all:

Q. Can you please define the Data Lake and how it differs from other data storage models?