Storage for AI Q&A

What types of storage are needed for different aspects of AI? That was one of the many topics covered in our SNIA Networking Storage Forum (NSF) webcast “Storage for AI Applications.” It was a fascinating discussion and I encourage you to check it out on-demand. Our panel of experts answered many questions during the live roundtable Q&A. Here are answers to those questions, as well as the ones we didn’t have time to address.

Q. What are the different data set sizes and workloads in AI/ML, in terms of sequential/random access and write/read mix?

A. Data sets vary enormously from use case to use case, from GBs to possibly 100s of PB. In general, the workloads are very read-heavy, perhaps 95% reads or more. While sequential reads would be preferable, in practice the access patterns tend to be closer to random. In addition, different use cases have very different data sizes: some may be GBs, while others may be <1 KB. These sizes have a direct impact on storage performance and may change how you decide to store the data.

Read More

Automating Discovery for NVMe IP-based SANs

NVMe® IP-based SANs (including transports such as TCP, RoCE, and iWARP) have the potential to provide significant benefits in application environments ranging from the Edge to the Data Center. However, before we can fully unlock the potential of the NVMe IP-based SAN, we first need to address the manual, error-prone process currently used to establish connectivity between NVMe Hosts and NVM subsystems. In this process, administrators explicitly configure each Host to access the appropriate NVM subsystems in their environment. In addition, any time an NVM subsystem interface is added or removed, a Host administrator may need to explicitly update the configuration of the impacted Hosts to reflect the change. Because this configuration process is decentralized, using it to manage connectivity for more than a few Host and NVM subsystem interfaces is impractical, and it adds complexity when deploying an NVMe IP-based SAN in environments that require a high degree of automation.

Read More

Q&A (Part 2) from “Storage Trends for 2021 and Beyond” Webcast

Questions from “Storage Trends for 2021 and Beyond” Webcast Answered

This is part two of the Q&A portion of the roundtable talk between Rick Kutcipal, board director, SCSI Trade Association (STA); Jeff Janukowicz, research vice president at IDC; and Chris Preimesberger, former editor-in-chief of eWeek, in which they discussed the prominent data storage technologies shaping the market. If you missed this webcast, titled “Storage Trends for 2021 and Beyond,” it’s available on demand here.

Part One of the Q&A can be found at https://www.scsita.org/library/qa-part-1-from-storage-trends-for-2021-and-beyond-webcast/.

Read More

See You – Virtually – at SDC 2021

SNIA Storage Developer Conference goes virtual September 28-29, 2021, and compute, memory, and storage are important topics. SNIA Compute, Memory, and Storage Initiative is a sponsor of SDC 2021, so visit our booth for the latest information and a chance to chat with our experts. With over 120 sessions available to watch live during the event and later on-demand, live Birds of a Feather chats, and a Persistent Memory Bootcamp and Hackathon accessing new systems in the cloud, we want to make sure you don’t miss anything! Register here to see sessions live, or on demand on your own schedule. Agenda highlights include:

Read More

Storage at the Edge Q&A

The ability to run analytics from the data center to the Edge, where the data is generated and lives, creates new use cases for nearly every business. The impact of Edge computing on storage strategy was the topic at our recent SNIA Cloud Storage Technologies Initiative (CSTI) webcast, “Extending Storage to the Edge – How It Should Affect Your Storage Strategy.” If you missed the live event, it’s available on-demand. Our experts, Erin Farr, Senior Technical Staff Member, IBM Storage CTO Innovation Team, and Vincent Hsu, IBM Fellow, VP & CTO for Storage, received several interesting questions during the live event. As promised, here are answers to them all.

Q. What is the core principle of Edge computing technology?

A. Edge computing is an industry trend rather than a standardized architecture, though there are organizations like LF EDGE with the objective of establishing an open, interoperable framework. Edge computing is generally about moving workloads closer to where the data is generated and creating new, innovative workloads that take advantage of that proximity. Common principles often include the ability to manage Edge devices at scale, the use of open technologies to create portable solutions, and ultimately doing all of this with enterprise levels of security. Reference architectures exist for guidance, though implementations can vary greatly by industry vertical.

Q. We all know connectivity is not guaranteed – how does that affect these different use cases? What are the HA implications?

Read More

Next-generation Interconnects: The Critical Importance of Connectors and Cables

Modern data centers consist of hundreds of subsystems connected with optical transceivers, copper cables, and industry standards-based connectors. Escalating data demands are driving the throughput of these interconnects up rapidly, which makes the maximum reach of copper cabling very short. At the same time, data centers are expanding in size, with nodes stretching further apart. Together, these trends are making longer-reach optical technologies much more popular. However, optical interconnect technologies are more costly and complex than copper, and they come with many new buzzwords and technology concepts. The vast uptick in data demand is accelerating new product development at an incredible pace. While much of the enterprise is still on 10/40/100GbE and 128GFC speeds, the optical standards bodies are beginning to deliver 800G, with 1.6Tb transceivers in discussion! The introduction of new technologies creates a paradigm shift that requires changes and adjustments throughout the network.

Read More

Genomics Compute, Storage & Data Management Q&A

Everyone knows data is growing at exponential rates. In fact, the numbers can be mind-numbing. That’s certainly the case when it comes to genomic data, where 40,000PB of storage each year will be needed by 2025. Understanding, managing and storing this massive amount of data was the topic at our SNIA Cloud Storage Technologies Initiative webcast “Moving Genomics to the Cloud: Compute and Storage Considerations.” If you missed the live presentation, it’s available on-demand along with presentation slides. Our live audience asked many interesting questions during the webcast, but we did not have time to answer them all. As promised, our experts, Michael McManus, Torben Kling Petersen and Christopher Davidson, have answered them all here.

Q. Human genomes differ by only 1% or so, so there’s an immediate 100x improvement in terms of data compression: 2743EB could become 27430PB, which is 2.743M HDDs of 10TB each. With ~200 countries for the 7.8B people, and 10 sequencing centers per country on average, each center would need a mere 1.4K HDDs. Is there really a big challenge here?

A. The problem is not that simple, unfortunately. The location and the size of genetic differences vary a lot across people. Still, there are compression methods like CRAM and PetaGene that can save a lot of space. Also consider all of the sequencing for rare disease, cancer, single cell sequencing, etc., plus sequencing for agricultural products.

Q. What’s the best compression ratio for human genome data?

Read More
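The back-of-the-envelope arithmetic in the first audience question above can be reproduced with a short sketch. All inputs are the figures quoted by the questioner (2743EB of raw data, a hoped-for 100x compression, 10TB drives, ~200 countries, 10 sequencing centers each). The decimal unit conversions and the variable names are assumptions made for illustration. As the experts' answer notes, the uniform 100x ratio is the questioner's simplification, not an achievable figure.

```python
# Reproduces the questioner's arithmetic; every input is a figure quoted in
# the question, not measured data. Decimal units are assumed (1 EB = 1000 PB).
PB_PER_EB = 1000
TB_PER_PB = 1000

raw_eb = 2743              # raw genomic data cited in the question, in EB
compression_ratio = 100    # the questioner's assumed "100x" improvement
hdd_capacity_tb = 10       # capacity of one drive, in TB
countries = 200            # "~200 countries"
centers_per_country = 10   # assumed average number of sequencing centers

compressed_pb = raw_eb * PB_PER_EB / compression_ratio
total_hdds = compressed_pb * TB_PER_PB / hdd_capacity_tb
hdds_per_center = total_hdds / (countries * centers_per_country)

print(f"Compressed data: {compressed_pb:,.0f} PB")
print(f"Drives needed:   {total_hdds:,.0f}")
print(f"Per center:      {hdds_per_center:,.1f}")
```

The per-center result matches the "1.4K HDDs" in the question, so the arithmetic itself holds. The experts' objection is to the assumed compression ratio and to the workloads the estimate leaves out.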

Demystifying the Fibre Channel SAN Protocol

Ever wonder how Fibre Channel (FC) hosts and targets really communicate? Join the SNIA Networking Storage Forum (NSF) on September 23, 2021 for a live webcast, “How Fibre Channel Hosts and Targets Really Communicate.” This SAN overview will dive into details on how initiators (hosts) and targets (storage arrays) communicate and will address key questions, like:
  • How do FC links activate?
  • Is FC routable?
  • What kind of flow control is present in FC?
  • How do initiators find targets and set up their communication?
  • Finally, how does actual data get transferred between initiators and targets, since that is the ultimate goal?
Read More

Storage for Applications Webcast Series

Everyone enjoys having storage that is fast, reliable, scalable, and affordable. But it turns out different applications have different storage needs in terms of I/O requirements, capacity, data sharing, and security. Some need local storage, some need a centralized storage array, and others need distributed storage, which itself could be local or networked. One application might excel with block storage, while another does better with file or object storage. For example, an OLTP database might require small amounts of very fast flash storage; a media or streaming application might need vast quantities of inexpensive disk storage with extra security safeguards; and a third application might require a mix of different storage tiers with multiple servers sharing the same data. This SNIA Networking Storage Forum “Storage for Applications” webcast series will cover the storage requirements for specific uses such as artificial intelligence (AI), database, cloud, media & entertainment, automotive, edge, and more. With limited resources, it’s important to understand the storage intent of the applications in order to choose the right storage and storage networking strategy, rather than discovering the hard way that you’ve chosen the wrong solution for your application. We kick off this series on October 5, 2020 with “Storage for AI Applications.” AI is a technology which itself encompasses a broad range of use cases, largely divided into training and inference.

Read More

Can Cloud Storage and Big Data Live Happily Ever After?

“Big Data” has pushed the storage envelope, creating a seemingly perfect relationship with Cloud Storage. But local storage is the third wheel in this relationship, and won’t go down easy. Can this marriage survive when Big Data is being pulled in two directions? Should Big Data pick one, or can the three of them live happily ever after? This will be the topic of discussion on October 21, 2021 at our live SNIA Cloud Storage Technologies webcast, “Cloud Storage and Big Data, A Marriage Made in the Clouds.” Join us as our SNIA experts cover:

Read More