As the last several years have shown, scaling up AI systems to train larger models with more parameters across more data is a very expensive proposition, and one that has made Nvidia fabulously rich.

But putting AI into production, whether at hyperscalers or at regular enterprises, is quite possibly going to be more expensive still, particularly as we move away from batch systems, up to human-machine interactions with GenAI systems, and all the way up to machine-machine, or agentic, AI inference.

The biggest bottlenecks in AI systems – compute, memory, and interconnect – are holding back both performance and profitability. These challenges are becoming increasingly apparent as we push the boundaries of AI capabilities.

Estimates from a simulator built by Ayar Labs suggest that the next generation of the GPT foundation model from OpenAI will include 32 different models with a total of 14 trillion parameters. No expected configuration of future iron based on Nvidia's “Rubin” GPU accelerators and improved versions of its existing copper-based NVSwitch interconnects will be able to sufficiently lower the cost of AI inference for this platform while also raising the interactivity of the inference to speeds that are suitable for agentic AI.

This is obviously a problem. If GenAI is to take hold, then something has got to give. And that something is very likely going to be electrical interconnections between AI accelerators and quite possibly even between those accelerators and their HBM stacked memory.

But how should AI accelerator architectures evolve to boost the performance of AI clusters to levels that make agentic AI economically – and therefore technically – feasible?

By Timothy Prickett Morgan
