Grey Beards on Systems

168: GreyBeards Year End 2024 podcast



It’s time once again for our annual YE GBoS podcast. This year we have Howard back making a guest appearance, with our usual cast of Jason and Keith in attendance. And the topic du jour seemed to be AI rolling out to the enterprise and everywhere else in the IT world.

We led off with the same topic as last year, AI (again). But back then it was all about new announcements, new capabilities and new functionality. This year it’s all about taking AI tools and functionality and making them available to help optimize how organizations operate.

We talked some about RAG and chatbots, but these seemed almost old school.

Agentic AI

Keith mentioned Agentic AI, which purports to improve businesses by removing or optimizing intermediate steps in business processes. If one can improve human and business productivity by 10%, the impact on the US and world economies would be staggering.

And we’re not just talking about knowledge summarization, curation, or discussion. Agentic AI takes actions that would previously have been done by a human, if done at all.

Manufacturers could use AI agents to forecast sales, allowing the business to optimize inventory positioning to better address customer needs.

Most, if not all, businesses have elaborate procedures which require a certain amount of human hand holding. Reducing that hand holding, even a little bit, with AI agents that never sleep and can occasionally be trained to do better could seriously help the bottom and top lines of any organization.

We can see evidence of Agentic AI proliferating in SaaS solutions, e.g., Salesforce, SAP, Oracle and others are all spinning out Agentic AI services.

I think it was Jason that mentioned GEICO, a US insurance company, is re-factoring, re-designing and re-implementing all their applications to take advantage of Agentic AI and other AI options. 

AI’s impact on HW & SW infrastructure

The AI rollout is having dramatic impacts on both software and hardware infrastructure. For example, customers are building their own OpenStack clouds to support AI training and inferencing.

Keith mentioned that AWS just introduced S3 Tables, a fully managed service meant to store and analyze massive amounts of tabular data for analytics. Howard mentioned that AWS’s S3 Tables had to make a number of tradeoffs to use immutable S3 object storage. VAST’s Parquet database provides the service without using immutable objects.

Software impacts are immense as AI becomes embedded in more and more applications and system infrastructure. But AI’s hardware impacts may be even more serious.

Howard made mention of the power zero sum game, meaning that most data centers have a limited amount of power they can supply. Any power saved from other IT activities is immediately put to use to supply more power to AI training and inferencing.

Most IT racks today support equipment that consumes 10-20kW of power. AI servers will require much more.

Jason mentioned one 6U server with 8 GPUs that costs on the order of 1 Ferrari ($250K US) and draws 10kW of power, with each GPU having two 400GbE links, not to mention the server itself having two 400GbE links. So a single 6U (GPU) server has eighteen 400GbE links and could need 7.2Tb/s of bandwidth.

It’s unclear how many of these one could put in a rack, but my guess is it’s not going to be fully populated. Six of these servers would need over 43Tb/s of bandwidth and 60kW of power, and that’s not counting the networking and other infrastructure required to support all that bandwidth.
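For anyone who wants to redo this back-of-the-envelope math with their own numbers, here is a minimal Python sketch. The per-server figures are the ones quoted above from the podcast; the function and constant names are made up for illustration, and the power total covers only the servers, not switches or cooling.

```python
# Rough rack math using the figures quoted in the episode.
# All names here are illustrative, not from any vendor tool.
GPUS_PER_SERVER = 8        # GPUs in one 6U server
LINKS_PER_GPU = 2          # 400GbE links per GPU
LINKS_PER_HOST = 2         # 400GbE links for the server itself
LINK_GBPS = 400            # speed of each link in Gb/s
SERVER_KW = 10             # power draw of one server in kW


def links_per_server() -> int:
    """Total 400GbE links on one GPU server."""
    return GPUS_PER_SERVER * LINKS_PER_GPU + LINKS_PER_HOST


def rack_totals(servers: int) -> tuple[int, float, int]:
    """Return (links, bandwidth in Tb/s, server power in kW) for a rack."""
    links = servers * links_per_server()
    bandwidth_tbps = links * LINK_GBPS / 1000
    power_kw = servers * SERVER_KW
    return links, bandwidth_tbps, power_kw


if __name__ == "__main__":
    for count in (1, 6):
        links, tbps, kw = rack_totals(count)
        print(f"{count} server(s): {links} links, {tbps:.1f} Tb/s, {kw} kW")
```

Changing the constants at the top is enough to try a different server configuration or link speed.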

Speaking of other infrastructure, cooling is the other side of this power problem. It’s just thermodynamics: power use generates heat, and that heat needs to be disposed of. And with 10kW servers we are talking a lot of heat. Jason mentioned that at this year’s SC24 conference, the whole floor was showing off liquid cooling. Liquid cooling was also prominent at OCP.

At the OCP summit this year Microsoft was talking about deploying 150kW racks in the near term and 1MW racks down the line. AI’s power needs are why organizations around the world are building out new data centers in out of the way places that just so happen to have power and cooling nearby.

Organizations have an insatiable appetite for AI training data. And good (training) data is getting harder to find. Solidigm’s latest 122TB SSD may be coming along just when the data needs for AI are starting to take off.

SCI is pivoting

We could have gone on for hours on AI’s impact on IT infrastructure, but I had an announcement to make.

Silverton Consulting will be pivoting away from storage to a new opportunity that is based in space. I discuss this on SCI’s website, but the opportunities for LEO and beyond services are just exploding these days and we want to be a part of that.

What that means for GBoS is TBD. But we may be transitioning to something broader than just storage. But heck, we have been doing that for years.

Stay tuned, it’s going to be one hell of a ride.

Jason Collier, Principal Member Of Technical Staff at AMD, Data Center and Embedded Solutions Business Group

Jason Collier (@bocanuts) is a long time friend, technical guru and innovator who has over 25 years of experience as a serial entrepreneur in technology.

He was founder and CTO of Scale Computing and has been an innovator in the field of hyperconvergence and an expert in virtualization, data storage, networking, cloud computing, data centers, and edge computing for years.

He’s on LinkedIn. He’s currently working with AMD on new technology and he has been a GreyBeards on Storage co-host since the beginning of 2022.

Howard Marks, Technologist Extraordinary and Plenipotentiary at VAST Data

Howard Marks is Technologist Extraordinary and Plenipotentiary at VAST Data, where he explains engineering to customers and customer requirements to engineers.

Before joining VAST, Howard was an independent consultant, analyst, and journalist, writing three books and over 200 articles on network and storage topics since 1987. Most significantly, he is a founding co-host of the Greybeards on Storage podcast.

Keith Townsend, President of The CTO Advisor, a Futurum Group Company

Keith Townsend (@CTOAdvisor) is an IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations. Keith is the co-founder of The CTO Advisor, blogs at Virtualized Geek, and can be found on LinkedIn.

Grey Beards on Systems, by Ray Lucchesi and others
