
In this video interview, Harmail Chatha, senior director of cloud computing operations at Nutanix, describes the growing challenges of managing data centers as business demands for enterprise AI applications climb.
Find more enterprise cloud news, feature stories and profiles at The Forecast.
Transcript:
Harmail Chatha: You have GPU clouds available like AWS, Google and Azure, but they’re kind of the true IaaS and PaaS platforms. Now you actually have clouds that offer you bare metal with GPUs. And what’s happening across the industry is there are a lot of companies deploying these in traditional data centers and taking up a lot of the power and space as well, because we’ve always had this concept of hyper-dense racks, where we optimize for vertical growth versus horizontal growth. But what’s happening within the data centers now is a lot of horizontal growth, because there’s not enough power available, not enough infrastructure available to support the power needs of GPU environments. And obviously cooling isn’t there as well. So I don’t think AI is necessarily pushing the limits within data centers yet, but it’s going to very, very soon if data centers don’t start to adapt to new technologies, new cooling infrastructure and the new power densities that are required as well. So I think we’re going to start seeing the limitations within data centers, but I don’t believe it’s there yet. But as more and more consumption goes in and customers identify workloads that they’re going to be running with AI, I definitely see it hitting a limit.
[Related: AI Lifecycle’s Impact on IT Infrastructure]
They’re at the heart of the power problem in the data centers right now as well, right? These newer generations of CPUs with GPUs are consuming anywhere from 30 to 50% more power within the servers. Hence, the industry is really taxed from a power consumption perspective. Whereas you could deploy a full rack of the older gear, now you can only deploy half a rack. So how do you solve that problem? Historically, we’ve been at 17.3 kilowatts per rack, fully maximizing the rack. Now, in our new design, it’s going to be 34-plus kilowatts per rack, with liquid cooling to the rack and ultimately getting to the chip as well. So we’re really at the onset of designing our data center of the future, because what’s legacy is not going to work any longer. It’s going to be super inefficient, with cooling challenges within the data center. Air cooling is not going to be enough with this new AI technology going in and the power consumption of the GPUs as well.
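The rack-density squeeze described above is simple arithmetic: a fixed per-rack power budget divided by a rising per-server draw. A minimal sketch, using illustrative per-server wattages (only the 17.3 kW and 34 kW rack budgets and the 30–50% uplift come from the interview; the server figures are assumptions):

```python
# Hypothetical sketch: how rising per-server power shrinks rack density.
# The 0.8 kW legacy-server draw is an assumed figure, not a Nutanix spec.

def servers_per_rack(rack_budget_kw: float, server_kw: float) -> int:
    """Number of servers a rack's power budget can support."""
    return int(rack_budget_kw // server_kw)

legacy_server_kw = 0.8                   # assumed draw of an older CPU-only server
gpu_server_kw = legacy_server_kw * 1.4   # ~40% more, midpoint of the 30-50% range

# At the historical 17.3 kW budget, GPU-dense servers fill far less of the rack.
old_budget_kw = 17.3
print(servers_per_rack(old_budget_kw, legacy_server_kw))  # 21 servers
print(servers_per_rack(old_budget_kw, gpu_server_kw))     # 15 servers

# Doubling the budget to 34 kW (enabled by liquid cooling) restores density.
print(servers_per_rack(34.0, gpu_server_kw))              # 30 servers
```

Under these assumptions the old budget holds roughly a third fewer GPU-class servers, which matches the "full rack versus half a rack" trade-off Chatha describes.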
[Related: Report Shows Enterprise AI Driving Big Investment Burst in Cloud Services]
Companies have to start really honing in, or zooming in, on their environments. Not so much holistically at a data center level, but: what does a workload look like, and how do you measure the emissions of that workload in itself? Right? And we’re just touching the surface on scope one and scope two. How do you really measure scope three, which is the most challenging one? It’s basically considered everything else beyond direct and indirect emissions; scope three is all-encompassing. How do you get to embodied emissions as well? As an industry, we’re not there yet on the embodied emissions of a server. So we’re talking about VMs and workloads, but embodied emissions means: what’s that single little cable within the system, the server itself, and how do you measure the emissions of that? There are thousands, hundreds of thousands of parts that go into a server. How do suppliers measure the transportation cost and the development cost of those components as well? So really it’s all about zooming in right now, right? As we continue to mature in this space, there’s a lot of effort that’s going to go into measuring, and there are so many companies, new startups, coming out that are starting to just touch the surface of how you measure emissions in itself.
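The embodied-emissions accounting Chatha describes is, in essence, a bottom-up sum over a server's bill of materials. A minimal sketch, where every component name and figure is an illustrative assumption rather than real supplier data:

```python
# Hypothetical sketch of bottom-up embodied-emissions accounting for a server:
# sum per-component manufacturing and transport footprints reported by suppliers.
# All component names and kg-CO2e figures below are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Component:
    name: str
    manufacturing_kg: float  # cradle-to-gate CO2e reported by the supplier
    transport_kg: float      # shipping CO2e to the assembly site
    quantity: int = 1

def embodied_kg(parts: list[Component]) -> float:
    """Total embodied CO2e for a bill of materials."""
    return sum((c.manufacturing_kg + c.transport_kg) * c.quantity for c in parts)

server_bom = [
    Component("chassis", 35.0, 4.0),
    Component("mainboard", 60.0, 1.5),
    Component("DRAM DIMM", 25.0, 0.2, quantity=8),
    Component("SATA cable", 0.3, 0.05, quantity=4),  # even a single cable counts
]

print(round(embodied_kg(server_bom), 2))  # 303.5
```

The hard part in practice is not the sum but the inputs: as the interview notes, suppliers must first report credible manufacturing and transport figures for each of the thousands of parts.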
[Related: Guiding Enterprise IT Hardware Buyers into the AI Future]
There’s a lot of interest from companies wanting to learn and understand sustainability. It’s no longer a hypothetical topic or a conversation; you kind of have to roll your sleeves up and take the initiative to first educate yourself: what is it that you need to do, understand the different scopes there are, and really start to measure what your footprint looks like. So what I’m seeing is, of course, a lot of interest in the industry. But for some of us, for example, who’ve been on this journey for the last three-plus years now, or some of the more mature companies that have been measuring their carbon footprint, we’ve been doing this at a very holistic data center level, at a building level, and then kind of have gone down to a customer’s environment level.
[Related: Get a Grip on Data Storage in Quest for Enterprise AI]
So we have data halls, we have cages, so we’re able to measure our footprint there. But now, as we announced just earlier today in the opening keynote is worse within (Nutanix) Prism Central application, we can measure the electrical consumption of a node, and ultimately you can get to a cluster. So we’re going a step deeper versus just being holistic at a data center level now. So this is a good step in the right direction, but where we ultimately need to go and continue to do more work is you got to get to the VM level. Once you can measure the vm, then you got to get to the workload level, and that’s when you’re going to be able to make smart and intelligent decisions on what a workload consumption looks like, correlate that back to the emission factor, and then intelligently you’re able to move those applications around to more sustainable data centers that might have lower P use more renewable energy as well. So I think that’s the journey we’re on. I’m glad we’re measuring at the node level, but ultimately we got to get to that VM and application level as well.
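The placement decision sketched above can be made concrete: once a workload's electricity draw is measurable, scale it by each candidate data center's PUE and grid emission factor, then pick the lowest total. A minimal sketch in which all site names, PUEs and grid factors are illustrative assumptions, not Nutanix data:

```python
# Hypothetical sketch: choose the most sustainable data center for a workload
# by combining measured IT energy with per-site PUE and grid emission factor.
# All site names and numbers below are illustrative assumptions.

def workload_emissions_kg(it_kwh: float, pue: float, grid_kg_per_kwh: float) -> float:
    """Estimated CO2e: IT energy scaled up by PUE, times the grid's carbon factor."""
    return it_kwh * pue * grid_kg_per_kwh

# (PUE, grid emission factor in kg CO2e per kWh) for each candidate site.
sites = {
    "dc-east":   (1.6, 0.45),  # air-cooled facility on a fossil-heavy grid
    "dc-west":   (1.2, 0.20),  # newer facility on a mixed grid
    "dc-nordic": (1.1, 0.02),  # liquid-cooled facility, mostly renewable grid
}

workload_kwh = 500.0  # energy measured at the VM/workload level, per the interview

greenest = min(sites, key=lambda s: workload_emissions_kg(workload_kwh, *sites[s]))
print(greenest)  # dc-nordic
```

Note the compounding effect: the low-PUE, low-carbon site wins by more than an order of magnitude here, which is why measuring down to the VM and workload level makes the migration decision worthwhile.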