


Meta GenAI Infra Blog Review // Special MLOps Podcast episode by Demetrios.
[00:00] Meta handles trillions of AI model executions
[07:01] Meta creating AGI, ethical and sustainable
[08:13] Concerns about energy use in training models
[12:22] Network, hardware, and job optimization for reliability
[17:21] Highlights of Arista and Nvidia hardware architecture
[20:11] Meta's clusters optimized for efficient fabric
[24:40] Varied steps, careful checkpointing in AI training
[28:46] Meta is maintaining huge GPU clusters for AI
[29:47] AI training is faster and more demanding
[35:27] Ops planner orchestrates a million operations and reduces maintenance
[37:15] Ops planner ensures safety and well-tested changes
By Demetrios
