


Meta GenAI Infra Blog Review // Special MLOps Podcast episode by Demetrios.
[00:00] Meta handles trillions of AI model executions
[07:01] Meta creating AGI, ethical and sustainable
[08:13] Concerns about energy use in training models
[12:22] Network, hardware, and job optimization for reliability
[17:21] Highlights of Arista and Nvidia hardware architecture
[20:11] Meta's clusters optimized for efficient fabric
[24:40] Varied steps, careful checkpointing in AI training
[28:46] Meta is maintaining huge GPU clusters for AI
[29:47] AI training is faster and more demanding
[35:27] Ops planner orchestrates a million operations and reduces maintenance
[37:15] Ops planner ensures safety and well-tested changes
By Demetrios
