The New Stack Podcast

Keeping GPUs Ticking Like Clockwork


Listen Later

Clockwork began with a narrow goal—keeping clocks synchronized across servers—but soon realized that its precise latency measurements could reveal deeper data center networking issues. This insight led the company to build a hardware-agnostic monitoring and remediation platform capable of automatically routing around faults. Today, Clockwork’s technology is especially valuable for large GPU clusters used in training LLMs, where communication efficiency and reliability are critical. CEO Suresh Vasudevan explains that AI workloads are among the most demanding distributed applications ever, and Clockwork provides building blocks that improve visibility, performance and fault tolerance. Its flagship feature, FleetIQ, can reroute traffic around failing switches, preventing costly interruptions that might otherwise force teams to restart training from hours-old checkpoints. Although the company originated from Stanford research focused on clock synchronization for financial institutions, the team eventually recognized that packet-timing data could underpin powerful network telemetry and dynamic traffic control. By integrating with NVIDIA NCCL, TCP and RDMA libraries, Clockwork can not only measure congestion but also actively manage GPU communication to enhance both uptime and training efficiency. 

Learn more from The New Stack about the latest in Clockwork: 

Clockwork’s FleetIQ Aims To Fix AI’s Costly Network Bottleneck 

What Happens When 116 Makers Reimagine the Clock? 

Join our community of newsletter subscribers to stay on top of the news and at the top of your game. 

 

...more
View all episodesView all episodes
Download on the App Store

The New Stack PodcastBy The New Stack

  • 4.3
  • 4.3
  • 4.3
  • 4.3
  • 4.3

4.3

31 ratings


More shows like The New Stack Podcast

View all
Freakonomics Radio by Freakonomics Radio + Stitcher

Freakonomics Radio

32,246 Listeners

The Joe Rogan Experience by Joe Rogan

The Joe Rogan Experience

229,674 Listeners

The Tim Ferriss Show by Tim Ferriss: Bestselling Author, Human Guinea Pig

The Tim Ferriss Show

16,174 Listeners

The New Stack Analysts by The New Stack

The New Stack Analysts

9 Listeners

The New Stack @ Scale by The New Stack

The New Stack @ Scale

3 Listeners

Software Engineering Radio - the podcast for professional software developers by team@se-radio.net (SE-Radio Team)

Software Engineering Radio - the podcast for professional software developers

273 Listeners

Pivot by New York Magazine

Pivot

9,724 Listeners

The a16z Show by Andreessen Horowitz

The a16z Show

1,105 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

626 Listeners

The Reasoning Show by Massive Studios

The Reasoning Show

154 Listeners

The New Stack Context by The New Stack

The New Stack Context

4 Listeners

DevOps Paradox by Darin Pope & Viktor Farcic

DevOps Paradox

25 Listeners

All-In with Chamath, Jason, Sacks & Friedberg by All-In Podcast, LLC

All-In with Chamath, Jason, Sacks & Friedberg

10,254 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

551 Listeners

Hard Fork by The New York Times

Hard Fork

5,576 Listeners

The Rest Is History by Goalhanger

The Rest Is History

15,506 Listeners