The KickstartAI Podcast

Episode 4: VLAs: The GPT Moment for Robotics?


In this episode of the KickstartAI podcast, we explore Vision-Language-Action (VLA) models, the breakthrough technology powering the latest generation of AI robots.


  • The Figure AI Helix Breakthrough:

    - Figure AI's autonomous humanoid robots

    - How these robots complete complex tasks without teleoperation

    - The significance of robots coordinating actions using their brains (VLAs)


  • What are VLAs?:

    - How VLAs combine vision input, language instructions, and action prediction (see the sketch after this list)

    - The critical role of pre-training on internet-scale data

    - Why this represents a "GPT-2 moment" for robotics
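
    To make the vision + language → action idea concrete, here is a minimal sketch of the pattern: an image and an instruction are embedded into a shared space, a backbone fuses them, and the model decodes discrete action tokens that are mapped back to continuous robot commands. All names, shapes, and the random projections standing in for learned weights are illustrative assumptions, not the actual OpenVLA or Helix API.

    ```python
    # Minimal, illustrative sketch of the VLA pattern discussed above: an image and a
    # language instruction go in, a discretized action comes out. Random projections
    # stand in for learned weights; nothing here is OpenVLA's or Helix's real API.
    import numpy as np

    rng = np.random.default_rng(0)

    D = 256              # shared embedding width (illustrative)
    N_ACTION_BINS = 256  # actions are discretized into tokens, GPT-style
    ACTION_DIMS = 7      # e.g. 6-DoF end-effector delta + gripper open/close

    def encode_image(image: np.ndarray) -> np.ndarray:
        """Stand-in for a pretrained vision encoder: image -> a sequence of embeddings."""
        chunks = image.reshape(-1, 3 * 16 * 16)              # slice flattened pixels into patch-sized chunks
        W = rng.standard_normal((chunks.shape[1], D)) * 0.02
        return chunks @ W                                     # (num_chunks, D)

    def encode_instruction(text: str) -> np.ndarray:
        """Stand-in for a language tokenizer plus embedding table."""
        token_ids = [hash(word) % 32000 for word in text.lower().split()]
        embeddings = rng.standard_normal((32000, D)) * 0.02
        return embeddings[token_ids]                          # (num_tokens, D)

    def predict_action(image: np.ndarray, instruction: str) -> np.ndarray:
        """The VLA step: fuse vision and language context, decode one discrete action
        token per action dimension, then map the tokens back to continuous commands."""
        context = np.concatenate([encode_image(image), encode_instruction(instruction)])
        pooled = context.mean(axis=0)                         # stand-in for a transformer backbone
        head = rng.standard_normal((D, ACTION_DIMS * N_ACTION_BINS)) * 0.02
        logits = (pooled @ head).reshape(ACTION_DIMS, N_ACTION_BINS)
        action_tokens = logits.argmax(axis=-1)                # greedy decode of action tokens
        return (action_tokens / (N_ACTION_BINS - 1)) * 2 - 1  # de-tokenize to [-1, 1]

    if __name__ == "__main__":
        frame = rng.standard_normal((224, 224, 3))            # dummy camera frame
        print(predict_action(frame, "pick up the red cup"))
    ```

    The pre-training discussed above matters because the vision and language components arrive already grounded in internet-scale data; only the action head and the mapping to robot commands have to be learned from robot data.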


  • State of the art in VLAs:

    - Exploring Figure AI's SOTA Helix VLA architecture

    - The dual-system approach inspired by Kahneman's "Thinking, Fast and Slow"

    - How this works in practice (see the sketch below)
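
    As a rough illustration of the dual-system design: in Figure's public description, a large vision-language model ("System 2") runs at a low rate and distills the scene and instruction into a latent goal, while a much smaller visuomotor policy ("System 1") runs at roughly 200 Hz and turns that latent plus fresh observations into motor commands. The sketch below mirrors that split with toy components; the class names, rates, latent size, and action dimensionality are illustrative assumptions, not Figure's implementation.

    ```python
    # Rough sketch of the dual-system split described in the episode: a slow, large
    # "System 2" model produces a latent goal at a low rate, and a fast, small
    # "System 1" policy consumes the latest latent plus fresh observations to emit
    # motor commands at a much higher rate. All names, sizes, and rates below are
    # illustrative assumptions, not Figure's actual implementation.
    import numpy as np

    rng = np.random.default_rng(1)
    LATENT_DIM = 64        # size of the latent goal vector (illustrative)
    ACTION_DIM = 35        # e.g. upper-body joint targets (illustrative)

    class SlowSystem2:
        """Stand-in for the large vision-language model: scene + instruction -> latent goal."""
        def __call__(self, image: np.ndarray, instruction: str) -> np.ndarray:
            seed = abs(hash(instruction)) % (2**32)
            latent = np.random.default_rng(seed).standard_normal(LATENT_DIM)
            return latent + 0.01 * image.mean()               # trivially condition on the image too

    class FastSystem1:
        """Stand-in for the small visuomotor policy: (latent goal, proprioception) -> joint commands."""
        def __init__(self):
            self.W = rng.standard_normal((LATENT_DIM + ACTION_DIM, ACTION_DIM)) * 0.05

        def __call__(self, latent: np.ndarray, proprio: np.ndarray) -> np.ndarray:
            return np.tanh(np.concatenate([latent, proprio]) @ self.W)

    def control_loop(steps: int = 50, fast_hz: int = 200, slow_hz: int = 8) -> None:
        """Fast policy acts on every tick; slow model refreshes the latent every
        fast_hz // slow_hz ticks, so the robot keeps moving between 'thoughts'."""
        system2, system1 = SlowSystem2(), FastSystem1()
        proprio = np.zeros(ACTION_DIM)
        latent = None
        for t in range(steps):
            if t % (fast_hz // slow_hz) == 0:                 # slow update (~8 Hz in this toy setup)
                frame = rng.standard_normal((224, 224, 3))    # dummy camera frame
                latent = system2(frame, "hand the groceries to the other robot")
            action = system1(latent, proprio)                 # fast update (200 Hz in this toy setup)
            proprio = 0.9 * proprio + 0.1 * action            # toy integration of the command
        print("final commanded joint targets (first 5):", np.round(action[:5], 3))

    if __name__ == "__main__":
        control_loop()
    ```

    The point of the split is that the expensive language-grounded reasoning does not have to run at control rate; the fast policy only ever needs the most recent latent goal.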


  • The Path to Generalization:

    - Current capabilities in handling new objects and environments

    - The impressive coordination between multiple robots

    - Training with just 500 hours of teleoperated data


  • The Future of VLAs and humanoid robots:

    - The "Wozniacki Coffee Test" as the next major challenge

    - Novel data collection methods emerging in the industry

    - The potential market for household robot assistants


  • Links and References:

    - What started our awe: Figure AI Helix demo video

    - Stanford/Berkeley OpenVLA project

    - 1X Technologies' robot operator job posting
