The KickstartAI Podcast

Episode 4: VLAs: The GPT Moment for Robotics?


In this episode of the KickstartAI podcast, we explore Vision-Language-Action (VLA) models, the breakthrough technology powering the latest generation of AI robots.


  • The Figure AI Helix Breakthrough:

    - Figure AI's autonomous humanoid robots

    - How these robots complete complex tasks without teleoperation

    - The significance of robots coordinating actions using their brains (VLAs)


  • What are VLAs?:

    - How VLAs combine vision input, language instructions, and action prediction (see the sketch after this list)

    - The critical role of pre-training on internet-scale data

    - Why this represents a "GPT-2 moment" for robotics
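
    To make the vision + language → action idea concrete, here is a minimal sketch of the pattern: an image and an instruction are embedded into a shared space, a backbone fuses them, and the model decodes discrete action tokens that are mapped back to continuous robot commands. All names, shapes, and the random projections standing in for learned weights are illustrative assumptions, not the actual OpenVLA or Helix API.

    ```python
    # Minimal, illustrative sketch of the VLA pattern discussed above: an image and a
    # language instruction go in, a discretized action comes out. Random projections
    # stand in for learned weights; nothing here is OpenVLA's or Helix's real API.
    import numpy as np

    rng = np.random.default_rng(0)

    D = 256              # shared embedding width (illustrative)
    N_ACTION_BINS = 256  # actions are discretized into tokens, GPT-style
    ACTION_DIMS = 7      # e.g. 6-DoF end-effector delta + gripper open/close

    def encode_image(image: np.ndarray) -> np.ndarray:
        """Stand-in for a pretrained vision encoder: image -> a sequence of embeddings."""
        chunks = image.reshape(-1, 3 * 16 * 16)              # slice flattened pixels into patch-sized chunks
        W = rng.standard_normal((chunks.shape[1], D)) * 0.02
        return chunks @ W                                     # (num_chunks, D)

    def encode_instruction(text: str) -> np.ndarray:
        """Stand-in for a language tokenizer plus embedding table."""
        token_ids = [hash(word) % 32000 for word in text.lower().split()]
        embeddings = rng.standard_normal((32000, D)) * 0.02
        return embeddings[token_ids]                          # (num_tokens, D)

    def predict_action(image: np.ndarray, instruction: str) -> np.ndarray:
        """The VLA step: fuse vision and language context, decode one discrete action
        token per action dimension, then map the tokens back to continuous commands."""
        context = np.concatenate([encode_image(image), encode_instruction(instruction)])
        pooled = context.mean(axis=0)                         # stand-in for a transformer backbone
        head = rng.standard_normal((D, ACTION_DIMS * N_ACTION_BINS)) * 0.02
        logits = (pooled @ head).reshape(ACTION_DIMS, N_ACTION_BINS)
        action_tokens = logits.argmax(axis=-1)                # greedy decode of action tokens
        return (action_tokens / (N_ACTION_BINS - 1)) * 2 - 1  # de-tokenize to [-1, 1]

    if __name__ == "__main__":
        frame = rng.standard_normal((224, 224, 3))            # dummy camera frame
        print(predict_action(frame, "pick up the red cup"))
    ```

    The pre-training discussed above matters because the vision and language components arrive already grounded in internet-scale data; only the action head and the mapping to robot commands have to be learned from robot data.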


  • State of the art in VLAs:

    - Exploring Figure AI's SOTA Helix VLA architecture

    - The dual-system approach inspired by Kahneman's "Thinking, Fast and Slow"

    - How this works in practice (see the sketch below)
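
    As a rough illustration of the dual-system design: in Figure's public description, a large vision-language model ("System 2") runs at a low rate and distills the scene and instruction into a latent goal, while a much smaller visuomotor policy ("System 1") runs at roughly 200 Hz and turns that latent plus fresh observations into motor commands. The sketch below mirrors that split with toy components; the class names, rates, latent size, and action dimensionality are illustrative assumptions, not Figure's implementation.

    ```python
    # Rough sketch of the dual-system split described in the episode: a slow, large
    # "System 2" model produces a latent goal at a low rate, and a fast, small
    # "System 1" policy consumes the latest latent plus fresh observations to emit
    # motor commands at a much higher rate. All names, sizes, and rates below are
    # illustrative assumptions, not Figure's actual implementation.
    import numpy as np

    rng = np.random.default_rng(1)
    LATENT_DIM = 64        # size of the latent goal vector (illustrative)
    ACTION_DIM = 35        # e.g. upper-body joint targets (illustrative)

    class SlowSystem2:
        """Stand-in for the large vision-language model: scene + instruction -> latent goal."""
        def __call__(self, image: np.ndarray, instruction: str) -> np.ndarray:
            seed = abs(hash(instruction)) % (2**32)
            latent = np.random.default_rng(seed).standard_normal(LATENT_DIM)
            return latent + 0.01 * image.mean()               # trivially condition on the image too

    class FastSystem1:
        """Stand-in for the small visuomotor policy: (latent goal, proprioception) -> joint commands."""
        def __init__(self):
            self.W = rng.standard_normal((LATENT_DIM + ACTION_DIM, ACTION_DIM)) * 0.05

        def __call__(self, latent: np.ndarray, proprio: np.ndarray) -> np.ndarray:
            return np.tanh(np.concatenate([latent, proprio]) @ self.W)

    def control_loop(steps: int = 50, fast_hz: int = 200, slow_hz: int = 8) -> None:
        """Fast policy acts on every tick; slow model refreshes the latent every
        fast_hz // slow_hz ticks, so the robot keeps moving between 'thoughts'."""
        system2, system1 = SlowSystem2(), FastSystem1()
        proprio = np.zeros(ACTION_DIM)
        latent = None
        for t in range(steps):
            if t % (fast_hz // slow_hz) == 0:                 # slow update (~8 Hz in this toy setup)
                frame = rng.standard_normal((224, 224, 3))    # dummy camera frame
                latent = system2(frame, "hand the groceries to the other robot")
            action = system1(latent, proprio)                 # fast update (200 Hz in this toy setup)
            proprio = 0.9 * proprio + 0.1 * action            # toy integration of the command
        print("final commanded joint targets (first 5):", np.round(action[:5], 3))

    if __name__ == "__main__":
        control_loop()
    ```

    The point of the split is that the expensive language-grounded reasoning does not have to run at control rate; the fast policy only ever needs the most recent latent goal.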


  • The Path to Generalization:

    - Current capabilities in handling new objects and environments

    - The impressive coordination between multiple robots

    - Training with just 500 hours of teleoperated data


  • The Future of VLAs and humanoid robots:

    - The "Wozniacki Coffee Test" as the next major challenge

    - Novel data collection methods emerging in the industry

    - The potential market for household robot assistants


  • Links and References:

    - What started our awe: Figure AI Helix demo video

    - Stanford/Berkeley OpenVLA project

    - 1X Technologies' robot operator job posting
