March 01, 2026

EP107: DeepMind’s SIMA 2 Masters Unseen Video Games

22 minutes

SIMA 2 (Scalable Instructable Multiworld Agent) is a generalist AI agent developed by Google DeepMind that navigates and operates within 3D virtual worlds. By integrating Gemini models, SIMA 2 evolves beyond its predecessor's basic instruction-following abilities, acting as an interactive companion that can reason about its goals, converse with users, and explain the steps it intends to take.

The core advancements of SIMA 2 include:

Advanced Reasoning and Multimodal Understanding: It can interpret high-level user intent, process abstract logical commands, and understand multimodal inputs, including on-screen user sketches, different languages, and emojis.
Broad Generalization: SIMA 2 can transfer conceptual knowledge across environments (for example, applying the concept of "mining" from one game to "harvesting" in another). This lets it perform tasks in games it was never trained on, such as ASKA and MineDojo, as well as in newly imagined worlds generated in real time by the Genie 3 model.
Autonomous Self-Improvement: The agent uses an iterative improvement cycle in which it learns through trial and error and from feedback provided by Gemini. This allows SIMA 2 to master increasingly complex tasks in new worlds entirely through self-directed play, without additional human-generated data or intervention.

SIMA 2 still faces research challenges, such as handling long-horizon tasks that require extensive multi-step reasoning, overcoming short memory limits, and executing precise low-level actions. Even so, it represents a major milestone toward Artificial General Intelligence (AGI). The competencies it develops in virtual gaming environments, like navigation and collaborative task execution, serve as foundational building blocks for embodied AI and physical robotics.


By Yun Wu
