February 18, 2026

Causal-JEPA: Learning World Models through Object-Level Latent Interventions

15 minutes

This paper introduces Causal-JEPA (C-JEPA), a world modeling framework that combines object-centric representations with a Joint Embedding Predictive Architecture (JEPA) to improve visual reasoning and robotic planning. During training, the model masks whole objects in latent space, so it must infer the state of each hidden object from the objects around it; in doing so it learns the causal interactions and dependencies between objects. Because predictions are made in a low-dimensional latent space rather than in pixel space, the approach avoids the computational cost of pixel-level reconstruction while still capturing the essential dynamics of the environment. Experiments on the CLEVRER and Push-T benchmarks show that C-JEPA improves counterfactual reasoning and planning efficiency over traditional patch-based models. The research concludes that treating objects as independent variables through structured masking provides a robust inductive bias for understanding complex, interactive scenes.
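
To make the idea of object-level latent masking concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names, the two-dimensional toy slots, the all-zero mask token, and the mean-of-visible-slots predictor are all assumptions made for illustration, standing in for a learned encoder and predictor. It shows only the shape of the training signal: hide whole object slots, predict them from the remaining ones, and score the prediction in latent space rather than in pixels.

```python
import random


def mask_object_slots(slots, mask_token, num_masked, rng):
    """Replace `num_masked` whole object slots with a shared mask token.

    Returns the masked copy and the sorted indices of the hidden slots.
    """
    masked_idx = sorted(rng.sample(range(len(slots)), num_masked))
    masked = [
        list(mask_token) if i in masked_idx else list(slot)
        for i, slot in enumerate(slots)
    ]
    return masked, masked_idx


def mean_context_predictor(masked_slots, masked_idx):
    """Toy stand-in for a learned predictor (an assumption, not C-JEPA's model).

    Fills each hidden slot with the mean of the visible slots.
    """
    visible = [s for i, s in enumerate(masked_slots) if i not in masked_idx]
    dim = len(masked_slots[0])
    mean = [sum(s[d] for s in visible) / len(visible) for d in range(dim)]
    return [
        list(mean) if i in masked_idx else list(slot)
        for i, slot in enumerate(masked_slots)
    ]


def latent_prediction_loss(predicted, target, masked_idx):
    """Mean squared error in latent space, computed only on the hidden slots."""
    total, count = 0.0, 0
    for i in masked_idx:
        for p, t in zip(predicted[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count


# Hypothetical scene: four objects, each encoded as a 2-dimensional latent slot.
rng = random.Random(0)
slots = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
mask_token = [0.0, 0.0]

masked, idx = mask_object_slots(slots, mask_token, num_masked=1, rng=rng)
pred = mean_context_predictor(masked, idx)
loss = latent_prediction_loss(pred, slots, idx)
print("hidden slots:", idx, "latent loss:", loss)
```

In a real system the predictor would be a trained network and the targets would come from an encoder, but the masking unit would still be an entire object rather than an image patch.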

By Enoch H. Kang