
This work introduces a novel approach and a new benchmark for advancing embodied AI agents operating in 3D environments. The proposed model, 3DLLM-MEM, is designed with a dual-memory system, combining a limited working memory with a long-term episodic memory using dense 3D representations to handle complex tasks requiring spatial-temporal reasoning and interaction with objects across multiple rooms over extended periods. The researchers also present 3DMEM-BENCH, a comprehensive benchmark featuring various tasks, including embodied tasks, question answering, and captioning, specifically curated to evaluate the ability of these AI agents to maintain and utilize long-term spatial-temporal memory in realistic settings. Experiments on this benchmark demonstrate that 3DLLM-MEM significantly outperforms existing methods, particularly on challenging tasks requiring robust long-term memory capabilities.
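To make the dual-memory idea described above more concrete, here is a minimal Python sketch. It is not the 3DLLM-MEM implementation: the class names (`Observation`, `DualMemory`), the capacity setting, the short feature vectors standing in for dense 3D representations, and the cosine-similarity recall rule are all assumptions chosen for illustration. It only shows the general pattern of a small working memory that spills older observations into a growing episodic store, which can later be queried.

```python
import math
from collections import deque
from dataclasses import dataclass


@dataclass
class Observation:
    step: int             # time step at which the agent saw this
    room: str             # hypothetical room label
    feature: list         # toy stand-in for a dense 3D feature vector


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class DualMemory:
    """Illustrative only: bounded working memory plus unbounded episodic memory."""

    def __init__(self, working_capacity=2):
        self.capacity = working_capacity
        self.working = deque()   # most recent observations
        self.episodic = []       # everything that has left working memory

    def observe(self, obs):
        # New observations enter working memory; the oldest ones
        # are moved to episodic memory once capacity is exceeded.
        self.working.append(obs)
        while len(self.working) > self.capacity:
            self.episodic.append(self.working.popleft())

    def recall(self, query, k=1):
        # Assumed retrieval rule: rank episodic entries by cosine
        # similarity to the query feature and return the top k.
        ranked = sorted(
            self.episodic,
            key=lambda o: cosine(query, o.feature),
            reverse=True,
        )
        return ranked[:k]


mem = DualMemory(working_capacity=2)
mem.observe(Observation(0, "kitchen", [1.0, 0.0]))
mem.observe(Observation(1, "hallway", [0.0, 1.0]))
mem.observe(Observation(2, "bedroom", [0.7, 0.7]))
mem.observe(Observation(3, "bedroom", [0.6, 0.8]))

# Working memory now holds steps 2 and 3; steps 0 and 1 are episodic.
best = mem.recall([0.9, 0.1], k=1)[0]
print(best.room)  # kitchen
```

In this toy version, the agent can still recover what it saw in the kitchen after moving through two more rooms, which is the kind of long-horizon recall the benchmark tasks are said to require.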