AI Pulse

AI Pulse - Thursday, March 14th 2024



In today's episode, we cover the following papers: First, we discuss Google DeepMind's Gemma family of open language models, including the 7 billion and 2 billion parameter versions, which demonstrate strong performance across various language understanding, reasoning, and safety benchmarks. We then explore key findings and rules of thumb for continually pre-training large language models, such as re-warming and re-decaying learning rates, using replay data, and employing infinite learning rate schedules. Next, we overview the VLOGGER system for generating photorealistic videos of humans talking and moving based solely on audio or text input and a single image, highlighting its novel technical innovations and potential applications. We also summarize the TRUMANS dataset and a diffusion-based model for synthesizing realistic human-scene interactions, enabling controllable generation of human motions adhering to scene geometry and specified actions. Finally, we examine the SOTOPIA-π method for improving the social intelligence of language agents through interactive learning, behavior cloning from GPT-4, and self-reinforcement on positive examples rated by GPT-4, while discussing its limitations and the need for robust evaluation beyond LLM ratings.

AI Pulse by Pod Genie