


This academic article investigates the emergence of social conventions in populations of large language models (LLMs), asking whether these AI agents can establish shared behaviors through repeated interaction. The research demonstrates that LLM populations spontaneously develop such conventions, much as human groups do, and that collective biases can arise during this process even when no individual agent is biased. The study also examines the influence of small groups of adversarial agents, showing that once they reach a critical mass they can overturn the conventions established by the wider LLM population. These findings highlight the potential for AI systems to autonomously develop norms, with implications for designing aligned and robust AI systems.
By Enoch H. Kang