
This research paper examines whether training large language models (LLMs) to be truthful could also make them politically biased, specifically toward liberal viewpoints. The researchers trained several models on datasets designed to teach truthfulness about everyday facts and scientific information, then evaluated them on a dataset of paired statements on political topics, where one statement in each pair leans left and the other leans right. Most models trained on the truthfulness datasets showed a left-leaning bias, and the effect was stronger in larger models. Pre-existing models trained on general human preferences showed a similar left-leaning bias, again more pronounced in larger models. This suggests that optimizing for truthfulness during training might unintentionally introduce a political slant. The researchers acknowledge, however, that datasets are an imperfect representation of truth and that political leanings are hard to define, and they call for further investigation into this relationship.
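To make the paired-statement test concrete, here is a minimal Python sketch of how a left-leaning preference could be measured. It is an illustration only: the summary does not give the paper's exact metric, so the function name `left_preference_rate`, the stand-in scorer `toy_score`, and the placeholder pairs are all assumptions, not the authors' code or data.

```python
from typing import Callable, Iterable, Tuple

# Each pair holds (left_leaning_statement, right_leaning_statement).
StatementPair = Tuple[str, str]


def left_preference_rate(
    pairs: Iterable[StatementPair],
    score: Callable[[str], float],
) -> float:
    """Return the fraction of pairs where `score` rates the left statement higher.

    `score` stands in for a trained model that assigns a number to a
    statement (hypothetical; any callable returning a float works here).
    A rate near 0.5 means no consistent preference; ties count as
    "not left-preferred".
    """
    pair_list = list(pairs)
    if not pair_list:
        raise ValueError("at least one statement pair is required")
    left_wins = sum(1 for left, right in pair_list if score(left) > score(right))
    return left_wins / len(pair_list)


def toy_score(statement: str) -> float:
    """Placeholder scorer (assumption): longer text gets a higher score."""
    return float(len(statement))


# Placeholder pairs, not real political statements.
example_pairs = [
    ("left statement, long version", "right statement"),
    ("left", "right statement, long version"),
    ("left statement, longest version", "right"),
]

rate = left_preference_rate(example_pairs, toy_score)
print(f"left-preferred in {rate:.0%} of pairs")
```

In a real evaluation, `toy_score` would be replaced by the model under test, and the rate would be compared across model sizes to see whether larger models lean further in one direction.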
https://arxiv.org/pdf/2409.05283v2