
This study examines the reliability of large language models (LLMs) as they grow larger and are trained to be more "instructable". The authors investigate three key aspects: difficulty concordance (whether LLMs make more errors on tasks humans perceive as difficult), task avoidance (whether LLMs decline to answer difficult questions), and prompting stability (how sensitive LLMs are to different phrasings of the same question). The research reveals a troubling trend: while larger, more instructable LLMs perform better on challenging tasks, their reliability on simpler tasks remains low, and they often give incorrect answers rather than declining to respond. This suggests a fundamental shift is needed in how these models are developed, so that their errors follow a predictable distribution, particularly in high-stakes areas where reliability is paramount.