The Gist Talk

From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence



Modern AI research is increasingly shifting its focus from model architecture to data selection, yet traditional information theory often fails to explain why certain datasets yield superior out-of-distribution generalization. This paper introduces epiplexity, a metric that quantifies the structural information an observer with limited computational resources can extract from data. By accounting for computational constraints, the authors resolve paradoxes that classical theory leaves open: classical measures assign the same information content to a text and its reversal, yet LLMs learn better from text presented in one direction than in the other. Their findings show that high-epiplexity data, such as natural language, contains rich, reusable patterns that are more valuable for training than high-entropy but unstructured data such as random pixels. Ultimately, the study argues that emergence and induction in AI result from models developing complex internal programs that shortcut otherwise intractable computations. This framework provides a theoretical and empirical foundation for identifying the most informative data and thereby improving how machines learn and generalize.
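The paper's formal definition of epiplexity is not reproduced in this summary, but the intuition behind the natural-language-versus-random-pixels contrast can be sketched with a toy proxy: treat an off-the-shelf compressor as a stand-in for a computationally bounded observer, and measure how much structure it can actually extract. Everything below (the use of zlib as the "bounded observer", the `bounded_structure_proxy` helper, and the sample inputs) is an illustrative assumption, not the authors' metric.

```python
import os
import zlib

def bounded_structure_proxy(data: bytes) -> float:
    """Toy proxy for extractable structure: the fraction of the input
    a bounded observer (here, zlib at max effort) can explain away.
    Returns a value in [0, 1]; 0 means no structure was found."""
    compressed = zlib.compress(data, 9)
    return max(0, len(data) - len(compressed)) / len(data)

# Highly patterned text: low entropy, but rich in reusable structure.
structured = b"the cat sat on the mat because the cat liked the mat. " * 40
# Random bytes: maximal entropy, but no structure a bounded observer can exploit.
random_noise = os.urandom(len(structured))

print(f"structured text: {bounded_structure_proxy(structured):.2f}")  # close to 1
print(f"random noise:    {bounded_structure_proxy(random_noise):.2f}")  # close to 0
```

Running this shows the asymmetry the summary describes: the random bytes score highest under classical entropy yet yield roughly zero on this proxy, while the repetitive text, despite its low entropy, is almost entirely extractable structure for a compute-limited observer.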

