One Loop to Optimize Them All: A Universal API for LLM-Driven Discovery
Source: optimize_anything: A Universal API for Optimizing any Text Parameter
Paper was published on May 19, 2026
This episode was AI-generated on May 22, 2026. The script was written by an AI language model and the host voices were synthesized by Eleven Labs. The producer is not affiliated with Anthropic or Eleven Labs.
Five separate LLM optimization frameworks have been racing to evolve code, prompts, and agents — and a new Berkeley paper argues they're all secretly the same algorithm. The unification claim comes with receipts: state-of-the-art circle packing for three dollars, ARC-AGI scores leaping from 32% to nearly 90%, and a clear theory of why richer feedback beats cleverer search.
Key Takeaways
- Why the authors argue side information (error traces, profiler dumps, failed-test diagnostics) is the LLM-era analog of a gradient, and the ablation showing 4-6x faster convergence when you use it
- How a 10-line seed agent evolved into a 300-line ARC-AGI pipeline that discovered rule induction, code verification, and fallback strategies on its own
- The 'refiner leapfrog' mechanism that let optimize_anything beat AlphaEvolve on circle packing at a third of the budget
- When multi-task optimization helps (CUDA kernels share structure) and when it actively hurts (different circle-packing sizes don't transfer)
- Why a meaningful share of the headline numbers comes from the frontier proposer model, and where the architectural contributions still clearly do real work
- The shift the paper implies: optimization expertise gets traded for evaluator-design expertise, and that's now the craft worth investing in
00:00 - Five frameworks, one underlying algorithm
Why AlphaEvolve, FunSearch, GEPA, ADAS, and OpenEvolve are arguably running the same loop in different costumes.
02:56 - The three-line engine
Walking through the declarative loop (artifact, evaluator, proposer) that the paper claims is sufficient across all six domains.
05:53 - Side information as a gradient analog
The cooking-feedback analogy, the stack-trace line, and the ablation showing rich diagnostics dramatically outperform scalar scores.
08:50 - The Pareto frontier and why specialists survive
The Olympic-team intuition for why ranking by average kills diversity, and how per-dimension champions get preserved instead.
11:47 - ARC-AGI: an agent that designs its own architecture
How a ten-line seed agent evolved into a four-stage pipeline with verification and fallback, lifting Gemini Flash from 32% to nearly 90%.
14:44 - Circle packing and the refiner leapfrog
Beating AlphaEvolve's published record for $3.18, and the two-artifact mechanism that explains why this kind of compounding is possible.
17:41 - When multi-task helps and when it hurts
Shared Pareto frontiers across related problems win on CUDA kernels but degrade performance on circle packing across different n.
20:38 - Three honest caveats
The frontier-proposer dependency, the AIME result that ties rather than beats GEPA, and why side-information design is itself a craft.
23:35 - What changes if the unification holds
Why the next research frontier may be evaluator design rather than yet another specialized optimization framework.
Recommended Reading
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
The authors' prior prompt-optimization system that optimize_anything generalizes, and the one it ties rather than beats on AIME, making it essential context for the unification thesis.
FunSearch: Mathematical discoveries from program search with large language models
One of the five systems the episode lines up as having its own bespoke framework, and an early demonstration of LLM-in-the-loop discovery on math problems like circle-packing-adjacent geometry.
On the Measure of Intelligence
Chollet's original framing of ARC-AGI as a reasoning benchmark, useful background for why the 32%-to-90% jump from an evolved 300-line agent is a meaningful result.
Illuminating search spaces by mapping elites (MAP-Elites)
The quality-diversity algorithm behind the Pareto-frontier-of-champions intuition the episode unpacks with the Olympics analogy.
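Appendix: a toy version of the loop
For listeners who want to see the shape of the loop discussed in the episode, here is a minimal sketch of an artifact, evaluator, proposer cycle that keeps a Pareto frontier instead of a single best candidate. It is not the optimize_anything API: every name in it (optimize, pareto_front, toy_evaluate, toy_propose, REQUIRED) is hypothetical, and the proposer is a few lines of string handling standing in for an LLM call. The only idea it borrows from the paper is that the evaluator hands back text diagnostics alongside scores, and the proposer reads them.

```python
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, float]
# (artifact text, per-dimension scores, side information from the evaluator)
Candidate = Tuple[str, Scores, str]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)


def pareto_front(pool: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in pool if not any(dominates(o[1], c[1]) for o in pool)]


def optimize(seed: str,
             evaluate: Callable[[str], Tuple[Scores, str]],
             propose: Callable[[str, str], str],
             steps: int) -> List[Candidate]:
    """Hypothetical loop: pick a parent, propose a child from its diagnostics, re-rank."""
    front: List[Candidate] = [(seed, *evaluate(seed))]
    for step in range(steps):
        parent, _, side_info = front[step % len(front)]  # cycle through the frontier
        child = propose(parent, side_info)
        if any(child == artifact for artifact, _, _ in front):
            continue  # nothing new was proposed
        front = pareto_front(front + [(child, *evaluate(child))])
    return front


# Toy problem (an assumption for illustration): the artifact must mention each word.
REQUIRED = ["sort", "dedupe"]  # each requirement is its own score dimension


def toy_evaluate(artifact: str) -> Tuple[Scores, str]:
    words = artifact.split()
    scores = {w: float(w in words) for w in REQUIRED}
    missing = [w for w in REQUIRED if w not in words]
    side_info = f"missing: {missing[0]}" if missing else "ok"
    return scores, side_info


def toy_propose(parent: str, side_info: str) -> str:
    # Stand-in for an LLM call: read the diagnostic and patch the artifact.
    prefix = "missing: "
    if side_info.startswith(prefix):
        return parent + " " + side_info[len(prefix):]
    return parent


front = optimize("plan", toy_evaluate, toy_propose, steps=5)
print(front[0][0])  # prints: plan sort dedupe
```

Because pareto_front only drops candidates that are beaten on every dimension, a candidate that wins on one dimension and loses on another stays in the pool, which is the "specialists survive" behavior from the 08:50 chapter.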