


On March 7th, 2026, Andrej Karpathy released autoresearch — a 630-line Python repo that lets an AI agent run autonomous ML experiments overnight. The agent modifies training code, runs 5-minute experiments, keeps improvements, discards failures, and repeats. ~100 cycles while you sleep. Karpathy watched val_bpb drop from 1.0 to 0.97 without touching anything. Shopify CEO Tobi Lutke adapted it the same night and got a 19% improvement — with the smaller agent-optimized model eventually outperforming a larger manually configured one. The engineering task has shifted: you're not tuning the model anymore, you're writing the instructions that tell the agent how to tune the model.
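The modify, run, keep-or-discard cycle described above is essentially greedy hill climbing on a validation metric. The sketch below is a minimal illustration of that idea under stated assumptions, not the code in the repo: `greedy_search`, `toy_propose`, and `toy_evaluate` are hypothetical names, the "experiment" is a made-up function of a learning rate rather than a real 5-minute training run, and lower scores are treated as better, as with val_bpb.

```python
import math
import random
from typing import Callable, Tuple


def greedy_search(
    baseline: dict,
    propose: Callable[[dict, random.Random], dict],
    evaluate: Callable[[dict], float],
    cycles: int = 100,
    seed: int = 0,
) -> Tuple[dict, float, int]:
    """Greedy keep/discard loop: a candidate replaces the current best
    only if it lowers the metric (lower is better)."""
    rng = random.Random(seed)
    best, best_score = baseline, evaluate(baseline)
    kept = 0
    for _ in range(cycles):
        candidate = propose(best, rng)   # stand-in for "agent edits the training code"
        score = evaluate(candidate)      # stand-in for "run one short experiment"
        if score < best_score:           # keep improvements
            best, best_score = candidate, score
            kept += 1
        # otherwise discard the candidate and try again from the current best
    return best, best_score, kept


def toy_propose(config: dict, rng: random.Random) -> dict:
    """Hypothetical proposal step: scale the learning rate up or down."""
    new = dict(config)  # copy so the current best is never mutated
    new["lr"] = config["lr"] * rng.choice([0.5, 0.8, 1.25, 2.0])
    return new


def toy_evaluate(config: dict) -> float:
    """Made-up metric with its minimum at lr = 0.01; not a real training run."""
    return abs(math.log10(config["lr"]) + 2.0)


def wall_clock_hours(cycles: int, minutes_per_run: float) -> float:
    """Total runtime if experiments run back to back."""
    return cycles * minutes_per_run / 60.0


if __name__ == "__main__":
    start = {"lr": 1.0}
    best, score, kept = greedy_search(start, toy_propose, toy_evaluate)
    print(f"kept {kept} of 100 candidates, final score {score:.4f}, config {best}")
    print(f"100 runs x 5 minutes = {wall_clock_hours(100, 5):.1f} hours")
```

The last line shows why about 100 cycles fits in a night: 100 runs at 5 minutes each is a little over 8 hours. In the real setup, the proposal step is the part governed by the written instructions, which is why those instructions become the thing a human tunes.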
karpathy/autoresearch
Three files:
The 630-line constraint is intentional: the entire codebase fits in a model's context window, so the agent can reason about the whole file at once.
Karpathy describes program.md as the "research org code" — you're not programming the model, you're programming the organization that does the research. The human's job shifts from running experiments to writing the instructions that tell the agent how to run experiments well.
His README opens with fictional lore: the agents are now in the 10,205th generation of the codebase, "a self-modifying binary that has grown beyond human comprehension." He's joking. The repo exists and you can clone it tonight.
By Daily Tech Feed