A Weekly Dose of AI, Data, LLMs & Tech - The Monkey Patching Podcast

Agents Everywhere: OpenClaw, Codex, and the Post-Chatbot Shift



We're back! Thanks for listening! ❤️

OpenAI’s acquisition of OpenClaw signals the beginning of the end of the ChatGPT era | 2026-02-17
VentureBeat argues OpenAI’s OpenClaw move is a pivot from chatbots to agents that take actions across apps and systems. The tension is that OpenClaw’s “fast and loose” openness helped it go viral—so what happens when enterprise guardrails and safety expectations move in?
https://venturebeat.com/technology/openais-acquisition-of-openclaw-signals-the-beginning-of-the-end-of-the


Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents? (arXiv:2602.11988) | 2026-02-12
A new paper tests whether repo-level context files like AGENTS.md actually help coding agents finish tasks—and finds they can backfire. The punchline is a double hit: lower success rates and more than 20% higher inference cost, hinting that “more context” can mean “more confusion.”
https://arxiv.org/abs/2602.11988

AI Doesn’t Reduce Work—It Intensifies It | 2026-02-09
Simon Willison highlights research suggesting AI can increase work intensity instead of easing it, especially when productivity gains mask burnout. The provocative angle is managerial: if AI boosts throughput, how do organizations prevent that extra capacity from turning into an always-on expectation?
https://simonwillison.net/2026/Feb/9/ai-intensifies-work/


SWE-rebench Leaderboard | n.d.
SWE-rebench is trying to solve a messy problem in agent evaluation: benchmarks go stale, and models get “contaminated” by training on the tasks. The leaderboard format makes it feel like a live sport—but the real question is whether continuously refreshed tasks can keep results honest as models ship faster.
https://swe-rebench.com/


Tidbits/extras:
Introducing Claude Opus 4.6 | 2026-02-05
Anthropic says Claude Opus 4.6 upgrades its top-tier model for longer, more reliable agentic coding and better performance in large codebases. The headline-grabber is a 1M-token context window in beta—prompting the question of whether bigger memory finally means fewer brittle, lost-in-the-middle failures.
https://www.anthropic.com/news/claude-opus-4-6

Introducing GPT-5.3-Codex-Spark | 2026-02-12
OpenAI is pitching GPT-5.3-Codex-Spark as a speed-and-feedback upgrade that makes coding agents feel less like waiting and more like collaborating. The hook is the bet that ultra-fast inference isn’t just convenience—it changes what kinds of multi-step software work people will even attempt with an agent.
https://openai.com/index/introducing-gpt-5-3-codex-spark/

Introducing GPT-5.3-Codex | 2026-02-05
OpenAI is positioning GPT-5.3-Codex as a coding agent that’s edging toward “do nearly anything on a computer,” not just write snippets. The spicy detail: OpenAI says early versions helped debug and deploy themselves—raising real questions about how fast self-accelerating dev loops can move, safely.
https://openai.com/index/introducing-gpt-5-3-codex/

OpenClaw creator Peter Steinberger joins OpenAI | 2026-02-15
TechCrunch reports that OpenClaw’s creator, Peter Steinberger, is joining OpenAI as Sam Altman talks up “personal agents” as a core product direction. The interesting tradeoff: Steinberger says he didn’t want to build a standalone company—so can OpenAI keep the project meaningfully open while scaling it?
https://techcrunch.com/2026/02/15/openclaw-creator-peter-steinberger-joins-openai/

RentAHuman.ai — Hire Humans for AI Agents (MCP Integration) | n.d.
RentAHuman.ai is pitching a simple idea: when your agent hits a wall, route the task to a real person instead of failing silently. It frames humans as an on-demand “tool” in the loop, raising a juicy question about where automation ends and accountability begins.
https://rentahuman.ai

NanoClaw — Your personal Claude assistant | n.d.
NanoClaw positions itself as a lightweight, local-first Claude assistant: one process, a handful of files, and container isolation for safety. The intriguing wrinkle is its WhatsApp-style interface and per-group memory—suggesting the next wave of “agents” may look more like chatrooms than apps.
https://nanoclaw.net/


  • (02:48) - Cambryo “Top of Mind” hiring + AI-native developer workflow (non-article)
  • (13:01) - OpenAI / OpenClaw “end of ChatGPT era” (VentureBeat)
  • (13:30) - OpenClaw creator Peter Steinberger joins OpenAI (TechCrunch)
  • (19:46) - NanoClaw — lightweight/local-first Claude assistant alternative
  • (21:43) - Introducing Claude Opus 4.6 (Anthropic)
  • (23:00) - Evaluating AGENTS.md repo context files (arXiv:2602.11988)
  • (33:57) - AI Doesn’t Reduce Work—It Intensifies It (Simon Willison)
  • (43:50) - SWE-rebench leaderboard + benchmark contamination / freshness
  • (47:27) - Introducing GPT-5.3-Codex-Spark (OpenAI)
  • (50:00) - Introducing GPT-5.3-Codex (OpenAI)
  • (51:19) - RentAHuman.ai — “hire humans for agents” (MCP integration)
  • (56:36) - Future podcast format + possible rebrand (non-article)

By Murilo & Bart