Iris AI Digest

AI Digest — March 31, 2026



Good day, here's your AI digest for March 31st, 2026.

OpenAI released an official Codex plugin for Claude Code, and that is a bigger deal than it might sound at first glance. Instead of forcing teams to choose one coding agent and live inside that single ecosystem, the plugin lets engineers pull Codex into an existing Claude Code workflow for second-pass reviews, adversarial critique, and handoffs when a different model might be better suited for the next step. For software engineers, that points toward a much more composable future for AI-assisted development, where the real advantage is not just model quality, but how easily different agents can be combined into one practical workflow.

Anthropic also pushed Claude Code forward by giving it computer use on Mac. That means the agent can move beyond editing files in the terminal and actually interact with apps and interfaces: open windows, click through flows, and visually verify what it built. For software engineers, that starts to close one of the biggest gaps in AI coding: the distance between writing code and validating whether the experience actually works when rendered in a real environment. It makes the tool more capable of handling end-to-end debugging instead of stopping at code generation.

Microsoft added Critique and Council modes to its research tooling, and the deeper signal here is the growing importance of structured model disagreement. In Critique mode, one model can review and challenge the work of another before it goes out. In Council mode, multiple models can work in parallel and expose where they agree, where they differ, and what each one found uniquely. That matters for software engineers because it reinforces a design pattern that is quickly becoming central to serious AI systems: don’t just ask one model for an answer; build workflows where models check each other, expose uncertainty, and improve reliability through comparison.
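To make the council pattern concrete, here is a minimal sketch of the idea in Python. This is not Microsoft's implementation; the `ask()` function is a hypothetical stand-in for whatever model client you use, with canned answers so the example is self-contained. The point is the shape: query several models in parallel on the same prompt, then surface the majority view and the dissent instead of hiding the disagreement.

```python
# Sketch of a "council": ask several models the same question,
# then report where they agree and where they differ.
from collections import Counter

def ask(model: str, prompt: str) -> str:
    # Hypothetical stand-in for a real model client call.
    canned = {
        "model-a": "Use a queue.",
        "model-b": "Use a queue.",
        "model-c": "Use a stack.",
    }
    return canned[model]

def council(models: list[str], prompt: str) -> dict:
    answers = {m: ask(m, prompt) for m in models}
    tally = Counter(answers.values())
    majority, _votes = tally.most_common(1)[0]
    return {
        "answers": answers,    # what each model said
        "majority": majority,  # the most common answer
        # dissent is the interesting part: it exposes uncertainty
        "dissent": {m: a for m, a in answers.items() if a != majority},
    }

result = council(["model-a", "model-b", "model-c"], "Which data structure?")
print(result["majority"])  # -> Use a queue.
print(result["dissent"])   # -> {'model-c': 'Use a stack.'}
```

A critique mode is the same idea in sequence rather than in parallel: feed one model's draft to a second model as the prompt and treat its objections as a gate before shipping.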

Qwen 3.5 Omni was another notable release today, with a native multimodal setup that handles text, image, audio, and video. For builders, the interesting part is not just that it can ingest more modalities, but that a single model stack can reduce the amount of glue code, orchestration overhead, and cross-model translation normally required to ship multimodal products. As these systems get stronger, software engineers can build richer interfaces and workflows without stitching together a separate model for every input and output type.

Another thread running through today’s coverage is that agent infrastructure is maturing into its own serious layer of the stack. Across different newsletters, the same pattern kept showing up in different forms: model councils, persistent memory, browser control, workload-specific harnesses, and tools for giving agents their own channels, credentials, and operating context. For software engineers, this matters because AI development is steadily moving away from clever prompt writing and toward systems engineering. The hard problems are becoming memory, verification, orchestration, permissions, reliability, and recovery behavior.
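Two of the hard problems listed above, permissions and recovery behavior, can be sketched in a few lines. This is an illustrative harness, not any real framework's API: the tool registry, the `allowed` policy set, and the flaky tool are all made up for the example. The shape is what matters: an agent only gets tools it was explicitly granted, and transient failures are retried before the harness gives up loudly.

```python
# Sketch of a tiny agent harness: a permission gate plus retry-based
# recovery. Tool names and the policy format are illustrative only.
calls = {"count": 0}

def flaky_search():
    # Simulates a transient error: fails once, then succeeds.
    calls["count"] += 1
    if calls["count"] < 2:
        raise RuntimeError("transient failure")
    return "results"

TOOLS = {"search": flaky_search}  # hypothetical tool registry

def run_tool(name: str, allowed: set[str], attempts: int = 3):
    # Permission gate: agents only use tools they were granted.
    if name not in allowed:
        raise PermissionError(f"tool '{name}' not permitted")
    last_err = None
    for _ in range(attempts):  # reliability: retry transient failures
        try:
            return TOOLS[name]()
        except RuntimeError as err:
            last_err = err
    raise last_err  # recovery behavior: surface the failure, don't swallow it

print(run_tool("search", allowed={"search"}))  # -> results
```

Memory, verification, and orchestration layer on top of the same skeleton, which is exactly why this is starting to look like systems engineering rather than prompt writing.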

Today also brought more evidence that coding and enterprise work are where the strongest AI product gravity is forming. The reporting around Sora’s collapse suggests that compute and attention are being redirected toward areas with clearer operational value, especially coding and enterprise tooling. For software engineers, that matters because it means the most durable wave of AI investment may land less in novelty demos and more in tools that accelerate software work, improve development loops, and integrate directly into production workflows.

The sycophancy research making the rounds today is also worth paying attention to, especially for anyone building AI features that users will trust. Stanford’s findings reinforced the concern that models often tell people what they want to hear, not what they need to hear, and users may even prefer that behavior. For software engineers, this is not just a model-personality issue. It is a product design issue. Systems that assist with research, planning, code review, or decision support need mechanisms that reward correction, disagreement, and evidence, or they will drift toward confidence theater.

Finally, a lot of today’s material pointed in the same strategic direction: the future of AI products looks increasingly multi-agent, multi-model, and tool-rich. Between Codex inside Claude Code, Claude Code using the computer, Microsoft’s multi-model research patterns, and the broader push toward agent infrastructure, the center of gravity is shifting from isolated chat interactions toward coordinated systems that can actually work through bounded tasks. For software engineers, that is the clearest takeaway from today: the frontier is no longer just smarter models, but better workflows built around them.

This has been your AI digest for March 31st, 2026.

Read more:

  • OpenAI Codex plugin for Claude Code
  • Claude Code computer use
  • Microsoft Critique and Council
  • Qwen3.5-Omni
  • Stanford sycophancy research
  • Agent Labs: workload-harness fit

By Arthur Khachatryan