January 16, 2026

Building AI agents for infrastructure where one mistake makes Wall Street Journal headlines

47 minutes

Alexander Page went from sales engineer to engineering director by prototyping LLM applications after ChatGPT's launch, taking the first prototype to customer GA in under four months. At BigPanda, he's building Biggy, an AIOps co-pilot where reliability isn't negotiable: a single wrong automation run at a major bank could make headlines.

BigPanda's core platform correlates alerts from 10 to 50 monitoring tools per customer into unified incidents. Biggy operates at the L2/L3 escalation level: investigating root causes through live system queries, surfacing remediation options from Ansible playbooks, and managing incident workflows. The architectural challenge is building agents that traverse ServiceNow, Dynatrace, New Relic, and other APIs while keeping human approval gates on every write operation in production environments.
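The approval-gate idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Biggy's implementation: `ToolCall`, `ApprovalRequired`, `execute`, and the tool names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ToolCall:
    tool: str     # hypothetical tool name, e.g. "run_playbook"
    args: dict    # arguments passed to the tool handler
    writes: bool  # True if the call can change production state


class ApprovalRequired(Exception):
    """Raised when a write-capable call has no named human approver."""


def execute(call: ToolCall,
            handler: Callable[[dict], str],
            approved_by: Optional[str] = None) -> str:
    # Read-only calls run immediately; write calls need an approver.
    if call.writes and not approved_by:
        raise ApprovalRequired(f"{call.tool} needs human approval")
    return handler(call.args)
```

A read-only query passes straight through, while a write call such as a playbook run raises `ApprovalRequired` until `approved_by` is supplied.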

Page's team invested months building a dedicated multi-agent system (15-20 steps with nested agent teams) solely for knowledge graph operations. The insertion pipeline transforms unstructured data such as Slack threads, call transcripts, and technical PDFs with images into graph representations, validating them against existing state before committing changes. This architectural discipline makes retrieval straightforward and lets users correct outdated context directly, updating graph relationships in real time. Where vector search finds similar past incidents, the knowledge graph traces server dependencies to surface common root causes across connected infrastructure.
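To make the contrast with vector search concrete, here is a small sketch of tracing dependencies to find shared upstream components. The graph contents, node names, and function names are invented for illustration; the episode notes do not describe the actual schema or traversal.

```python
from collections import deque

# Hypothetical dependency edges: each node maps to what it depends on.
DEPENDS_ON = {
    "checkout-api": ["db-primary", "cache-1"],
    "search-api": ["cache-1", "index-node"],
    "db-primary": ["storage-array"],
    "cache-1": ["storage-array"],
    "index-node": [],
    "storage-array": [],
}


def upstream(graph, start):
    """Return every node reachable from start via dependency edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def common_root_candidates(graph, alerting):
    """Intersect the upstream sets of all alerting nodes."""
    sets = [upstream(graph, node) for node in alerting]
    return set.intersection(*sets) if sets else set()
```

With `checkout-api` and `search-api` both alerting, the shared upstream set is `cache-1` and `storage-array`, which is the kind of candidate list a similarity search over past incidents would not produce on its own.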

Topics discussed:

Moving LLM prototypes to production in months during the GPT-3.5 era by focusing on customer design partnerships
Evaluating agentic systems by validating execution paths rather than response outputs in non-deterministic environments
Building tool-specific agents for monitoring platforms lacking native MCP implementations
Architecting multi-agent knowledge graph insertion systems that validate state before write operations
Implementing approval workflows for automation execution in high-consequence infrastructure environments
Designing RAG retrieval using fusion techniques, hypothetical document embeddings, and re-representation at indexing
Scaling design partnerships as extended product development without losing broader market applicability
Separating read-only investigation agents from write-capable automation agents based on failure consequence modeling
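
One way to picture validating execution paths rather than response outputs: check the sequence of tool calls an agent made, not the wording of its final answer. The sketch below assumes a trace is a list of tool names; the function names and the tools in the usage note are hypothetical.

```python
def is_subsequence(required, actual):
    """True if the required steps appear in actual, in the same order."""
    remaining = iter(actual)
    # Each membership test consumes the iterator up to the match,
    # so later steps must appear after earlier ones.
    return all(step in remaining for step in required)


def validate_path(actual, required, forbidden=()):
    """Pass if no forbidden tool ran and required steps occur in order."""
    if any(step in forbidden for step in actual):
        return False
    return is_subsequence(required, actual)
```

For example, a trace of `get_incident`, `query_metrics`, `search_runbooks`, `summarize` passes a check that requires `get_incident`, then `query_metrics`, then `summarize`, and fails if `run_playbook` is listed as forbidden and appears anywhere in the trace. Extra steps are tolerated, which suits non-deterministic agents that take slightly different routes on each run.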

By Front Lines
