Just Now Possible

When AI Becomes Your SRE: How Incident.io Is Automating Incident Response


Guests

  • Lawrence Jones, Founding Engineer at Incident.io
  • Ed Dean, Product Lead for AI at Incident.io

Key Takeaways

  • AI’s biggest impact comes from compressing time: identifying causes in minutes instead of hours.
  • Retrieval-augmented reasoning still benefits from simplicity: deterministic tagging and re-ranking often beat complex vector setups.
  • Post-incident “time travel” evals let teams score AI accuracy after they know what really happened.
  • Building trust in AI isn’t just about precision; it’s about showing reasoning and uncertainty in ways humans understand.

Mentioned Tools & Concepts

  • Slack as the interface for human-AI collaboration
  • pgvector and Postgres for retrieval experiments
  • RAG (Retrieval-Augmented Generation)
  • Multi-agent orchestration
  • “AI as your company’s immune system”

Chapters

  00:00 Meet the Founders: Lawrence and Ed
  00:41 Introduction to Incident.io
  01:25 Evolution of Incident.io Products
  02:14 Understanding SRE and Its Importance
  04:01 Real-World Incident Management
  05:51 The Role of AI in Incident Management
  10:12 Challenges and Innovations in AI SRE
  12:14 Prototyping and Iterating AI Solutions
  16:25 Refining Retrieval Strategies
  21:52 Balancing AI and Human Interaction
  32:06 User Experience and Trust in AI Systems
  36:08 Interactive Slack Integration
  37:08 Understanding the AI Investigation Process
  37:50 Parallel Checks and Data Sources
  38:35 Building Hypotheses and Refining Findings
  40:09 Human-Agent Collaboration
  49:23 Evaluating AI Effectiveness
  01:04:13 Future Developments and Integrations

Just Now Possible by Teresa Torres