
Nicolay here,
Today I have the chance to talk to Charity Majors, CEO and co-founder of Honeycomb, who has recently been writing about the cost crisis in observability.
"Your source of truth is production, not your IDE - and if you can't understand your code there, you're flying blind."
The key insight is architecturally simple but operationally transformative: replace your 10-20 observability tools with wide structured events that capture everything about a request in one place. Most teams store the same request data across metrics, logs, traces, APM, and error tracking - creating a 20X cost multiplier while making debugging nearly impossible because you're reconstructing stories from fragments.
Charity's approach flips this: instrument once with rich context, derive everything else from that single source. This isn't just about cost - it's about giving engineers the connective tissue to understand distributed systems. When you can correlate "all requests failing from Android version X in region Y using language pack Z," you find problems in minutes instead of days.
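To make the "instrument once, derive everything else" idea concrete, here is a minimal sketch in plain Python. It is not Honeycomb's API or any vendor SDK: the names `handle_request`, `failures_by`, and `flaky_handler`, and the field names and sample values, are all hypothetical and chosen only for illustration. Each request produces one wide event, and the failure breakdown is computed from those events afterwards rather than from a separately stored metric.

```python
import json
import time
from collections import Counter


def handle_request(request, handler):
    """Run `handler` and return one wide event describing the whole request."""
    event = dict(request)  # start from the request's own context
    start = time.perf_counter()
    try:
        handler(request, event)  # the handler may attach more fields
        event["status"] = 200
    except Exception as exc:
        event["status"] = 500
        event["error"] = type(exc).__name__
    event["duration_ms"] = (time.perf_counter() - start) * 1000
    return event


def failures_by(events, *fields):
    """Derive a failure count grouped by arbitrary fields from raw events."""
    counts = Counter()
    for e in events:
        if e["status"] >= 500:
            counts[tuple(e.get(f) for f in fields)] += 1
    return counts


def flaky_handler(request, event):
    # Hypothetical business logic: one language pack breaks on one OS version.
    event["cart_items"] = 3
    if request["os_version"] == "android-14" and request["language_pack"] == "de":
        raise ValueError("missing translation")


# Hypothetical sample traffic.
requests = [
    {"os_version": "android-14", "region": "eu-west", "language_pack": "de"},
    {"os_version": "android-14", "region": "eu-west", "language_pack": "en"},
    {"os_version": "ios-17", "region": "us-east", "language_pack": "de"},
    {"os_version": "android-14", "region": "eu-west", "language_pack": "de"},
]
events = [handle_request(r, flaky_handler) for r in requests]

for e in events:
    print(json.dumps(e))  # one structured line per request

# The "metric" is computed from the events, not stored separately.
print(failures_by(events, "os_version", "region", "language_pack"))
```

Because every field lives on the same event, grouping by a different combination (say, region alone) is just another call to `failures_by` and needs no new instrumentation.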
The second insight is putting developers on call for their own code. This creates the tight feedback loop that makes engineers write more reliable software - because nobody wants to get paged at 3am for their own bugs.
In the podcast, we also touch on:
Core Concepts
Connect with Charity:
Connect with Nicolay:
Important Moments
Tools & Tech Mentioned
Recommended Resources