The InfoQ Podcast

Ben Sigelman, Co-Creator of Dapper & OpenTracing API, on Observability


Listen Later

Ben Sigelman is the CEO of Lightstep and the author of the Dapper paper that spawned distributed tracing discussions in the software industry. On the podcast today, Ben discusses with Wes observability, and his thoughts on logging, metrics, and tracing. The two discuss detection and refinement as the real problem when it comes to diagnosing and troubleshooting incidents with data. The podcast is full of useful tips on building and implementing an effective observability strategy.
Why listen to this podcast:
- If you’re getting woke up for an alert, it should actually be an emergency. When that happens, things to think about include: when did this happen, how quickly is it changing, how did it change, and what things in my entire system are correlated with that change.
- A reality that seems to be happening in our industry is that we’re coupling the move to microservices with a move to allowing teams to fully self-determine technology stacks. This is dangerous because we’re not at the stage where all languages/tools/frameworks are equivalent.
- While a service mesh offers a great potential for integrations at layer 7 many people have unrealistic expectations on how much observability will be enabled by a service mesh. The service mesh does a great job of showing you the communication between the services, but often the details get lost in the work that’s being done inside the service. Service owners need to still do much more work to instrument applications.
- Too many people focus on the 3 Pillars of Observability. While logs, metrics, and tracing are important, observability strategy ought to be more focused on the core workflows and needs around detection and refinement.
- Logging about individual transactions is better done with tracing. It’s unaffordable at scale to do otherwise.
- Just like logging, metrics about individual transactions are less valuable. Application level metrics such as how long a queue is are metrics that are truly useful.
- The problem with metrics are the only tools you have in a metrics system to explain the variations that you’re seeing is grouping by tags. The tags you want to group by have high cardinality, so you can’t group them. You end up in a catch 22.
- Tracing is about taking traces and doing something useful with them. If you look at hundreds or thousands of tracing, you can answer really important questions about what’s changing in terms of workloads and dependencies about a system with evidence.
- When it comes to serverless, tracing is more important than ever because everything is so ephemeral. Node is one of the most popular serverless languages/frameworks and, unfortunately, also one of the hardest of all to trace.
- The most important thing is to make sure that you choose something portable for the actual instrumentation piece of a distributed tracing system. You don’t want to go back and rip out the instrumentation because you want to switch vendors. This is becoming conventional wisdom.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2PPIdeE
You can also subscribe to the InfoQ newsletter to receive weekly updates on the hottest topics from professional software development. bit.ly/24x3IVq
Subscribe: www.youtube.com/infoq
Like InfoQ on Facebook: bit.ly/2jmlyG8
Follow on Twitter: twitter.com/InfoQ
Follow on LinkedIn: www.linkedin.com/company/infoq
Check the landing page on InfoQ: https://bit.ly/2PPIdeE
...more
View all episodesView all episodes
Download on the App Store

The InfoQ PodcastBy InfoQ

  • 4.8
  • 4.8
  • 4.8
  • 4.8
  • 4.8

4.8

37 ratings


More shows like The InfoQ Podcast

View all
Hanselminutes with Scott Hanselman by Scott Hanselman

Hanselminutes with Scott Hanselman

377 Listeners

Software Engineering Radio - the podcast for professional software developers by se-radio@computer.org

Software Engineering Radio - the podcast for professional software developers

272 Listeners

.NET Rocks! by Carl Franklin and Richard Campbell

.NET Rocks!

246 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

282 Listeners

The Cloudcast by Massive Studios

The Cloudcast

152 Listeners

Thoughtworks Technology Podcast by Thoughtworks

Thoughtworks Technology Podcast

42 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

627 Listeners

Soft Skills Engineering by Jamison Dance and Dave Smith

Soft Skills Engineering

270 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

203 Listeners

Engineering Culture by InfoQ by InfoQ

Engineering Culture by InfoQ

12 Listeners

CoRecursive: Coding Stories by Adam Gordon Bell - Software Developer

CoRecursive: Coding Stories

189 Listeners

Kubernetes Podcast from Google by Abdel Sghiouar, Kaslin Fields

Kubernetes Podcast from Google

181 Listeners

Practical AI by Practical AI LLC

Practical AI

189 Listeners

The Stack Overflow Podcast by The Stack Overflow Podcast

The Stack Overflow Podcast

64 Listeners

The Pragmatic Engineer by Gergely Orosz

The Pragmatic Engineer

52 Listeners