The Tech Trek

Data Orchestration and Open Source Strategy



Pete Hunt, CEO of Dagster Labs, joins Amir Bormand to break down why modern data teams are moving past task-based orchestration, and what it really takes to run reliable pipelines at scale. If you have ever wrestled with Apache Airflow, multi-team deployments, or unclear data lineage, this conversation will give you a clearer mental model and a practical way to think about the next generation of data infrastructure.


Key Takeaways

• Data orchestration is not just scheduling; it is the control layer that keeps data assets reliable, observable, and usable

• Asset-based thinking makes debugging easier because the system maps code directly to the data artifacts your business depends on

• Multi-team data platforms need isolation by default; without it, shared dependencies and shared failures become a tax on every team

• Good software engineering practices reduce data chaos, and the tools can get simpler over time as best practices harden

• Open source makes sense for core infrastructure, with commercial layers reserved for features larger teams actually need
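
To make the asset-based idea concrete, here is a minimal sketch in plain Python. It is not Dagster's API: the `asset` decorator, the `materialize` and `upstream_of` helpers, and the asset names are all hypothetical, chosen only to show how declaring data assets and their upstream dependencies gives you lineage for free.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: asset name -> (upstream asset names, producing function).
ASSETS = {}

def asset(*deps):
    """Register a function as the producer of the data asset named after it."""
    def register(fn):
        ASSETS[fn.__name__] = (deps, fn)
        return fn
    return register

@asset()
def raw_orders():
    # Stand-in for a real extract step; the rows are made-up sample data.
    return [{"id": 1, "amount": 40}, {"id": 2, "amount": 60}]

@asset("raw_orders")
def order_totals(raw_orders):
    return sum(row["amount"] for row in raw_orders)

@asset("order_totals")
def revenue_report(order_totals):
    return f"revenue: {order_totals}"

def materialize(target):
    """Build every registered asset in dependency order; return the target's value."""
    graph = {name: set(deps) for name, (deps, _) in ASSETS.items()}
    values = {}
    for name in TopologicalSorter(graph).static_order():
        deps, fn = ASSETS[name]
        values[name] = fn(*(values[d] for d in deps))
    return values[target]

def upstream_of(name):
    """Lineage: every asset the named asset depends on, directly or indirectly."""
    deps, _ = ASSETS[name]
    found = set(deps)
    for dep in deps:
        found |= upstream_of(dep)
    return found
```

If `revenue_report` looks wrong, `upstream_of("revenue_report")` names the assets to inspect, which is the debugging benefit the takeaway describes. A task-based scheduler would only know that some step ran, not which data it produced.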


Timestamped Highlights

00:00:50 What Dagster is, and why orchestration matters for every data-driven team

00:04:18 The origin story, why critical institutions still cannot answer basic questions about their data

00:07:02 The architectural shift, moving from task-based workflows to asset-based pipelines

00:08:25 The multi-tenancy problem, why shared environments break down across teams, and what to do instead

00:11:21 The path out of complexity, why software engineering best practices are the unlock for data teams

00:17:53 Open source as a strategy, what belongs in the open core, and what belongs in the paid layer


A Line Worth Repeating

Data orchestration is infrastructure, and most teams want their core infrastructure to be open source.


Pro Tips for Data and Platform Teams

• If debugging feels impossible, you may be modeling your system around tasks instead of the data assets the business actually consumes

• If multiple teams share one codebase, isolate dependencies and runtime early; shared Python environments become a silent reliability risk

• Reduce cognitive load by tightening concepts; fewer new nouns usually means a smoother developer experience


Call to Action

If this episode helped you rethink data orchestration, follow the show on Apple Podcasts and Spotify, and subscribe so you do not miss future conversations on data, AI, and the infrastructure choices that shape real outcomes.


The Tech Trek, by Elevano
