The Python Podcast.__init__

Go From Notebook To Pipeline For Your Data Science Projects With Orchest



Summary

Jupyter notebooks are a dominant tool for data scientists, but they lack a number of conveniences for building reusable and maintainable systems. Machine learning projects in particular need a way to pivot from exploring a dataset or problem to integrating the solution into a larger workflow. Rick Lamers and Yannick Perrenet were tired of struggling with one-off solutions when they created the Orchest platform. In this episode they explain how Orchest allows you to turn your notebooks into executable components that are integrated into a graph of execution for running end-to-end machine learning workflows.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Rick Lamers and Yannick Perrenet about Orchest, a development environment designed for building data science pipelines from notebooks and scripts.

Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving an overview of what Orchest is and the story behind it?
  • Who are the users that you are building Orchest for and what are their biggest challenges?
    • What are some examples of the types of tools or workflows that they are using now?
  • What are some of the other tools or strategies in the data science ecosystem that Orchest might replace? (e.g. MLFlow, Metaflow, etc.)
  • What problems does Orchest solve?
  • Can you describe how Orchest is implemented?
    • How have the design and goals of the project changed since you first started working on it?
  • What is the workflow for someone who is using Orchest?
  • What are some of the sharp edges that they might run into?
  • What is the deployable unit once a pipeline has been created?
    • How do you handle verification and promotion of pipelines across staging and production environments?
  • What are the interfaces available for integrating with or extending Orchest?
    • How might an organization incorporate a pipeline defined in Orchest with the rest of their data orchestration workflows?
  • How are you approaching governance and sustainability of the Orchest project?
  • What are the most interesting, innovative, or unexpected ways that you have seen Orchest used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while building Orchest?
  • When is Orchest the wrong choice?
  • What do you have planned for the future of the project and company?

Keep In Touch
  • Rick
    • ricklamers on GitHub
    • LinkedIn
    • @RickLamers on Twitter
  • Yannick
    • yannickperrenet on GitHub
    • LinkedIn

Picks
  • Tobias
    • Fresh Bagels
  • Rick
    • Vaex
  • Yannick
    • Cookiecutter
    • Pyenv

Links
  • Orchest
  • Geoffrey Hinton
  • Yann LeCun
  • CoffeeScript
  • Vim
  • GAN == Generative Adversarial Network
  • Git
  • SQL
  • BigQuery
  • Software Carpentry
    • Podcast Episode
  • Google Colab
  • Airflow
    • Podcast Episode
  • Kedro
    • Data Engineering Podcast Episode
  • nbdev
    • Podcast Episode
  • Papermill
    • Data Engineering Podcast Episode
  • MLFlow
  • Metaflow
    • Podcast Episode
  • DVC
    • Podcast Episode
  • Andrew Ng
  • Kubeflow
  • Lua
  • Caddy
  • Traefik
  • DAG == Directed Acyclic Graph
  • Jupyter Enterprise Gateway
  • Streamlit
  • Kubernetes
  • Dagster
    • Podcast.__init__ Episode
    • Data Engineering Podcast Episode
  • DBT
    • Data Engineering Podcast Episode
  • GitLab
  • Spark
  • ETL

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA


The Python Podcast.__init__, by Tobias Macey
