Data Engineering Podcast

How Data Engineering Teams Power Machine Learning With Feature Platforms



Summary

Feature engineering is a crucial aspect of the machine learning workflow, and it depends on a number of technical and procedural capabilities being in place first. In this episode, Razi Raziuddin shares how data engineering teams can support that workflow by developing and maintaining systems that empower data scientists and ML engineers to build and manage their own features.

Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack
  • Your host is Tobias Macey and today I'm interviewing Razi Raziuddin about how data engineers can empower data scientists to develop and deploy better ML models through feature engineering

Interview
  • Introduction
  • How did you get involved in the area of data management?
  • What is feature engineering, and why and to whom does it matter?
    • A topic that commonly comes up in relation to feature engineering is the importance of a feature store. What are the tradeoffs for that to be a separate infrastructure/architecture component?
  • What is the overall lifecycle of a feature, from definition to deployment and maintenance?
    • How is this distinct from other forms of data pipeline development and delivery?
    • Who are the participants in that workflow?
    • What are the sharp edges/roadblocks that typically manifest in that lifecycle?
  • What are the interfaces that are needed for data scientists/ML engineers to be able to self-serve their feature management?
    • What is the role of the data engineer in supporting those interfaces?
    • What are the communication/collaboration channels that are necessary to make the overall process a success?
  • From an implementation/architecture perspective, what are the patterns that you have seen teams build around for feature development/serving?
  • What are the most interesting, innovative, or unexpected ways that you have seen feature platforms used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on feature engineering?
  • What are the resources that you find most helpful in understanding and designing feature platforms?

Contact Info
  • LinkedIn

Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
  • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show, then tell us about it! Email [email protected] with your story.
  • To help other people find the show, please leave a review on Apple Podcasts and tell your friends and co-workers.

Links
  • FeatureByte
  • DataRobot
  • Feature Store
  • Feast Feature Store
  • Feathr
  • Kaggle
  • Yann LeCun

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

