AI Engineering Podcast

Accelerate Development And Delivery Of Your Machine Learning Projects With A Comprehensive Feature Platform



Summary
For a machine learning model to build connections and context across the data that is fed into it, the raw data needs to be engineered into semantic features. This process can be tedious and full of toil, requiring constant upkeep and often leading to rework across projects and teams. To reduce the amount of wasted effort and speed up experimentation and training iterations, a new generation of services is being developed. Tecton first built a feature store to serve as a central repository of engineered features and keep them up to date for training and inference. Since then, they have expanded the set of tools and services into a full-fledged feature platform. In this episode, Kevin Stumpf explains the different capabilities and activities related to features that are necessary to maintain velocity in your machine learning projects.
Announcements
  • Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
  • Building good ML models is hard, but testing them properly is even harder. At Deepchecks, they built an open-source testing framework that follows best practices, ensuring that your models behave as expected. Get started quickly using their built-in library of checks for testing and validating your model’s behavior and performance, and extend it to meet your specific needs as your model evolves. Accelerate your machine learning projects by building trust in your models and automating the testing that you used to do manually. Go to themachinelearningpodcast.com/deepchecks today to get started!
  • Do you wish you could use artificial intelligence to drive your business the way Big Tech does, but don’t have a money printer? Graft is a cloud-native platform that aims to make the AI of the 1% accessible to the 99%. Wield the most advanced techniques for unlocking the value of data, including text, images, video, audio, and graphs. No machine learning skills required, no team to hire, and no infrastructure to build or maintain. For more information on Graft or to schedule a demo, visit themachinelearningpodcast.com/graft today and tell them Tobias sent you.
  • Data powers machine learning, but poor data quality is the largest impediment to effective ML today. Galileo is a collaborative data bench for data scientists building Natural Language Processing (NLP) models to programmatically inspect, fix, and track their data across the ML workflow (pre-training, post-training, and post-production) – no more Excel sheets or ad hoc Python scripts. Get meaningful gains in your model performance fast and dramatically reduce data labeling and procurement costs, while seeing 10x faster ML iterations. Galileo is offering listeners a free 30-day trial and a 30% discount on the product thereafter. This offer is available until Aug 31, so go to themachinelearningpodcast.com/galileo and request a demo today!
  • Your host is Tobias Macey and today I’m interviewing Kevin Stumpf about the role of feature platforms in your ML engineering workflow
Interview
  • Introduction
  • How did you get involved in machine learning?
  • Can you describe what you mean by the term "feature platform"? 
    • What are the components and supporting capabilities that are needed for such a platform?
  • How does the availability of engineered features impact the ability of an organization to put ML into production?
  • What are the points of friction that teams encounter when trying to build and maintain ML projects in the absence of a fully integrated feature platform?
  • Who are the target personas for the Tecton platform? 
    • What stages of the ML lifecycle does it address?
  • Can you describe how you have designed the Tecton feature platform? 
    • How have the goals and capabilities of the product evolved since you started working on it?
  • What is the workflow for an ML engineer or data scientist to build and maintain features and use them in the model development workflow?
  • What are the responsibilities of the MLOps stack that you have intentionally decided not to address? 
    • What are the interfaces and extension points that you offer for integrating with the other utilities needed to manage a full ML system?
  • You wrote a post about the need to establish a DevOps approach to ML data. In keeping with that theme, can you describe how to think about the approach to testing and validation techniques for features and their outputs?
  • What are the most interesting, innovative, or unexpected ways that you have seen Tecton/Feast used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Tecton?
  • When is Tecton the wrong choice?
  • What do you have planned for the future of the Tecton feature platform?
Contact Info
  • LinkedIn
  • @kevinmstumpf on Twitter
  • kevinstumpf on GitHub
Parting Question
  • From your perspective, what is the biggest barrier to adoption of machine learning today?
Links
  • Tecton
    • Data Engineering Podcast Episode
  • Uber Michelangelo
  • Feature Store
  • Snowflake
    • Data Engineering Podcast Episode
  • DynamoDB
  • Train/Serve Skew
  • Lambda Architecture
  • Redis
The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/[CC BY-SA 3.0](https://creativecommons.org/licenses/by-sa/3.0/)

AI Engineering Podcast by Tobias Macey


