GOTO - The Brightest Minds in Tech

Scaling Machine Learning with Spark • Adi Polak & Holden Karau



This interview was recorded for the GOTO Book Club.
gotopia.tech/bookclub

Read the full transcription of the interview here

Adi Polak - VP of Developer Experience at Treeverse & Contributing to lakeFS OSS
Holden Karau - Co-Author of "Kubeflow for Machine Learning" & many more books & Open Source Engineer at Netflix

DESCRIPTION
Learn how to build end-to-end scalable machine learning solutions with Apache Spark. With this practical guide, author Adi Polak introduces data and ML practitioners to creative solutions that supersede today's traditional methods. You'll learn a more holistic approach that takes you beyond specific requirements and organizational goals, allowing data and ML practitioners to collaborate and understand each other better.

Scaling Machine Learning with Spark examines several technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, and PyTorch. If you're a data scientist who works with machine learning, this book shows you when and why to use each technology.

You will:
• Explore machine learning, including distributed computing concepts and terminology
• Manage the ML lifecycle with MLflow
• Ingest data and perform basic preprocessing with Spark
• Explore feature engineering, and use Spark to extract features
• Train a model with MLlib and build a pipeline to reproduce it
• Build a data system to combine the power of Spark with deep learning
• Get a step-by-step example of working with distributed TensorFlow
• Use PyTorch to scale machine learning and its internal architecture

* Book description: © O’Reilly

The interview is based on the book "Scaling Machine Learning with Spark"

RECOMMENDED BOOKS
Adi Polak • Scaling Machine Learning with Spark
Holden Karau, Trevor Grant, Boris Lublinsky, Richard Liu & Ilan Filonenko • Kubeflow for Machine Learning
Holden Karau • Distributed Computing 4 Kids
Holden Karau • Scaling Python with Dask
Holden Karau & Boris Lublinsky • Scaling Python with Ray
Holden Karau & Rachel Warren • High Performance Spark
Holden Karau, Andy Konwinski, Patrick Wendell & Matei Zaharia • Learning Spark
Holden Karau & Krishna Sankar • Fast Data Processing with Spark 2nd Edition


CHANNEL MEMBERSHIP BONUS
Join this channel to get early access to videos & other perks:
https://www.youtube.com/channel/UCs_tLP3AiwYKwdUHpltJPuA/join

Looking for a unique learning experience?
Attend the next GOTO conference near you! Get your ticket: gotopia.tech

SUBSCRIBE TO OUR YOUTUBE CHANNEL - new videos posted daily!


By GOTO

Rating: 4.6 (5 ratings)

