DataTalks.Club

Build Your Own Data Pipeline - Andreas Kretz


We talked about:

  • Andreas’s background
  • Why data engineering is becoming more popular
  • Who to hire first – a data engineer or a data scientist?
  • How can I, as a data scientist, learn to build pipelines?
  • Don’t use too many tools
  • What is a data pipeline and why do we need it?
  • What is ingestion?
  • Can just one person build a data pipeline?
  • Approaches to building data pipelines for data scientists
  • Processing frameworks
  • Common setup for data pipelines — car price prediction
  • Productionizing the model with the help of a data pipeline
  • Scheduling
  • Orchestration
  • Start simple
  • Learning DevOps to implement data pipelines
  • How to choose the right tool
  • Are Hadoop, Docker, Cloud necessary for a first job/internship?
  • Is Hadoop still relevant or necessary?
  • Data engineering academy
  • How to pick up Cloud skills
  • Avoid huge datasets when learning
  • Convincing your employer to do data science
  • How to find Andreas

  • Links:

    • LinkedIn: https://www.linkedin.com/in/andreas-kretz
    • Data engineering cookbook: https://cookbook.learndataengineering.com/
    • Course: https://learndataengineering.com/

    • Join DataTalks.Club: https://datatalks.club/slack.html

    • Our events: https://datatalks.club/events.html
