DataTalks.Club

Data Intensive AI - Bartosz Mikulski



In this podcast episode, we talk with Bartosz Mikulski about Data Intensive AI.


About the Speaker:

Bartosz is an AI and data engineer. He specializes in moving AI projects from the good-enough-for-a-demo phase to production by building testing infrastructure and fixing the issues the tests detect. On top of that, he teaches programmers and non-programmers how to use AI. He contributed a chapter to the book 97 Things Every Data Engineer Should Know and has spoken at several conferences, including Data Natives, Berlin Buzzwords, and Global AI Developer Days.


In this episode, we discuss Bartosz’s career journey, the importance of testing in data pipelines, and how AI tools like ChatGPT and Cursor are transforming development workflows. From prompt engineering to building Chrome extensions with AI, we dive into practical use cases, tools, and insights for anyone working in data-intensive AI projects. Whether you’re a data engineer, AI enthusiast, or just curious about the future of AI in tech, this episode offers valuable takeaways and real-world experiences.


0:00 Introduction to Bartosz and his background

4:00 Bartosz’s career journey from Java development to AI engineering

9:05 The importance of testing in data engineering

11:19 How to create tests for data pipelines

13:14 Tools and approaches for testing data pipelines

17:10 Choosing Spark for data engineering projects

19:05 The connection between data engineering and AI tools

21:39 Use cases of AI in data engineering and MLOps

25:13 Prompt engineering techniques and best practices

31:45 Prompt compression and caching in AI models

33:35 Thoughts on DeepSeek and open-source AI models

35:54 Using AI for lead classification and LinkedIn automation

41:04 Building Chrome extensions with AI integration

43:51 Comparing Cursor and GitHub Copilot for coding

47:11 Using ChatGPT and Perplexity for AI-assisted tasks

52:09 Hosting static websites and using AI for development

54:27 How blogging helps attract clients and share knowledge

58:15 Using AI to assist with writing and content creation


🔗 CONNECT WITH Bartosz

LinkedIn: https://www.linkedin.com/in/mikulskibartosz/

GitHub: https://github.com/mikulskibartosz

Website: https://mikulskibartosz.name/blog/


🔗 CONNECT WITH DataTalksClub

Join the community - https://datatalks.club/slack.html

Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ

Check other upcoming events - https://lu.ma/dtc-events

LinkedIn - https://www.linkedin.com/company/datatalks-club/

Twitter - https://twitter.com/DataTalksClub

Website - https://datatalks.club/


