AI Stories
By Neil Leiser
The podcast currently has 52 episodes available.
Our guest today is Loubna Ben Allal, Machine Learning Engineer at Hugging Face 🤗.
In our conversation, Loubna first explains how she built two impressive code generation models: StarCoder and StarCoder2. We dig into the importance of data when training large models and what can be done on the data side to improve LLM performance.
We then dive into synthetic data generation and discuss its pros and cons. Loubna explains how she built Cosmopedia, a fully synthetic dataset generated using Mixtral 8x7B.
Loubna also shares career mistakes, advice and her take on the future of developers and code generation.
If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.
Cosmopedia Dataset: https://huggingface.co/blog/cosmopedia
StarCoder blog post: https://huggingface.co/blog/starcoder
Follow Loubna on LinkedIn: https://www.linkedin.com/in/loubna-ben-allal-238690152/
Follow Neil on LinkedIn: https://www.linkedin.com/in/leiserneil/
---
(00:00) - Intro
(02:00) - How Loubna Got Into Data & AI
(03:57) - Internship at Hugging Face
(06:21) - Building A Code Generation Model: StarCoder
(12:14) - Data Filtering Techniques for LLMs
(18:44) - Training StarCoder
(21:35) - Will GenAI Replace Developers?
(25:44) - Synthetic Data Generation & Building Cosmopedia
(35:44) - Evaluating a 1B-Parameter Model Trained on Synthetic Data
(43:43) - Challenges faced & Career Advice
Our guest today is Petar Veličković, Staff Research Scientist at Google DeepMind and Affiliated Lecturer at the University of Cambridge.
In our conversation, we first dive into how Petar got into Graph ML and discuss his most cited paper: Graph Attention Networks. We then dig into DeepMind, where Petar shares tips and advice on how to get into this competitive company and explains the difference between research scientist and research engineer roles.
We finally talk about applied work Petar has done, including building Google Maps' ETA algorithm and TacticAI, an AI football coach assistant that helps Liverpool FC improve corner kicks.
If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.
Graph Attention Networks Paper: https://arxiv.org/abs/1710.10903
ETA Prediction with Graph Neural Networks in Google Maps: https://arxiv.org/abs/2108.11482
TacticAI: an AI assistant for football tactics (with Liverpool FC): https://arxiv.org/abs/2402.01306
Follow Petar on LinkedIn: https://www.linkedin.com/in/petarvelickovic/
Follow Neil on LinkedIn: https://www.linkedin.com/in/leiserneil/
---
(00:00) - Intro
(02:44) - How Petar got into AI
(06:14) - GraphML and Geometric Deep Learning
(10:10) - Graph Attention Networks
(17:00) - Joining DeepMind
(20:24) - What Makes DeepMind People Special?
(22:28) - Getting into DeepMind
(24:36) - Research Scientist vs Research Engineer
(30:40) - Petar's Career Evolution at DeepMind
(35:20) - Importance of Side Projects
(38:30) - Building Google Maps ETA Algorithm
(47:30) - TacticAI: Collaborating with Liverpool FC
(01:03:00) - Career advice
Our guest today is Lewis Tunstall, LLM Engineer and Researcher at Hugging Face and author of "Natural Language Processing with Transformers".
In our conversation, we dive into topological machine learning and talk about giotto-tda, a high-performance topological ML Python library that Lewis worked on. We then turn to LLMs and Transformers: we discuss the pros and cons of open-source vs closed-source LLMs and explain the differences between encoder and decoder Transformer architectures. Lewis finally describes his day-to-day at Hugging Face and his current work on fine-tuning LLMs.
If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.
Link to Train in Data courses (use the code AISTORIES to get a 10% discount): https://www.trainindata.com/courses?affcode=1218302_5n7kraba
Natural Language Processing with Transformers book: https://www.oreilly.com/library/view/natural-language-processing/9781098136789/
Giotto-tda library: https://github.com/giotto-ai/giotto-tda
KTO alignment paper: https://arxiv.org/abs/2402.01306
Follow Lewis on LinkedIn: https://www.linkedin.com/in/lewis-tunstall/
Follow Neil on LinkedIn: https://www.linkedin.com/in/leiserneil/
---
(00:00) - Intro
(03:00) - How Lewis Got into AI
(05:33) - From Kaggle Competitions to Data Science Job
(11:09) - Get an actual Data Science Job!
(15:18) - Deep Learning or Excel?
(19:14) - Topological Machine Learning
(28:44) - Open Source vs Closed Source LLMs
(41:44) - Writing a Book on Transformers
(52:33) - Comparing BERT, Early Transformers, and GPT-4
(54:48) - Encoder and Decoder Architectures
(59:48) - Day-To-Day Work at Hugging Face
(01:09:06) - DPO and KTO
(01:12:58) - Stories and Career Advice
Our guest today is Maria Vechtomova, ML Engineering Manager at Ahold Delhaize and Co-Founder of Marvelous MLOps.
In our conversation, we first talk about code best practices for Data Scientists. We then dive into MLOps, discuss the main components required to deploy a model in production, and get an overview of one of Maria's projects, where she built and deployed a fraud detection algorithm. We finally talk about content creation, career advice, and the differences between an ML engineer and an MLOps engineer.
If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.
Link to Train in Data courses (use the code AISTORIES to get a 10% discount): https://www.trainindata.com/courses?affcode=1218302_5n7kraba
Check out Marvelous MLOps: https://marvelousmlops.substack.com/
Follow Maria on LinkedIn: https://www.linkedin.com/in/maria-vechtomova/
Follow Neil on LinkedIn: https://www.linkedin.com/in/leiserneil/
---
(00:00) - Intro
(02:59) - Maria’s Journey to MLOps
(08:50) - Code Best Practices
(18:39) - MLOps Infrastructure
(29:10) - ML Engineering for Fraud Detection
(40:42) - Content Creation & Marvelous MLOps
(49:01) - ML Engineer vs MLOps Engineer
(56:00) - Stories & Career Advice
Our guest today is Reah Miyara. Reah is currently working on LLM evaluation at OpenAI and previously worked at Google and IBM.
In our conversation, Reah shares his experience as product lead for Google's graph-based machine learning portfolio. He then explains how he joined OpenAI and what his role there involves. We finally talk about LLM evaluation, AGI, LLM safety, and the future of the field.
If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.
Link to Train in Data courses (use the code AISTORIES to get a 10% discount): https://www.trainindata.com/courses?affcode=1218302_5n7kraba
Follow Reah on LinkedIn: https://www.linkedin.com/in/reah/
Follow Neil on LinkedIn: https://www.linkedin.com/in/leiserneil/
---
(00:00) - Intro
(03:09) - Getting into AI and Machine Learning
(08:33) - Why Stay in AI?
(11:39) - From Software Engineer to Product Manager
(18:27) - Experience at Google
(25:28) - Applications of Graph ML
(31:10) - Joining OpenAI
(35:15) - LLM Evaluation
(44:30) - The Future of GenAI and LLMs
(55:48) - Safety Metrics for LLMs
(01:00:30) - Career Advice
Our guest today is Erwin Huizenga, Machine Learning Lead at Google and expert in Applied AI and LLMOps.
In our conversation, Erwin first discusses how he got into the field and his earlier experiences at SAS and IBM. We then talk about his work at Google: from the early days of cloud computing, when he joined the company, to his current work on Gemini. We finally dive into the world of LLMOps and share insights on how to evaluate LLMs, monitor their performance, and deploy them.
If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.
Link to Train in Data courses (use the code AISTORIES to get a 10% discount): https://www.trainindata.com/courses?affcode=1218302_5n7kraba
Erwin's LLMOps short course on DeepLearning.AI: https://www.deeplearning.ai/short-courses/llmops/
Follow Erwin on LinkedIn: https://www.linkedin.com/in/erwinhuizenga/
Follow Neil on LinkedIn: https://www.linkedin.com/in/leiserneil/
---
(00:00) - Intro
(05:04) - Early Experiences
(15:51) - Joining Google
(20:20) - Early Days of Cloud Computing
(26:18) - Advantages of Cloud Infrastructure
(30:09) - Gemini and its Launch
(37:32) - Gemini vs Other LLMs
(46:15) - LLMOps
(50:50) - Evaluating and Monitoring LLMs
(57:34) - Deploying LLMs vs Traditional ML Models
(01:01:07) - Personal Stories and Career Insights
Our guest today is Andras Palffy, Co-Founder of Perciv AI: a startup offering AI-based software solutions to build robust and affordable autonomous systems.
In our conversation, we first talk about Andras' PhD, which focused on road-user detection. We dive into AI applied to autonomous driving and discuss the pros and cons of the most common pieces of hardware: cameras, lidars, and radars. We then focus on Perciv AI. Andras explains why he decided to focus on radars and how he uses deep learning algorithms to enable autonomous systems. He finally gives his take on the future of autonomous vehicles and shares learnings from his experience in the field.
If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.
Link to Train in Data courses (use the code AISTORIES to get a 10% discount): https://www.trainindata.com/courses?affcode=1218302_5n7kraba
To learn more about Perciv AI: https://www.perciv.ai/
Follow Andras on LinkedIn: https://www.linkedin.com/in/andraspalffy/
Follow Neil on LinkedIn: https://www.linkedin.com/in/leiserneil/
---
(00:00) - Intro
(02:57) - Andras' Journey into AI
(06:11) - Getting into Robotics
(10:15) - Evolution of Computer Vision Algorithms
(13:38) - PhD on Autonomous Driving & Road-User Detection
(28:01) - Launching Perciv AI
(35:19) - Augmenting Radar Performance with AI
(44:45) - Inside Perciv AI: Roles, Challenges, and Stories
(48:43) - Future of Autonomous Vehicles and Road Safety
(51:46) - Solving a Technical Challenge with Camera Calibration
(54:12) - Andras' First Self-Driving Car Experience
(56:09) - Career Advice
Our guest today is Franziska Kirschner, Co-Founder of Intropy AI and former AI & Product Lead at Tractable: the world’s first computer vision unicorn.
In our conversation, we dive into Franziska's PhD, her career at Tractable, and her experience building deep learning algorithms for computer vision products. She explains how she climbed the ladder from intern to AI Lead and shares how she launched new AI product lines generating millions of pounds in revenue.
If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.
Link to Train in Data courses (use the code AISTORIES to get a 10% discount): https://www.trainindata.com/courses?affcode=1218302_5n7kraba
Follow Franziska on LinkedIn: https://www.linkedin.com/in/frankirsch/
Follow Neil on LinkedIn: https://www.linkedin.com/in/leiserneil/
---
(00:00) - Introduction
(03:08) - Franziska's Journey into AI
(05:17) - Franziska's PhD in Condensed Matter Physics
(15:12) - Transition from Physics to AI
(19:20) - Deep Learning & Impact at Tractable
(33:21) - AI Researcher vs AI Product Manager
(37:52) - The Impact of AI on Scrapyards
(43:14) - Key Steps in Launching New AI Products
(53:31) - Founding Intropy AI
(01:00:37) - The Potato Travels
(01:04:10) - Advice for Career Progression
Our guest today is Maxime Labonne, GenAI expert, book author, and developer of NeuralBeagle14-7B, one of the best-performing 7B-parameter models on the Open LLM Leaderboard.
In our conversation, we dive deep into the world of GenAI. We start by explaining how to get into the field and the resources needed to get started. Maxime then goes through the four steps used to build LLMs: pre-training, supervised fine-tuning, human feedback, and model merging. Throughout our conversation, we also discuss RAG vs fine-tuning, QLoRA & LoRA, DPO vs RLHF, and how to deploy LLMs in production.
If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.
Link to Train in Data courses (use the code AISTORIES to get a 10% discount): https://www.trainindata.com/courses?affcode=1218302_5n7kraba
Check out Maxime's LLM course: https://github.com/mlabonne/llm-course
Follow Maxime on LinkedIn: https://www.linkedin.com/in/maxime-labonne/
Follow Neil on LinkedIn: https://www.linkedin.com/in/leiserneil/
---
(00:00) - Intro
(02:37) - From Cybersecurity to AI
(06:05) - GenAI at Airbus
(13:29) - What does Maxime use ChatGPT for?
(15:31) - Getting into GenAI and learning resources
(22:23) - Steps to build your own LLM
(26:44) - Pre-training
(29:16) - Supervised fine-tuning, QLoRA & LoRA
(34:45) - RAG vs fine-tuning
(37:53) - DPO vs RLHF
(41:01) - Merging Models
(45:05) - Deploying LLMs
(46:52) - Stories and career advice
Our guest today is Harpreet Sahota, Deep Learning Developer Relations Manager at Deci AI.
In our conversation, we first talk about Harpreet’s work as a biostatistician and dive into A/B testing. We then talk about Deci AI and Neural Architecture Search (NAS): the technique used to build powerful deep learning models like YOLO-NAS. We finally dive into GenAI, where Harpreet shares seven prompting tips and explains how Retrieval-Augmented Generation (RAG) works.
If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.
Link to Train in Data courses (use the code AISTORIES to get a 10% discount): https://www.trainindata.com/courses?affcode=1218302_5n7kraba
Follow Harpreet on LinkedIn: https://www.linkedin.com/in/harpreetsahota204/
Follow Neil on LinkedIn: https://www.linkedin.com/in/leiserneil/
---
(00:00) - Intro
(02:34) - Harpreet's Journey into Data Science
(07:00) - A/B Testing
(17:50) - DevRel at Deci AI
(26:25) - Deci AI: Products and Services
(32:22) - Neural Architecture Search (NAS)
(36:58) - GenAI
(39:53) - Tools for Playing with LLMs
(42:56) - Mastering Prompt Engineering
(46:35) - Retrieval Augmented Generation (RAG)
(54:12) - Career Advice