April 08, 2026

Llama 2 & 3 Safety: Soumya Batra on Agentic AI Training

22 minutes

In this episode of The Data Engineering Show, host Benjamin Wagner sits down with Soumya Batra, founder and CEO of WisePort AI and former tech lead at Meta, where she led safety efforts for Llama 2 and Llama 3. They explore the evolution of NLP, the complete lifecycle of foundation model training, and why the next AI frontier lies in natively agentic systems rather than in simply scaling larger transformers.

What You'll Learn:

Why historical NLP work becomes obsolete with each paradigm shift: Understand how Bayesian networks, RNNs, and LSTMs each dominated until replaced, and why the current transformer-scaling dogma will likely face the same fate
How to structure the foundation model training lifecycle for safety: Learn the three critical phases, pretraining (data mix optimization), supervised fine-tuning (instruction alignment), and reinforcement learning (human preference integration), and where safety interventions deliver the most leverage
The counterintuitive data strategy for pretraining safety: Discover why removing all toxic content actually weakens model robustness, and how maintaining a precise balance preserves the model's ability to classify and refuse harmful requests
How dual reward models maximize both helpfulness and safety: See why combining helpfulness and safety objectives (as done in Llama 3) ensures every training sample reinforces both capabilities simultaneously rather than creating trade-offs
What "natively agentic" means and why it matters more than LLM-powered agents: Learn how foundational agentic models dynamically explore action spaces at inference time instead of relying on fixed developer-defined scaffolding, unlocking domain-agnostic workflows
How to build a foundational AI startup without massive training datasets: Understand why synthetic data generation, deterministic task validation, and deep domain expertise can substitute for Internet-scale language corpora in the agentic space

If you enjoyed this episode, make sure to subscribe, rate, and review it on Apple Podcasts, Spotify, and YouTube Podcasts. Instructions on how to do this are here.

About the Guest(s)
Soumya Batra is the Founder and CEO of WisePort AI, a foundational AI company specializing in agentic AI systems. With over twelve years of expertise in NLP and machine learning, she previously served as a Tech Lead and Applied Research Scientist at Meta, where she led safety and controllability efforts for both Llama 2 and Llama 3. Her career spans foundational work at Carnegie Mellon University, Microsoft, and Meta, establishing her as a pioneering voice in conversational AI and foundation model development. In this episode, Soumya demystifies the journey from traditional NLP to large language models, revealing how safety and controllability are embedded across the entire model lifecycle—from pretraining through reinforcement learning. Her insights on the future of agentic AI and the limitations of current scaling-only approaches provide essential perspective for data engineers and ML practitioners navigating the rapidly evolving AI landscape.

Quotes
"I did not know then that this would become my career for the next decade." - Soumya

"Whatever work that I've done in the past becomes irrelevant all of a sudden." - Soumya

"There is always a notion of, yes, this is the big thing, and then no, it's not anymore." - Soumya

"I really think that we are going to be proven wrong once again about scaling transformers being the only way to achieve general intelligence." - Soumya

"Safety was an issue even back then, even though we were training in such controlled settings." - Soumya

"If you don't put some toxic content there, then it will lose the ability to classify it and it'll be much easier to break the safety later on." - Soumya

"In the post training phase, we are giving it that ability to be able to answer users' questions." - Soumya

"The next unlock will now come from foundational agent models that are natively agentic, which will unlock use cases that look unimaginable to us right now." - Soumya

"Natively agentic means the foundational model itself needs to dynamically explore the action space, rather than scaffolding around existing LLMs." - Soumya

"The real unlock comes from creating your own use cases, creating your own synthetic data, and going deep into a few workflows." - Soumya

Resources
Connect on LinkedIn:

Soumya Batra - https://in.linkedin.com/in/soumyabatra
Benjamin Wagner - https://www.linkedin.com/in/wagjamin

Websites:

WisePort AI – https://www.wiseport.ai
Firebolt - https://www.firebolt.io

Articles & Research Papers:

LLaMA: Open and Efficient Foundation Language Models – Meta AI Research
LIMA: Less Is More for Alignment – Meta AI, Carnegie Mellon University, University of Southern California & Tel Aviv University

Educational Institutions:

Carnegie Mellon University - Language Technologies Institute (LTI)

The Data Engineering Show is brought to you by firebolt.io and handcrafted by our friends over at: fame.so

Previous guests include: Joseph Machado of LinkedIn, Matthew Weingarten of Disney, Joe Reis and Matt Housley, authors of The Fundamentals of Data Engineering, Zach Wilson of Eczachly Inc, Megan Lieu of Deepnote, Erik Heintare of Bolt, Lior Solomon of Vimeo, Krishna Naidu of Canva, Mike Cohen of Substack, Jens Larsson of Ark, Gunnar Tangring of Klarna, Yoav Shmaria of Similarweb and Xiaoxu Gao of Adyen.

Check out our three most downloaded episodes:

Zach Wilson on What Makes a Great Data Engineer
Joe Reis and Matt Housley on The Fundamentals of Data Engineering
Bill Inmon, The Godfather of Data Warehousing

By The Firebolt Data Bros
