
Sign up to save your podcasts
Or
In this episode of Crazy Wisdom, I, Stewart Alsop, sit down with Naman Mishra, CTO of Repello AI, to unpack the real-world security risks behind deploying large language models. We talk about layered vulnerabilities—from the model, infrastructure, and application layers—to attack vectors like prompt injection, indirect prompt injection through agents, and even how a simple email summarizer could be exploited to trigger a reverse shell. Naman shares stories like the accidental leak of a Windows activation key via an LLM and explains why red teaming isn’t just a checkbox, but a continuous mindset. If you want to learn more about his work, check out Repello's website at repello.ai.
Check out this GPT we trained on the conversation!
Timestamps
00:00 - Stewart Alsop introduces Naman Mishra, CTO of Repel AI. They frame the episode around AI security, contrasting prompt injection risks with traditional cybersecurity in ML apps.
05:00 - Naman explains the layered security model: model, infrastructure, and application layers. He distinguishes safety (bias, hallucination) from security (unauthorized access, data leaks).
10:00 - Focus on the application layer, especially in finance, healthcare, and legal. Naman shares how ChatGPT leaked a Windows activation key and stresses data minimization and security-by-design.
15:00 - They discuss red teaming, how Repel AI simulates attacks, and Anthropic’s HackerOne challenge. Naman shares how adversarial testing strengthens LLM guardrails.
20:00 - Conversation shifts to AI agents and autonomy. Naman explains indirect prompt injection via email or calendar, leading to real exploits like reverse shells—all triggered by summarizing an email.
25:00 - Stewart compares the Internet to a castle without doors. Naman explains the cat-and-mouse game of security—attackers need one flaw; defenders must lock every door. LLM insecurity lowers the barrier for attackers.
30:00 - They explore input/output filtering, role-based access control, and clean fine-tuning. Naman admits most guardrails can be broken and only block low-hanging fruit.
35:00 - They cover denial-of-wallet attacks—LLMs exploited to run up massive token costs. Naman critiques DeepSeek’s weak alignment and state bias, noting training data risks.
40:00 - Naman breaks down India’s AI scene: Bangalore as a hub, US-India GTM, and the debate between sovereignty vs. pragmatism. He leans toward India building foundational models.
45:00 - Closing thoughts on India’s AI future. Naman mentions Sarvam AI, Krutrim, and Paris Chopra’s Loss Funk. He urges devs to red team before shipping—"close the doors before enemies walk in."
Key Insights
4.9
6969 ratings
In this episode of Crazy Wisdom, I, Stewart Alsop, sit down with Naman Mishra, CTO of Repello AI, to unpack the real-world security risks behind deploying large language models. We talk about layered vulnerabilities—from the model, infrastructure, and application layers—to attack vectors like prompt injection, indirect prompt injection through agents, and even how a simple email summarizer could be exploited to trigger a reverse shell. Naman shares stories like the accidental leak of a Windows activation key via an LLM and explains why red teaming isn’t just a checkbox, but a continuous mindset. If you want to learn more about his work, check out Repello's website at repello.ai.
Check out this GPT we trained on the conversation!
Timestamps
00:00 - Stewart Alsop introduces Naman Mishra, CTO of Repel AI. They frame the episode around AI security, contrasting prompt injection risks with traditional cybersecurity in ML apps.
05:00 - Naman explains the layered security model: model, infrastructure, and application layers. He distinguishes safety (bias, hallucination) from security (unauthorized access, data leaks).
10:00 - Focus on the application layer, especially in finance, healthcare, and legal. Naman shares how ChatGPT leaked a Windows activation key and stresses data minimization and security-by-design.
15:00 - They discuss red teaming, how Repel AI simulates attacks, and Anthropic’s HackerOne challenge. Naman shares how adversarial testing strengthens LLM guardrails.
20:00 - Conversation shifts to AI agents and autonomy. Naman explains indirect prompt injection via email or calendar, leading to real exploits like reverse shells—all triggered by summarizing an email.
25:00 - Stewart compares the Internet to a castle without doors. Naman explains the cat-and-mouse game of security—attackers need one flaw; defenders must lock every door. LLM insecurity lowers the barrier for attackers.
30:00 - They explore input/output filtering, role-based access control, and clean fine-tuning. Naman admits most guardrails can be broken and only block low-hanging fruit.
35:00 - They cover denial-of-wallet attacks—LLMs exploited to run up massive token costs. Naman critiques DeepSeek’s weak alignment and state bias, noting training data risks.
40:00 - Naman breaks down India’s AI scene: Bangalore as a hub, US-India GTM, and the debate between sovereignty vs. pragmatism. He leans toward India building foundational models.
45:00 - Closing thoughts on India’s AI future. Naman mentions Sarvam AI, Krutrim, and Paris Chopra’s Loss Funk. He urges devs to red team before shipping—"close the doors before enemies walk in."
Key Insights
224,206 Listeners
26,331 Listeners
5,266 Listeners
2,637 Listeners
1,878 Listeners
4,703 Listeners
3,684 Listeners
8,759 Listeners
5,410 Listeners
28,335 Listeners
15,313 Listeners
8 Listeners
985 Listeners
14,668 Listeners
0 Listeners
69 Listeners
0 Listeners