Generative AI Group Podcast

Week of 2025-02-02



Alex: Hello and welcome to The Generative AI Group Digest for the week of 02 Feb 2025!
Maya: We're Alex and Maya.
Alex: First up, we’re talking about DeepSeek and its impact on AI hardware and research.
Maya: DeepSeek has been a huge buzz lately! Alex, have you seen the market reaction?
Alex: Yeah! Nvidia’s stock took a 14% dip because of worries DeepSeek might reduce GPU demand.
Maya: Why would DeepSeek lower GPU demand?
Alex: Because they optimized training to use fewer GPUs. But as Paras Chopra pointed out, greater efficiency usually means people end up buying more, not less. That’s the Jevons paradox.
Maya: So, more efficiency actually drives demand up? That's counterintuitive!
Alex: Exactly! DeepSeek’s final training run reportedly cost just $6 million on around 2,000 H800 GPUs, which is low compared with the billions others have spent. But that figure covers only one run and excludes failed experiments.
Maya: And teams are racing to replicate and improve DeepSeek’s work, right?
Alex: Yes, the community is actively building on it. ASK Sathvik shared plans for a DeepSeek hackathon on Feb 8, focusing on GPUs, food, and innovation!
Maya: Sounds exciting! Next, let's move on to reinforcement learning (RL) on LLMs and what it means for model training.
Alex: Absolutely! There was a great discussion with ASK Sathvik and Paras Chopra on how RL differs from supervised fine-tuning (SFT).
Maya: How does RL improve models differently from SFT?
Alex: RL rewards the model when it arrives at the correct final answer after its reasoning steps, which can unlock better generalization and emergent reasoning behaviors. SFT, by contrast, trains the model to reproduce a reference answer token by token.
Maya: So RL acts more like learning from experience rather than just memorizing?
Alex: Exactly, but it requires a strong base model to work well. Sathvik is experimenting with this on smaller models and targeted datasets to save compute.
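The contrast Alex draws between SFT and outcome-based RL can be sketched in a few lines of Python. This is a toy illustration, not the training recipe of any model discussed here: the function names, the "Answer:" marker convention, and the group-mean baseline are all assumptions made for the example.

```python
import math

def sft_loss(reference_token_probs):
    """SFT-style objective: average negative log-likelihood the model
    assigns to each token of a reference answer (token-by-token imitation)."""
    return -sum(math.log(p) for p in reference_token_probs) / len(reference_token_probs)

def outcome_reward(completion, correct_answer, marker="Answer:"):
    """Outcome-style reward: 1.0 only if the text after the (hypothetical)
    'Answer:' marker matches; the reasoning text itself is not graded."""
    if marker not in completion:
        return 0.0
    final = completion.rsplit(marker, 1)[1].strip()
    return 1.0 if final == correct_answer else 0.0

def advantages(rewards):
    """Assumed baseline: score each sampled completion relative to the
    mean reward of its group, so better-than-average samples get reinforced."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]

# Three sampled completions for the same prompt.
samples = [
    "2 plus 2 is 4. Answer: 4",
    "Probably five. Answer: 5",
    "I am not sure.",
]
rewards = [outcome_reward(s, "4") for s in samples]
group_advantages = advantages(rewards)
```

Only the first sample earns a reward, so it is the only one with a positive advantage. The SFT loss, in contrast, needs a reference answer and scores every token against it.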
Maya: That suggests affordable, scalable RL could be game-changing.
Alex: Indeed! Paras Chopra emphasized creativity and ambition over GPU scarcity — breakthroughs can happen even on academic budgets.
Maya: Interesting! Now, let's move on to agentic AI frameworks and tooling.
Alex: Right! Folks here use low-code and no-code platforms like Langflow and CrewAI, along with frameworks and cloud services like Microsoft AutoGen or Google Gemini, to build agentic workflows.
Maya: What’s an example of a production use case?
Alex: Harinder Takhar described deploying an agentic workflow for anti-money-laundering (AML) and fraud checks: a sequence of 25 to 30 verification tasks built by non-engineers.
Maya: That’s a huge time saver! Any tips on managing complex agent retries?
Alex: Bharat Shetty suggested using a state machine library, like XState, to handle retry loops with flexible control instead of hardcoding every edge.
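The state-machine approach to retries can be sketched without any library. The Python below is a minimal illustration of the idea and does not use or mirror the XState API; the state names, event names, and `run_with_retries` helper are hypothetical.

```python
from enum import Enum

class State(Enum):
    RUNNING = "running"
    RETRYING = "retrying"
    DONE = "done"
    FAILED = "failed"

# Transition table: (current state, event) -> next state.
# Changing retry behavior means editing this table, not the control flow.
TRANSITIONS = {
    (State.RUNNING, "success"): State.DONE,
    (State.RUNNING, "error"): State.RETRYING,
    (State.RETRYING, "retry"): State.RUNNING,
    (State.RETRYING, "give_up"): State.FAILED,
}

def run_with_retries(task, max_retries=3):
    """Run task() until it succeeds or the retry budget is used up.
    Returns (final state, number of attempts, result or None)."""
    state, attempts, result = State.RUNNING, 0, None
    while state not in (State.DONE, State.FAILED):
        if state is State.RUNNING:
            attempts += 1
            try:
                result = task()
                event = "success"
            except Exception:
                event = "error"
        else:  # State.RETRYING: decide whether another attempt is allowed
            event = "retry" if attempts <= max_retries else "give_up"
        state = TRANSITIONS[(state, event)]
    return state, attempts, result
```

Because every allowed move lives in one table, adding a state such as a human-review step means adding rows rather than nesting more conditionals.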
Maya: Great engineering insight! Next, let’s discuss the Indian AI research ecosystem and the India AI mission.
Alex: Paras Chopra shared some eye-opening stats — India’s NeurIPS paper share is only 0.8%, a wake-up call and an opportunity.
Maya: So what’s the plan?
Alex: India’s AI mission, led by Aakrit Vaish, is encouraging teams to build state-of-the-art reasoning models with open proposals reviewed monthly.
Maya: Exciting! Many here want to collaborate to form super teams for foundational models.
Alex: And the government is launching a dataset platform to support model builders, though it’ll take time to scale.
Maya: That’s a big step for data infrastructure! Now, let's touch on open-source and hosted LLMs in production.
Alex: DeepSeek R1 models are being hosted on servers with RTX 4090 GPUs, but the full 671B-parameter model needs a multi-GPU setup; otherwise you fall back to distilled versions.
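A rough back-of-the-envelope calculation shows why the full model does not fit on one consumer card. The sketch below assumes the published 671B parameter count for the full R1 model and 24 GB of memory per RTX 4090, and it counts weights only; KV cache, activations, and framework overhead are ignored, so real requirements are higher. The helper names and the 7B distilled size are illustrative.

```python
import math

def weight_memory_gb(params_billions, bytes_per_param):
    """Memory for the weights alone: 1e9 params * bytes each / 1e9 bytes per GB."""
    return params_billions * bytes_per_param

def min_gpus_for_weights(params_billions, bytes_per_param, gpu_memory_gb=24):
    """Lower bound on GPU count needed just to hold the weights."""
    return math.ceil(weight_memory_gb(params_billions, bytes_per_param) / gpu_memory_gb)

full_fp8 = min_gpus_for_weights(671, 1.0)     # 8-bit weights
full_4bit = min_gpus_for_weights(671, 0.5)    # 4-bit quantized weights
distilled_fp16 = min_gpus_for_weights(7, 2.0) # a 7B distilled model in 16-bit
```

Under these assumptions the full model needs at least 28 cards at 8-bit and 14 at 4-bit just for weights, while a 7B distilled model fits on a single card.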
Maya: Hosting is tricky! Anyone shared benchmarks?
Alex: David from 1Legion shared tokens-per-second benchmarks for various DeepSeek R1 models and is offering access to testers.
Maya: That’s useful! What about inference latency and API reliability?
Alex: Azure's DeepSeek API is reportedly slow, sometimes below 1 token per second, possibly due to high scale-up costs.
Maya: So many providers exist, but performance and costs vary widely.
Alex: Exactly. Portkey, Fireworks, Groq, and others offer competing services with different trade-offs.
Maya: Alright, next — reasoning models and chain-of-thought (CoT) outputs.
Alex: DeepSeek R1 and OpenAI’s o1 models both show emergent personalities and "thinking" steps during inference.
Maya: Why is showing CoT valuable?
Alex: It helps humans understand model reasoning and can build trust; some say it makes for addictive interactions.
Maya: And some are working on visualizing CoT and knowledge graphs to harness these thought tokens better.
Alex: Paras Chopra even suggested startup ideas there!
Maya: And last topic — IT services companies and AI transformation.
Alex: There was an extensive discussion on whether Indian IT services firms will innovate or get disrupted.
Maya: What’s the consensus?
Alex: While some see a lack of R&D investment, others argue IT firms may pivot to AI-human hybrid services and "AI transformations" that leverage deep domain knowledge and client trust.
Maya: That makes sense; these firms have survived previous tech waves and might do so again.
Alex: Absolutely. But we all agree that startups and focused AI initiatives are crucial to pushing innovation further.
Maya: Here’s a pro tip you can try today: If you’re overwhelmed by data evaluation, try building a Google Sheets script that calls an LLM API to rank and filter responses based on natural-language criteria. It’s a productivity booster!
Maya: Alex, how would you use that in your workflow?
Alex: I’d use it to quickly screen research proposals or community submissions, saving hours of manual sorting. Plus, it’s a neat way to integrate AI into everyday tools without heavy engineering.
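The rank-and-filter logic behind that tip can be sketched as follows. This is Python rather than a Sheets script, and it shows only the scoring and sorting step. The `score_fn` parameter stands in for whatever LLM API call you would make, and `keyword_score` is a hypothetical placeholder scorer so the example runs without any external service.

```python
def rank_and_filter(responses, criterion, score_fn, min_score=0.5):
    """Score each response against a natural-language criterion,
    drop those below min_score, and return the rest best-first."""
    scored = [(score_fn(response, criterion), response) for response in responses]
    kept = [pair for pair in scored if pair[0] >= min_score]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)

def keyword_score(response, criterion):
    """Placeholder for an LLM call: fraction of criterion words that
    appear in the response. A real scorer would ask a model for a 0-1 rating."""
    words = criterion.lower().split()
    if not words:
        return 0.0
    text = response.lower()
    return sum(word in text for word in words) / len(words)

proposals = [
    "Uses GPUs efficiently",
    "Talks about food",
    "Efficient GPUs and training",
]
ranked = rank_and_filter(proposals, "efficient gpus training", keyword_score)
```

Passing the scorer in as a function keeps the sorting and threshold logic testable on its own; swapping in a real model call changes only `score_fn`.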
Alex: Remember, the most important thing is creativity and ambition — access to GPUs is just one part of the equation.
Maya: Don’t forget, agentic AI frameworks and reasoning models are making AI more interactive and trustworthy, changing how we build applications.
Maya: That’s all for this week’s digest.
Alex: See you next time!