Generative AI Group Podcast

Week of 2025-07-06

Alex: Hello and welcome to The Generative AI Group Digest for the week of 06 Jul 2025!
Maya: We're Alex and Maya.
Alex: First up, we’re talking about handling complex Excel sheets with multiple pivots and headers for AI agents. SaiVignan Malyala shared the challenge of feeding these dynamic Excel formats to large language models—LLMs don’t quite get the structure right.
Maya: That sounds tricky! Alex, have you tried parsing Excel files with multiple pivot tables before?
Alex: Not directly, but Krishna pointed out that we have libraries that can parse pivots and complex headers, like openpyxl or pandas with pivot_table extraction, which help convert sheets into more structured data!
Maya: So once the Excel is parsed properly, the LLM can work better?
Alex: Exactly! Krishna said, “If that is parsed, then it shouldn’t be an issue to send that to an LLM.” The key is preprocessing with a reliable Excel parser before passing data on.
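A minimal sketch of the preprocessing step Krishna describes: flattening a pivot-style sheet into self-describing records before it reaches the LLM. The data is inlined here for illustration; a real pipeline would load the rows with openpyxl or pandas first.

```python
# Flatten a pivoted sheet (years spread across columns) into tidy
# records an LLM can reason over row by row. In practice, `sheet`
# would come from openpyxl or pandas.read_excel, not a literal.

def unpivot(rows):
    """Turn a header row + data rows into a list of flat dicts."""
    header, *body = rows
    key = header[0]                      # e.g. "Region"
    records = []
    for row in body:
        for col_name, value in zip(header[1:], row[1:]):
            records.append({key: row[0], "column": col_name, "value": value})
    return records

sheet = [
    ["Region", "2023", "2024"],          # pivoted header: years as columns
    ["North",  120,    150],
    ["South",  90,     110],
]

tidy = unpivot(sheet)
# Each record is now self-describing, e.g.
# {"Region": "North", "column": "2023", "value": 120}
```

Once the sheet is in this shape, each record carries its own context, so the model no longer has to infer the pivot structure from cell positions.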
Maya: Great tip! Next, let’s move on to AI-powered virtual try-on tools.
Alex: Akshay Taneja highlighted Fashn AI, which stood out after testing 20+ virtual try-ons. Plus, startups like Alphabake and Weshop AI offer tools that let users swap specific accessories like hats or watches, though some focus on face swap rather than full try-on.
Maya: I love the idea of selective swapping! Does this mean we’re inching closer to truly interactive shopping experiences powered by AI?
Alex: Absolutely! This personalization gives users control over exactly what they want to try virtually, making online shopping more engaging and accurate.
Maya: Next, let's talk about function calls in Gemini compositional API. Bharat reported an issue where Gemini executes the first function but returns text instead of chaining the next function call.
Alex: Right, Sid suggested that traditional function calling might be less reliable here and recommended generating and running code directly using tools like a code interpreter to handle sequences more robustly.
Maya: That seems powerful but complex. Adit mentioned smolAgents tackling this issue by abstracting function calls, making agents more powerful but also potentially unpredictable.
Alex: So for deep workflows, using code-based orchestration or agent frameworks like smolAgents can improve reliability.
Maya: Next, nano-vLLM caught some attention. SaiVignan shared Deepseek’s lightweight vLLM implementation, which is fully offline and open source.
Alex: Though Adarsh clarified it’s actually a side project by one of Deepseek’s researchers, not an official product. But it’s exciting to see lightweight vLLM tools for edge deployment growing.
Maya: And to tie it in, Nikhil Bhaskaran launched a fully offline, open source full-stack coding agent on Product Hunt – perfect for local development without cloud dependency!
Alex: On to prompt engineering—Paras Chopra raised a great point: Claude’s system prompts run for many pages, unlike our shorter prompts. Tanisha and Varun elaborated on how longer prompts act like behavioral programming languages to set guardrails for broad generalization.
Maya: So in production use, prompts are often pages long for safety and boundary setting, while academic prompts are short for disambiguation. That’s fascinating!
Alex: Yes! Ajay shared prompts spanning about 3,000 lines with many guardrails. Others suggest layering prompts or guardrails over multiple stages to cut latency and reduce hallucinations.
Maya: So prompt engineering is evolving into software engineering? Sounds like a big shift!
Alex: It is! Paras even suggested dissecting major model providers’ prompts as a mini research project.
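The layered-guardrails idea Tanisha, Varun, and Ajay discuss can be sketched as staged prompt assembly: each stage receives only the rules it needs instead of one 3,000-line prompt. The stage names and rules below are illustrative assumptions, not anyone's production prompts.

```python
# Sketch: layer guardrails per stage instead of one giant prompt,
# trimming tokens (latency) and keeping each stage's rules focused.
# Stage names and rules here are made up for illustration.

GUARDRAILS = {
    "triage": ["Only classify the request; do not answer it."],
    "answer": ["Cite sources.", "Refuse medical or legal advice."],
}

def build_prompt(stage, task):
    """Assemble the system prompt for one stage from its rule layer."""
    rules = "\n".join(f"- {r}" for r in GUARDRAILS[stage])
    return f"Rules:\n{rules}\n\nTask: {task}"

prompt = build_prompt("answer", "Summarise the quarterly report.")
```

Treating the rules as data rather than prose is what makes this feel like software engineering: layers can be versioned, tested, and composed per stage.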
Maya: Speaking of deep research, AllenAI released SciArena, benchmarking foundation models across scientific literature. That’s a useful resource for anyone building research agents!
Alex: Now, on AI privacy and compliance—Somya asked about using AI in regulated industries like banking. Yash and Bharat noted most prefer privately hosted models for privacy, requiring certifications like SOC and HIPAA.
Maya: Compliance is critical! Rohit Patil emphasized avoiding AI products without enterprise offerings and ensuring data localization.
Alex: And Dilip highlighted that beyond certifications, enforcing policies and procedures is essential for production use.
Maya: On medical AI, Bharat Shetty shared Microsoft’s AI Diagnostic Orchestrator, which combines multiple models to outperform individual ones and human physicians on NEJM cases by a wide margin.
Alex: That’s huge! Though tp53(ashish) cautioned that NEJM cases are rare and not fully representative of real-world practice; more realistic evaluations with messy clinical data are still needed.
Maya: Next, thoughts on agent orchestration? Shresth Shukla asked how to manage many tools or agents practically without putting everything into a giant prompt.
Alex: The idea is to apply hierarchical searching—try one toolset first, fall back to another if no matches, reducing cost and complexity.
Maya: That’s a smart way to scale multi-agent systems.
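The hierarchical-search pattern Alex describes can be sketched as a two-tier router: try the cheap, specific toolset first and fall back to a broader one only when nothing matches. The toolsets and keyword matcher below are illustrative assumptions, not a real framework API.

```python
# Sketch of hierarchical tool routing: prefer the small, specific
# toolset; fall back to a broader one only on a miss. Toolsets and
# the keyword matcher are made up for illustration.

PRIMARY_TOOLS = {"get_invoice": ["invoice", "billing"]}
FALLBACK_TOOLS = {"web_search": ["search", "lookup", "weather"]}

def match(toolset, query):
    """Return the first tool whose keywords appear in the query."""
    q = query.lower()
    for name, keywords in toolset.items():
        if any(k in q for k in keywords):
            return name
    return None

def route(query):
    """Try the primary toolset first, then fall back."""
    return match(PRIMARY_TOOLS, query) or match(FALLBACK_TOOLS, query)

tool = route("look up the weather in Pune")
```

Because most queries resolve in the first tier, the fallback set (and its extra cost) is only consulted on a miss, which is what keeps this cheaper than one giant prompt listing every tool.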
Alex: Listener, here’s a quick tip for you: when building AI workflows with multiple function calls, try implementing a code-based orchestrator that manages dependencies directly, rather than chaining calls purely via the LLM’s outputs.
Maya: Great tip! Alex, how would you use that in your projects?
Alex: I’d create abstract functions that wrap sequences and test them separately, using tools like smolAgents. It keeps things modular and reduces unpredictable LLM errors.
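Alex's tip can be sketched as a deterministic orchestrator: rather than hoping the model emits the second function call, code chains step one's output into step two. The `get_weather` and `format_alert` tools are hypothetical stand-ins.

```python
# Sketch of code-based orchestration: the chain between tool calls
# lives in ordinary code, not in the LLM's output. Both tools are
# hypothetical stubs standing in for real APIs.

def get_weather(city):
    # Stand-in for a real weather API call.
    return {"city": city, "temp_c": 31}

def format_alert(report):
    # Stand-in for a second tool that formats the first tool's result.
    return f"Heat advisory for {report['city']}: {report['temp_c']}°C"

def run_workflow(city):
    """Deterministic chain: step 2 always receives step 1's output."""
    report = get_weather(city)       # first tool call
    return format_alert(report)      # chained in code, not by the LLM

message = run_workflow("Chennai")
```

Each wrapper can be unit-tested in isolation, which is the modularity Alex mentions: the LLM decides *when* to invoke `run_workflow`, but the sequencing inside it never depends on the model's output format.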
Maya: Nice approach! Wrapping up, Alex: remember, proper data preprocessing before feeding to LLMs is key—like parsing complex Excel sheets into structured formats.
Alex: And don’t forget, prompt engineering is evolving into an art of building layered, software-like guardrails to keep AI safe and effective.
Maya: That’s all for this week’s digest.
Alex: See you next time!