Louise Ai agent - David S. Nishimoto

Louise Ai agent: Ironwood Gemini vs GPT-4.5


Gemini Ironwood excels at complex reasoning and math, scoring 92 on the AIME math benchmark against GPT-4.5's 36.73. Ironwood's architecture is built for large-scale parallel processing, which suits demanding workloads such as LLM inference, and its 192 GB of high-bandwidth memory (HBM) per chip further strengthens memory-intensive workloads. GPT-4.5, which places less emphasis on structured reasoning, also trails on MMLU-Pro, scoring 71.4 to Gemini's 75.8.

Gemini likewise leads on long-context tasks, reaching 83.1% accuracy versus GPT-4.5's 48.8%. In basic multimodal understanding the order reverses: GPT-4.5 scores 74.4 on MMMU against Gemini's 65.9, although GPT-4.5 struggles once contexts grow long.

On factual accuracy, GPT-4.5 is ahead with 62.5 on SimpleQA, and its hallucination rate of 37.1% is an improvement over older models; Gemini follows at 52.9.
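To make the head-to-head concrete, here is a minimal sketch in Python that tabulates the scores quoted above and reports which model leads on each benchmark. It assumes higher is better for every metric listed; the SCORES layout and the leader helper are illustrative names, not any official API.

    # Benchmark figures as quoted in this episode: (Gemini Ironwood, GPT-4.5).
    SCORES = {
        "AIME (math)": (92.0, 36.73),
        "MMLU-Pro": (75.8, 71.4),
        "Long-context accuracy %": (83.1, 48.8),
        "MMMU (multimodal)": (65.9, 74.4),
        "SimpleQA (factuality)": (52.9, 62.5),
    }

    def leader(gemini: float, gpt45: float) -> str:
        # Assumes higher is better on all of the benchmarks above.
        return "Gemini Ironwood" if gemini > gpt45 else "GPT-4.5"

    for benchmark, (gemini, gpt45) in SCORES.items():
        print(f"{benchmark:26s} Gemini {gemini:6.2f} | GPT-4.5 {gpt45:6.2f} -> {leader(gemini, gpt45)}")

For scale on the memory figure: at bf16 precision (2 bytes per parameter), the 192 GB of HBM quoted above could hold the weights of a roughly 96-billion-parameter model on a single chip, before accounting for activations and KV cache.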

