airhacks.fm podcast with adam bien

From SIMD to CUDA with TornadoVM



An airhacks.fm conversation with Michalis Papadimitriou (@mikepapadim) about:
GPU acceleration for LLMs in Java using TornadoVM,
evolution from CPU-bound SIMD optimizations to GPU memory management,
Alfonso's original Java port of llama.cpp using SIMD and Panama Vector API achieving 10 tokens per second,
TornadoVM's initial hybrid approach combining CPU vector operations with GPU matrix multiplications,
memory-bound nature of LLM inference versus compute-bound traditional workloads,
introduction of persist and consume API to keep data on GPU between operations,
reduction of host-GPU data transfers for improved performance,
comparison with native CUDA implementations and optimization strategies,
JIT compilation of kernels versus static optimization in frameworks like TensorRT,
using LLMs like Claude to optimize GPU kernels,
building MCP servers for automated kernel optimization,
European Space Agency using TornadoVM in production for simulations,
upcoming Metal backend support for Apple Silicon within 6-7 months,
planned support for additional models including Mistral and Gemma,
potential for distributed inference across multiple GPUs,
comparison with Python and C++ implementations achieving near-native performance,
modular architecture supporting OpenCL, PTX, and future hardware accelerators,
challenges of new GPU hardware vendors like Tenstorrent focusing on software ecosystem,
planned Quarkus and LangChain4j integration demonstrations
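
The memory-bound point in the list above can be made concrete with a back-of-the-envelope calculation. The following is a minimal sketch and is not taken from the episode: the class and method names are hypothetical, and it assumes FP32 values and that every operand crosses the memory bus exactly once. It compares the arithmetic intensity (floating-point operations per byte moved) of a matrix-vector product, the dominant operation when generating one token at a time, with that of a square matrix-matrix product.

```java
public class ArithmeticIntensity {

    // y = W * x with an n x n weight matrix:
    // 2*n*n FLOPs (one multiply and one add per weight),
    // n*n weights plus the input and output vectors moved once.
    static double matVec(long n, int bytesPerElement) {
        double flops = 2.0 * n * n;
        double bytes = (double) bytesPerElement * (n * n + 2 * n);
        return flops / bytes;
    }

    // C = A * B with n x n matrices:
    // 2*n*n*n FLOPs, three n x n matrices moved once (idealized, perfect reuse).
    static double matMul(long n, int bytesPerElement) {
        double flops = 2.0 * n * n * n;
        double bytes = (double) bytesPerElement * 3 * n * n;
        return flops / bytes;
    }

    public static void main(String[] args) {
        long n = 4096;          // hypothetical layer width
        int fp32 = 4;           // bytes per FP32 element
        System.out.printf("matrix-vector: %.2f FLOP/byte%n", matVec(n, fp32));
        System.out.printf("matrix-matrix: %.2f FLOP/byte%n", matMul(n, fp32));
    }
}
```

Under these assumptions the matrix-vector case stays just below 0.5 FLOP per byte regardless of size, while the matrix-matrix case grows linearly with n. That gap is one way to see why keeping weights resident on the GPU and avoiding host-GPU transfers matters more for token generation than raw compute throughput.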

Michalis Papadimitriou on twitter: @mikepapadim


By Adam Bien

