airhacks.fm podcast with adam bien

From SIMD to CUDA with TornadoVM


An airhacks.fm conversation with Michalis Papadimitriou (@mikepapadim) about:
GPU acceleration for LLMs in Java using TornadoVM,
evolution from CPU-bound SIMD optimizations to GPU memory management,
Alfonso's original Java port of llama.cpp using SIMD and Panama Vector API achieving 10 tokens per second,
TornadoVM's initial hybrid approach combining CPU vector operations with GPU matrix multiplications,
memory-bound nature of LLM inference versus compute-bound traditional workloads,
introduction of persist and consume API to keep data on GPU between operations,
reduction of host-GPU data transfers for improved performance,
comparison with native CUDA implementations and optimization strategies,
JIT compilation of kernels versus static optimization in frameworks like TensorRT,
using LLMs like Claude to optimize GPU kernels,
building MCP servers for automated kernel optimization,
European Space Agency using TornadoVM in production for simulations,
upcoming Metal backend support for Apple Silicon within 6-7 months,
planned support for additional models including Mistral and Gemma,
potential for distributed inference across multiple GPUs,
comparison with Python and C++ implementations achieving near-native performance,
modular architecture supporting OpenCL, PTX, and future hardware accelerators,
challenges of new GPU hardware vendors like Tenstorrent focusing on software ecosystem,
planned Quarkus and LangChain4j integration demonstrations

Michalis Papadimitriou on Twitter: @mikepapadim

By Adam Bien
