
Source: https://goombalab.github.io/blog/2025/tradeoffs/
This source explores the fundamental differences and trade-offs between State Space Models (SSMs) and Transformers, particularly in the context of sequence modeling and large language models (LLMs).
It defines SSMs by their three key ingredients: state size, state expressivity, and training efficiency, contrasting their compressed, constant-size hidden state with the Transformer's token cache, which grows linearly with sequence length (see the sketch below).
The author argues that Transformers are best suited for pre-compressed, semantically meaningful data, while SSMs excel in raw, high-resolution data due to their compressive inductive bias.
Ultimately, the piece proposes that hybrid models combining both architectures may offer superior performance by leveraging their complementary strengths, akin to how human intelligence utilizes both fluid memory and external references.
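To make the state-versus-cache contrast concrete, here is a minimal sketch, not taken from the blog post; the dimensions, matrices, and variable names are illustrative assumptions. It shows why an SSM's memory stays constant as tokens stream in, while a Transformer-style decoder's key/value cache grows with the number of tokens processed.

```python
# Toy comparison of inference-time memory: an SSM overwrites a fixed-size
# recurrent state, while a Transformer-style decoder appends every token's
# key/value to a cache that grows linearly with sequence length.
import numpy as np

d_model, d_state = 16, 32           # illustrative dimensions
T = 1000                            # number of tokens processed

# --- SSM-style recurrence: h_t = A h_{t-1} + B x_t, y_t = C h_t ---
A = np.eye(d_state) * 0.9
B = np.random.randn(d_state, d_model) * 0.1
C = np.random.randn(d_model, d_state) * 0.1

h = np.zeros(d_state)               # the entire "memory" of the past
for _ in range(T):
    x = np.random.randn(d_model)    # stand-in for an embedded input token
    h = A @ h + B @ x               # state is updated in place, size unchanged
    y = C @ h
print("SSM state floats:", h.size)  # constant: d_state, independent of T

# --- Transformer-style KV cache: every past token is kept verbatim ---
kv_cache = []
for _ in range(T):
    x = np.random.randn(d_model)
    k, v = x.copy(), x.copy()       # toy keys/values (real models project x)
    kv_cache.append((k, v))         # cache grows by one entry per token
print("KV cache floats:", 2 * d_model * len(kv_cache))  # linear in T
```

In this toy setting the SSM's memory footprint is fixed at d_state floats no matter how long the sequence gets, whereas the cache holds every past token, which is the trade-off the blog post builds its argument around.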
By Benjamin Alloul