


Jamba is a new large language model developed by AI21 Labs that introduces a novel hybrid architecture. The model interleaves traditional Transformer layers with Mamba (a state-space model) layers, and integrates a Mixture-of-Experts (MoE) module to increase capacity without proportionally increasing compute requirements.
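The interleaving pattern described above can be sketched as a simple layout generator. This is a minimal illustration, not AI21's implementation: the block size of 8 layers with a 1:7 attention-to-Mamba ratio and MoE on every other layer follows the published Jamba design, but the total layer count here is illustrative.

```python
def jamba_layout(n_layers=32, attn_every=8, moe_every=2):
    """Sketch of Jamba's hybrid interleaving.

    Within each block of `attn_every` layers, one layer uses Transformer
    attention and the rest use Mamba; every `moe_every`-th layer replaces
    its dense MLP with a Mixture-of-Experts (MoE) module. Returns a list
    of (sequence-mixer, feed-forward) tags, one pair per layer.
    """
    layout = []
    for i in range(n_layers):
        mixer = "attention" if i % attn_every == 0 else "mamba"
        ffn = "moe" if i % moe_every == 1 else "mlp"
        layout.append((mixer, ffn))
    return layout


layers = jamba_layout()
# With 32 layers and these ratios: 4 attention layers, 28 Mamba layers,
# and MoE in half of the feed-forward positions.
```

Because only a small fraction of layers carry attention, the key-value cache (which only attention layers need) shrinks proportionally, and the MoE layers add parameters without adding per-token compute beyond the experts actually routed to.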
This hybrid approach addresses the fundamental limitations of pure Transformer models, which suffer from high memory and compute requirements for long contexts due to the growing key-value (KV) cache. It also improves upon pure Mamba models, which can sometimes struggle to match the in-context learning capabilities of Transformers.
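To make the KV-cache pressure concrete, here is a back-of-the-envelope memory estimate. The formula (keys plus values, per attention layer, per head, per token) is standard; the specific head counts and dimensions below are illustrative placeholders, not Jamba's exact configuration.

```python
def kv_cache_bytes(seq_len, n_attn_layers, n_kv_heads, head_dim,
                   dtype_bytes=2, batch=1):
    """Estimate key-value cache size in bytes.

    Each attention layer stores a key and a value vector per token,
    per KV head: 2 * batch * seq_len * heads * head_dim * bytes.
    Mamba layers keep only a fixed-size state, so they are excluded.
    """
    return 2 * batch * seq_len * n_attn_layers * n_kv_heads * head_dim * dtype_bytes


# Illustrative 256K-token context, fp16 (2 bytes per element):
hybrid = kv_cache_bytes(256_000, n_attn_layers=4, n_kv_heads=8, head_dim=128)
pure_transformer = kv_cache_bytes(256_000, n_attn_layers=32, n_kv_heads=8, head_dim=128)
# The hybrid cache is 8x smaller simply because only 4 of 32 layers
# use attention; the Mamba layers contribute no per-token cache.
```

This is the core of the long-context advantage: cache growth is linear in both sequence length and the number of attention layers, so cutting attention layers by 8x cuts the cache by 8x at any context length.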
Key highlights of the Jamba model include:
- A hybrid Transformer-Mamba architecture with MoE layers, giving 12B active parameters out of 52B total.
- Support for context lengths of up to 256K tokens.
- A memory footprint small enough to fit on a single 80GB GPU even at long contexts, thanks to its much smaller KV cache.
To encourage further community research and exploration into hybrid Attention-Mamba architectures, AI21 Labs has made the base model publicly available under a permissive Apache 2.0 license.
By Yun Wu