The paper introduces Llama 2, a family of pretrained and fine-tuned large language models (LLMs) developed by Meta, ranging in scale from 7 billion to 70 billion parameters.
Here are the key highlights from the paper:
- Pretraining Improvements: Llama 2 was pretrained on 2 trillion tokens from a new mix of publicly available data. Compared to its predecessor (Llama 1), Llama 2 features a 40% larger pretraining corpus, double the context length (4,096 tokens), and uses grouped-query attention (GQA) to improve inference scalability for its larger models.
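The GQA mechanism mentioned above lets several query heads share a single key/value head, shrinking the KV cache at inference time. A minimal sketch in NumPy, assuming single-sequence inputs with no causal masking (the shapes and function name here are illustrative, not the paper's implementation):

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Minimal grouped-query attention sketch (one sequence, no masking).

    q:    (n_q_heads,  seq, d)  -- one projection per query head
    k, v: (n_kv_heads, seq, d)  -- fewer shared key/value heads

    Each group of n_q_heads // n_kv_heads query heads attends to one
    KV head, so the KV cache shrinks by that same factor.
    """
    group = q.shape[0] // k.shape[0]
    # Repeat each KV head so it is shared across its group of query heads.
    k = np.repeat(k, group, axis=0)              # (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)   # (n_q_heads, seq, seq)
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)               # softmax over keys
    return w @ v                                     # (n_q_heads, seq, d)
```

With 8 query heads and 2 KV heads, the KV cache is a quarter of the multi-head-attention size while the output shape matches standard attention.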
- Llama 2-Chat: The authors specifically developed and released Llama 2-Chat, a version fine-tuned and optimized for dialogue use cases. This was achieved through Supervised Fine-Tuning (SFT) and iterative Reinforcement Learning from Human Feedback (RLHF), which included both rejection sampling and Proximal Policy Optimization (PPO).
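The rejection-sampling step of that RLHF loop amounts to best-of-K selection: sample several responses from the current policy, score them with the learned reward model, and keep the highest-scoring one as a new fine-tuning target. A sketch, where `generate` and `reward` are hypothetical stand-ins for the policy and reward models rather than the paper's actual interfaces:

```python
def rejection_sample(prompt, generate, reward, k=4):
    """Best-of-k rejection sampling (sketch of the RLHF step described above).

    generate(prompt) -> str        : samples one response from the policy
    reward(prompt, response) -> float : learned reward model score
    Returns the highest-reward candidate, to be used as a training target.
    """
    candidates = [generate(prompt) for _ in range(k)]
    return max(candidates, key=lambda r: reward(prompt, r))
```

In the paper's pipeline this selection is applied iteratively, with PPO used on top of the rejection-sampled checkpoints in later rounds.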
- Novel Techniques: The researchers introduced Ghost Attention (GAtt), a method designed to help the model maintain system instructions and consistency across multiple turns of a conversation. They also observed emergent behaviors in the model, such as the ability to temporally organize knowledge and utilize external tools in a zero-shot context.
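Ghost Attention works at the data level: to synthesize training dialogues, the system instruction is concatenated to every user turn so the sampled assistant replies all respect it, and the final training sample then keeps the instruction only on the first turn, teaching the model to carry it across the conversation. A simplified sketch of that data construction (the function name and tuple format are illustrative assumptions, not the paper's code):

```python
def gatt_training_sample(instruction, turns):
    """Sketch of Ghost Attention (GAtt) data construction, simplified.

    instruction: system instruction to enforce across the dialogue
    turns: list of (user_msg, assistant_msg) pairs

    Returns (augmented, final):
      augmented -- instruction prepended to EVERY user turn; used only to
                   sample assistant replies that all follow the instruction
      final     -- instruction kept on the FIRST turn only; this is the
                   sample the model is fine-tuned on
    """
    augmented = [(f"{instruction} {u}", a) for u, a in turns]
    final = [(f"{instruction} {u}" if i == 0 else u, a)
             for i, (u, a) in enumerate(turns)]
    return augmented, final
```

The paper additionally zeroes the training loss on earlier turns so the model is only penalized on the latest reply; that detail is omitted here for brevity.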
- Safety and Alignment: A major focus of the paper is the responsible development of LLMs. The team conducted extensive safety tuning using safety-specific RLHF, context distillation, and rigorous red-teaming exercises with experts to identify and mitigate risks like toxic language, bias, and harmful activities.
- Performance: According to extensive human evaluations and automated benchmarks, Llama 2-Chat outperforms existing open-source chat models in helpfulness and safety. Furthermore, it performs on par with several prominent closed-source models, such as ChatGPT and PaLM.
- Open Availability: The Llama 2 models are released openly for both research and commercial use to encourage collaboration, democratize access, and promote responsible AI innovation within the community.