FAQs about Preference Optimization: How many episodes does Preference Optimization have? The podcast currently has 1 episode available.
October 08, 2024 · 12 min
ASFT: Aligned Supervised Fine-Tuning through Absolute Likelihood
This paper proposes a new method for fine-tuning large language models (LLMs) called Aligned Supervised Fine-Tuning (ASFT). ASFT addresses limitations of existing Direct Preference Optimization (DPO) methods by optimizing the absolute likelihood of generating human-preferred responses rather than relying on relative likelihoods. Unlike DPO, ASFT does not require a reference model and is less sensitive to the initial state of the model, leading to more efficient and robust training. The authors demonstrate the effectiveness of ASFT through extensive experiments on various benchmark datasets, showing significant performance improvements compared to existing methods.
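The exact ASFT objective is not given in this summary, but the key contrast it describes can be sketched on scalar log-probabilities: DPO scores a preferred response *relative* to a rejected one and to a frozen reference model, whereas an absolute-likelihood objective pushes the preferred response's own likelihood up (and the rejected one's down) with no reference model. The loss below labeled `absolute_likelihood_loss` is a hypothetical illustration of that idea, not the paper's actual formula; `beta` is an assumed temperature hyperparameter.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    # DPO: relative likelihood of chosen vs. rejected, measured against
    # a frozen reference model (two extra log-probs must be computed).
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(sigmoid(margin))

def absolute_likelihood_loss(logp_chosen: float, logp_rejected: float,
                             beta: float = 0.1) -> float:
    # Hypothetical ASFT-style sketch: reward the absolute likelihood of the
    # preferred response and penalize that of the rejected one directly,
    # with no reference model in the objective.
    return (-math.log(sigmoid(beta * logp_chosen))
            - math.log(1.0 - sigmoid(beta * logp_rejected)))
```

Note how the reference-free loss depends only on the policy's own log-probabilities, so raising the chosen response's likelihood always lowers the loss; under DPO the same update can leave the loss unchanged if the rejected response's relative likelihood rises equally.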