Linear Digressions

Interleaving

07.22.2019 - By Ben Jaffe and Katie MalonePlay

Download our free app to listen on your phone

Download on the App StoreGet it on Google Play

If you’re Google or Netflix, and you have a recommendation or search system as part of your bread and butter, what’s the best way to test improvements to your algorithm? A/B testing is the canonical answer for testing how users respond to software changes, but it gets tricky really fast to think about what an A/B test means in the context of an algorithm that returns a ranked list. That’s why we’re talking about interleaving this week—it’s a simple modification to A/B testing that makes it much easier to race two algorithms against each other and find the winner, and it allows you to do it with much less data than a traditional A/B test.

Relevant links:

https://medium.com/netflix-techblog/interleaving-in-online-experiments-at-netflix-a04ee392ec55

https://www.microsoft.com/en-us/research/publication/predicting-search-satisfaction-metrics-with-interleaved-comparisons/

https://www.cs.cornell.edu/people/tj/publications/joachims_02b.pdf

More episodes from Linear Digressions