


This research paper introduces Prequal, a novel load balancer designed to minimise latency in large-scale distributed systems such as YouTube. Unlike traditional load balancers, which focus on balancing CPU usage, Prequal prioritises estimated latency and requests in flight, actively probing servers for real-time load information. Extensive testing on YouTube and on a controlled testbed demonstrated that Prequal significantly reduces tail latency, error rates, and resource consumption compared with weighted round-robin and other load-balancing strategies. The paper details Prequal's design, including its asynchronous probing mechanism and its hot-cold lexicographic rule for replica selection, and attributes its superior performance to its ability to adapt dynamically to heterogeneous server capacities and varying workloads.
By Sanket Makhija
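
The description names the hot-cold lexicographic rule without spelling it out. The sketch below shows one plausible reading, not the paper's implementation: a replica counts as "hot" when its requests in flight reach a threshold; if any probed replica is cold, the one with the lowest estimated latency is chosen, and if all are hot, the one with the fewest requests in flight is chosen. The `Probe` type, the `select_replica` function, and the fixed `rif_threshold` parameter are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Probe:
    """One probe response (hypothetical shape, for illustration only)."""
    replica: str
    rif: int            # requests in flight reported by the replica
    latency_ms: float   # the replica's estimated latency


def select_replica(probes: List[Probe], rif_threshold: int) -> str:
    """Pick a replica using an assumed hot-cold lexicographic rule."""
    if not probes:
        raise ValueError("no probe responses available")
    # Assumption: a replica is "cold" when its RIF is below the threshold.
    cold = [p for p in probes if p.rif < rif_threshold]
    if cold:
        # At least one cold replica: prefer the lowest estimated latency.
        return min(cold, key=lambda p: p.latency_ms).replica
    # Every probed replica is hot: fall back to the fewest requests in flight.
    return min(probes, key=lambda p: p.rif).replica


probes = [
    Probe("a", rif=2, latency_ms=30.0),
    Probe("b", rif=1, latency_ms=50.0),
    Probe("c", rif=9, latency_ms=5.0),
]
choice_mixed = select_replica(probes, rif_threshold=5)    # "a" and "b" are cold
choice_all_hot = select_replica(probes, rif_threshold=1)  # no replica is cold
```

In the first call, replica "c" has the lowest latency but is skipped because it is hot, which is the point of ordering the two signals lexicographically rather than blending them into a single score.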