Linear Digressions

Optimized Optimized Web Crawling

11.04.2018 - By Ben Jaffe and Katie Malone

Last week’s episode, about methods for optimized web crawling logic, left off on a bit of a cliffhanger: the data scientists had found a solution to the problem, but it wasn’t something that the engineers (who own the search codebase, remember) liked very much. It was black-boxy, hard to parallelize, and introduced a lot of complexity to their code. This episode takes a second crack at the problem: we formulate it a little differently and end up with a different, arguably more elegant solution.

Relevant links:

http://www.unofficialgoogledatascience.com/2018/07/by-bill-richoux-critical-decisions-are.html

http://www.csc.kth.se/utbildning/kth/kurser/DD3364/Lectures/KKT.pdf
