Optimized Web Crawling
Linear Digressions - A podcast by Ben Jaffe and Katie Malone
Categories:
Got a fun optimization problem for you this week! It’s a two-for-one: how do you optimize the web crawling logic of an operation like Google search so that the results are, on average, as up-to-date as possible, and how do you optimize your solution of choice so that it’s maintainable by software engineers in a huge distributed system? We’re following an excellent post from the Unofficial Google Data Science blog going through this problem. Relevant links: http://www.unofficialgoogledatascience.com/2018/07/by-bill-richoux-critical-decisions-are.html