Building a Search Engine
You Are Here
I want to know what good is a web search engine that returns 324,909,188 ‘matches’ to my keyword. That’s like saying, Good news, we’ve located the product you’re looking for. It’s on Earth.
W. Bruce Cameron
We all know the basic purpose and use of an internet search engine. But can we build our own, even if rudimentary?
We can. It won’t compete with Google, DuckDuckGo, or Brave. But with some Python, we can build an engine modeled on the first implementation of the Google search engine, circa 1998.
We’ll roughly follow the recipe laid out by Larry Page and Sergey Brin in their 1998 paper, The PageRank Citation Ranking: Bringing Order to the Web.
Using a simple form of PageRank, we’ll:
build a crawler, i.e., a program that starts from a seed page and goes from page to page, following anchor tags (a.k.a. links) to find related pages
build an index, i.e., a database of the documents or pages found that maps their content to their URLs
use that index to find pages, ranked by relevance, for any given search query
![../_images/searchengine.png](../_images/searchengine.png)
Fig. 1 Steps to building a search engine
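To make the three steps concrete before we build them properly, here is a minimal sketch, not the implementation we will end up with. It assumes a tiny, made-up "web" held in memory (`TOY_WEB`, with hypothetical `*.example` addresses) so that it runs without network access. The names `PageParser`, `crawl`, `build_index`, `pagerank`, and `search` are illustrative choices, and the damping factor of 0.85 and the fixed 50 iterations are assumptions for this toy version.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": address -> HTML. A real crawler would fetch over HTTP.
TOY_WEB = {
    "a.example": '<p>python search engine</p><a href="b.example">b</a><a href="c.example">c</a>',
    "b.example": '<p>python crawler notes</p><a href="c.example">c</a>',
    "c.example": '<p>search index notes</p><a href="a.example">a</a>',
}


class PageParser(HTMLParser):
    """Collect the href of every anchor tag and the lowercased words of the text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href" and value)

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def crawl(seed, fetch):
    """Step 1: breadth-first crawl from the seed. Returns {url: (words, links)}."""
    pages, seen, queue = {}, {seed}, deque([seed])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be retrieved; skip it
            continue
        parser = PageParser()
        parser.feed(html)
        pages[url] = (parser.words, parser.links)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Step 2: inverted index mapping each word to the set of URLs containing it."""
    index = {}
    for url, (words, _links) in pages.items():
        for word in words:
            index.setdefault(word, set()).add(url)
    return index


def pagerank(pages, damping=0.85, iterations=50):
    """A simple iterative PageRank over the crawled link graph."""
    n = len(pages)
    ranks = {url: 1 / n for url in pages}
    for _ in range(iterations):
        new_ranks = {url: (1 - damping) / n for url in pages}
        for url, (_words, links) in pages.items():
            targets = [t for t in links if t in pages]
            if not targets:  # a page with no usable links shares its rank with everyone
                targets = list(pages)
            share = damping * ranks[url] / len(targets)
            for target in targets:
                new_ranks[target] += share
        ranks = new_ranks
    return ranks


def search(query, index, ranks):
    """Step 3: pages containing every query word, highest rank first."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches, key=lambda url: ranks[url], reverse=True)


pages = crawl("a.example", TOY_WEB.get)
index = build_index(pages)
ranks = pagerank(pages)
print(search("python", index, ranks))
```

Passing the fetch function into `crawl` keeps the crawler independent of where pages come from: here it is a dictionary lookup, and swapping in a function that downloads real pages would leave the indexing and ranking steps unchanged.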