It only takes half a second for Google to return a search based on keywords you type in, but there’s a whole lot more happening behind the scenes to give you the results you need.
Matt Cutts, a software engineer and head of Google’s webspam team, says “There are three things you need to do to be the best search engine in the world. First, you need to crawl the web comprehensively and deeply, then you want to rank or serve those pages and return the most relevant ones first.”
Google crawls the web every day. In 2003, it switched to crawling a significant portion of the web daily, so that new content could be picked up and the index updated incrementally.
To keep the index fresh, Google has to decide which pages to crawl first, and PageRank is the primary determinant. The more PageRank a page has (that is, the more people link to it and the more reputable those people are), the more likely it is that Google will discover the page relatively early in the crawl.
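To make the idea of "higher PageRank, crawled earlier" concrete, here is a minimal sketch of a crawl queue ordered by score. It is an illustration only, not Google's crawler: the function name `crawl_order`, the URLs and the scores are all hypothetical, and a real crawler would also discover new links as it goes.

```python
import heapq

def crawl_order(page_scores):
    """Return URLs in the order a score-prioritized crawler would visit them.

    page_scores maps URL -> reputation score (hypothetical stand-in for PageRank).
    """
    # heapq is a min-heap, so negate the score to pop the highest score first.
    frontier = [(-score, url) for url, score in page_scores.items()]
    heapq.heapify(frontier)

    order = []
    while frontier:
        _, url = heapq.heappop(frontier)
        order.append(url)
    return order

# Made-up scores for illustration; these are not real PageRank values.
scores = {
    "example.com/news": 0.9,
    "example.com/blog": 0.4,
    "example.com/old": 0.1,
}
print(crawl_order(scores))
```

The well-linked page comes out of the queue first and the obscure one last, which is the behavior the paragraph above describes.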
When ranking pages, Google also places a lot of emphasis on word order. The key is finding the right balance between word proximity, a page's reputation and the links pointing to it.
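One way to picture that balancing act is a score that blends how close the query words sit on a page with how reputable the page is. The sketch below is a toy under stated assumptions: the function names, the 0.5 weights and the `reputation` input are invented for illustration, and nothing here reflects Google's actual ranking formula.

```python
def min_distance(words, term_a, term_b):
    """Smallest gap in word positions between term_a and term_b, or None."""
    pos_a = [i for i, w in enumerate(words) if w == term_a]
    pos_b = [i for i, w in enumerate(words) if w == term_b]
    if not pos_a or not pos_b:
        return None
    return min(abs(a - b) for a in pos_a for b in pos_b)

def score(text, term_a, term_b, reputation, w_proximity=0.5, w_reputation=0.5):
    """Blend word proximity with a reputation value in [0, 1].

    The weights are hypothetical; tuning them is the 'balance' in question.
    """
    words = text.lower().split()
    gap = min_distance(words, term_a, term_b)
    # Adjacent words give proximity 1.0; wider gaps give smaller values.
    proximity = 0.0 if gap is None else 1.0 / max(gap, 1)
    return w_proximity * proximity + w_reputation * reputation

# Same reputation, but the first page has the query words side by side.
close = score("new york pizza guide", "new", "york", reputation=0.2)
far = score("new pizza places opening in york", "new", "york", reputation=0.2)
print(close > far)
```

Raising `w_reputation` lets a well-regarded page outrank one with a tighter phrase match, and raising `w_proximity` does the opposite.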
When you type in a query, Google sends it out to hundreds of different machines all at once. Each machine looks through its own fraction of the indexed web to find the best match.
“What’s the best page that matches this query across our entire index?” Cutts said. “We take that page and we try to show it with a useful snippet, so we show the keywords in the context of the document and get it all back in under half a second.”
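The fan-out described above, where one query goes to many machines and the best answer is collected, can be sketched in a few lines. This is a simplified model, not Google's serving system: each "machine" is just a dictionary of URL to page text, threads stand in for separate servers, and the names `search_shard` and `search` and the word-count scoring are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def search_shard(shard, query_terms):
    """Return (score, url) for the best match in one shard, or None."""
    best = None
    for url, text in shard.items():
        words = text.lower().split()
        # Toy relevance score: how often the query terms appear on the page.
        score = sum(words.count(term) for term in query_terms)
        if score > 0 and (best is None or score > best[0]):
            best = (score, url)
    return best

def search(shards, query):
    """Send the query to every shard in parallel and return the best URL."""
    terms = query.lower().split()
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        results = list(pool.map(lambda shard: search_shard(shard, terms), shards))
    hits = [r for r in results if r is not None]
    # Each shard reports its local best; the overall best is the max of those.
    return max(hits)[1] if hits else None

# Two hypothetical shards, each holding a slice of the index.
shards = [
    {"a.example": "cats and dogs living together"},
    {"b.example": "cats cats cats everywhere"},
]
print(search(shards, "cats"))
```

Because every shard searches only its own slice, the total time is roughly that of the slowest shard rather than the sum of all of them, which is what makes a sub-second response plausible at scale.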