Video: Basics of How Search Works Explained for Developers

Description


Mariya Moeva, a product manager at Google, explains to developers the basics of how Google Search works.

Overall, the 2 things to do for better website visibility in Google Search are:
  1. Help Google find the content.
  2. Help Google evaluate the content.

Transcription


John Mueller: Let's take a quick look at how search works.

Mariya Moeva: Alright, so in order to be successful as a developer in search, you need to know at least the basics of how it works. And I'm going to take you through the super high-level picture of how it works.

If you are interested in the details, google.com/jobs, welcome to apply. And then we can go into a lot more detail.

Let's get started with the super high-level picture. So, we generally talk about 3 things:
  1. First, Crawl & Discovery.
  2. Then, Indexing.
  3. And finally, Ranking and Serving.
So, I'm going to show you very briefly what each of these is about.

1. Crawl & Discovery


In order for us to be able to show anything in the search results, first, we need to be aware that it exists.

We have a series of systems that go around following links on the web and downloading web pages: HTML files, and all the different resources that go into making a website, like JavaScript files, CSS, and images. Those systems are collectively known as crawlers, and we call them Googlebot.

The goal for us is to find everything that is fresh, new, interesting, relevant and important and to do that in an efficient way.

In order to know which URLs to crawl and in which order, we have another set of systems which are known as schedulers. They queue the URLs for the crawlers to go and fetch, and all of this then gets stored.
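
As a rough illustration of the scheduler idea, here is a minimal Python sketch of a priority queue that hands URLs to a crawler, most important first. The class name, the priority numbers and the example.com URLs are all made up for this example; the talk does not describe how Google's schedulers actually pick an order.

```python
import heapq
import itertools


class CrawlScheduler:
    """Toy scheduler (hypothetical): hands out queued URLs, highest priority first."""

    def __init__(self):
        self._heap = []                    # entries: (negated priority, sequence, url)
        self._seen = set()                 # URLs that were already queued once
        self._counter = itertools.count()  # tie-breaker that keeps insertion order

    def add(self, url, priority):
        """Queue a URL unless it was queued before. Returns True if it was added."""
        if url in self._seen:
            return False
        self._seen.add(url)
        # heapq is a min-heap, so negate the priority to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), url))
        return True

    def next_url(self):
        """Return the next URL to fetch, or None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


scheduler = CrawlScheduler()
scheduler.add("https://example.com/old-page", priority=1)
scheduler.add("https://example.com/", priority=10)
scheduler.add("https://example.com/news", priority=5)
scheduler.add("https://example.com/", priority=3)  # already queued, ignored

crawl_order = []
while (url := scheduler.next_url()) is not None:
    crawl_order.append(url)
print(crawl_order)
```

The homepage comes out first, then the news page, then the old page. A real scheduler would also have to decide when to revisit pages it has already fetched, which this sketch leaves out.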

You may think that this is a pretty simple process. But if you start thinking that we have to do this 20 billion times per day, then you kind of get an idea [that] it's a little bit trickier than it seems at first sight.

In fact, in 2016, we've seen a hundred and thirty trillion pages. And for every new link that we see, there are usually 2 more links that we've never seen before. So, there is constantly new stuff and we have to decide what to crawl and what to update, and to do this in the most efficient manner.

Once we find the content, we have a series of other tasks. First, we have to make sure that we are allowed to access that content. For that, every time we access the site, we'll first go to your file called robots.txt, which is a pretty simple file containing instructions to search engines and other crawlers. It tells us what is OK to fetch and what is not, and we obey this very strictly. So that's the first thing we try to find on a website.
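
To see what a robots.txt check looks like from the crawler's side, here is a small sketch using Python's standard urllib.robotparser module. The robots.txt content, the "ExampleBot" user agent and the URLs are invented for the example, and this is not Googlebot's own parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler may fetch anything except /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())  # parse from text, no network request


def is_allowed(url, user_agent="ExampleBot"):
    """Return True if the parsed robots.txt lets this user agent fetch the URL."""
    return robots.can_fetch(user_agent, url)


print(is_allowed("https://example.com/index.html"))
print(is_allowed("https://example.com/private/report.html"))
```

The first URL is allowed and the second is blocked. If the JavaScript or CSS your pages need sits under a disallowed path, a crawler that obeys the file cannot fetch those resources either.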

The other thing we'll try to do is to get as much content as possible without troubling the normal work of the server, so the website can function and serve its clients as usual.

Then finally, we'll try to handle errors gracefully.
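
A minimal sketch of those last two points, pausing between requests and retrying failures instead of giving up on the whole crawl, might look like this in Python. The function name, the delay values and the fake fetcher are assumptions made for the example; the talk does not say how Googlebot paces itself or which errors it retries.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0, max_retries=2, sleep=time.sleep):
    """Fetch URLs one at a time, pausing between requests and retrying failures.

    `fetch` is any callable that returns the page body or raises OSError.
    Returns (results, failed): a dict of url -> body and a list of failed URLs.
    """
    results, failed = {}, []
    for url in urls:
        for attempt in range(max_retries + 1):
            try:
                results[url] = fetch(url)
                break
            except OSError:
                if attempt == max_retries:
                    failed.append(url)  # give up on this URL, keep crawling
                else:
                    # Back off before retrying: 1x, 2x, 4x, ... the base delay.
                    sleep(delay_seconds * (2 ** attempt))
        sleep(delay_seconds)  # pause so the server is not flooded
    return results, failed


# Stand-in for a real HTTP client (hypothetical): "/flaky" fails once,
# "/broken" fails every time.
_calls = {}


def fake_fetch(url):
    _calls[url] = _calls.get(url, 0) + 1
    if url.endswith("/flaky") and _calls[url] == 1:
        raise OSError("temporary server error")
    if url.endswith("/broken"):
        raise OSError("server keeps failing")
    return f"<html>content of {url}</html>"


pauses = []
results, failed = fetch_politely(
    ["https://example.com/", "https://example.com/flaky", "https://example.com/broken"],
    fetch=fake_fetch,
    delay_seconds=0.5,
    sleep=pauses.append,  # record the pauses instead of actually sleeping
)
print(sorted(results), failed)
```

The flaky page is fetched on the second attempt, and the broken page ends up in the failed list without stopping the rest of the crawl.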

So, as a developer, you have 2 tasks here. First, if you remember again that we do fetches 20 billion times a day and we see trillions of pages every year, your content should be really easy to discover.

One way to do that is to submit to us a list of the URLs you have, like a sitemap. Also, check that all the resources that are necessary for your site to be rendered are accessible to our crawlers.
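
For reference, a sitemap in its simplest form is an XML file listing your URLs. Here is a short Python sketch that builds one with the standard library. The function name and the example.com URLs are placeholders; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string that lists the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_element = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when it serializes the text.
        ET.SubElement(url_element, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/products?id=1&lang=en",
])
print(sitemap_xml)
```

The result can be saved as sitemap.xml on your site. The protocol also defines optional per-URL fields such as lastmod, which this sketch leaves out.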

2. Indexing


So when we fetch everything that we were able to fetch, we go to the next stage, and that is indexing.

Here we are going to parse the content, and into this come things like: what language is this page in, are there images, is there a title, is there a description, and other different elements on the page.
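
To make the parsing step concrete, here is a small Python sketch that pulls the language, title, description and image count out of an HTML page with the standard html.parser module. The class name and the sample page are invented for the example, and a real indexing pipeline looks at far more than these four things.

```python
from html.parser import HTMLParser


class PageSummaryParser(HTMLParser):
    """Collects a few basic elements from an HTML page (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.language = None
        self.title = None
        self.description = None
        self.image_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html":
            self.language = attributes.get("lang")
        elif tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content")
        elif tag == "img":
            self.image_count += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text between <title> and </title> may arrive in several chunks.
        if self._in_title:
            self.title = (self.title or "") + data


PAGE = """<!DOCTYPE html>
<html lang="en">
<head>
  <title>Example Store</title>
  <meta name="description" content="Hand-made widgets, shipped worldwide.">
</head>
<body><img src="widget.png" alt="A widget"><p>Hello</p></body>
</html>"""

summary = PageSummaryParser()
summary.feed(PAGE)
print(summary.language, summary.title, summary.description, summary.image_count)
```

Running the same kind of check on your own pages is a quick way to spot a missing title or description. Note that this only reads the raw HTML, so anything added later by JavaScript would not show up here.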

To do that, we also try to render the page. As a developer, especially if you are building a lot of cutting-edge fancy things, you have to keep in mind that currently the search systems are using Chrome 41 to render pages. So, not all of the different functionalities that you might be thinking about are supported by the search rendering systems.

If you want to find out more, I would suggest that you have a look at the talk that John did earlier today in the morning in case you didn't wake up at 8:30 to see it. It will be available on YouTube and you'll be able to see a lot more about what we support in the search and how to render things properly.

Given the huge amount of pages on the web, we also don't want to index more than one of each unique thing. So, we have a lot of systems in place to eliminate duplicates and to keep only one copy of each thing.
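
One simple way to picture duplicate elimination is to fingerprint each page's text and keep only the first URL seen for each fingerprint. The Python sketch below does exactly that with a SHA-256 hash. It is a deliberately naive stand-in, with made-up function names and pages; the talk does not describe how Google's own duplicate detection works.

```python
import hashlib


def content_fingerprint(text):
    """Hash the text after lowercasing it and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def keep_one_copy(pages):
    """Given {url: text}, keep the first URL seen for each distinct text."""
    kept = {}
    for url, text in pages.items():
        # setdefault only stores the URL if this fingerprint is new.
        kept.setdefault(content_fingerprint(text), url)
    return list(kept.values())


pages = {
    "https://example.com/shoes": "Red running shoes, size 42.",
    "https://example.com/shoes?ref=home": "Red  running shoes,\nsize 42.",
    "https://example.com/hats": "Blue wool hat.",
}
unique_urls = keep_one_copy(pages)
print(unique_urls)
```

The two shoe URLs carry the same text, so only the first one is kept. An exact hash like this misses pages that differ by even one word, so it only catches the simplest kind of duplicate.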

And finally, we don't want error pages and we also don't want any spam. So, we kick all of that out and everything else that we want to keep will be put in the index and we process it so that it's ready to be served to users when they search.

So for you as a developer here, I guess it's important to check that key elements like titles and descriptions are available on each page that you are creating, and then also to check how it's rendered.

3. Ranking and Serving


And then finally, once we have everything in the index, when users start searching, we are going to pull a set of pages that we think are relevant results. We are going to add a bunch of information that we've already accumulated about them, about how important they are and how they relate to the user's query. And then we are going to show them in some specific order that we think is most relevant for this user.
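
The general shape of that step, look up candidate pages in an index, score them against the query, then sort, can be shown with a toy Python example. Everything here is an assumption made for illustration: the scoring simply counts matching query words, which has nothing to do with how Google actually ranks results.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return document IDs ordered by how many query words they contain."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for doc_id in index.get(word, ()):
            scores[doc_id] += 1
    # Sort by score (highest first), then by ID so ties come out in a fixed order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up mini "web" of three pages.
documents = {
    "page-a": "fresh bread recipes",
    "page-b": "bread baking tips and bread recipes",
    "page-c": "bicycle repair guide",
}
index = build_index(documents)
ranked = search(index, "bread baking tips")
print(ranked)
```

Here page-b matches all three query words and comes first, page-a matches one, and page-c does not appear at all.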

So this is mostly on our side; you don't need to worry about anything here if your content is already accessible and easy to render.

But if you are interested in ranking and search quality, again google.com/jobs. There's plenty of problems to solve.

Conclusion


So, now that you know how search works, let's have a summary of the 2 things to remember:
  1. You have to help us find the content.
  2. You have to help us evaluate the content.
If you are able to do these two things, you are pretty much set as a developer.


