A search engine is a program that finds items in its database matching the keywords or search query a user enters, either to retrieve information or to locate a particular website on the internet.
How a Search Engine Creates Its Database
Search engines work in a three-stage model:
Crawling -> Indexing -> Serving
Crawling: This is the stage of finding content. Search engines fetch information using web crawlers, which go by many names, including spiders, robots, and bots.
This is the first step in finding out what pages exist on the WWW. Google (the search engine) doesn’t have a central registry of all web pages. Because information keeps changing, Google must constantly search for new pages and add them to its list of known pages. Crawling is this process of discovery.
How does a web crawler work?
Crawlers need a path or bridge to move from one page to another, and links provide it.
Search engines discover and visit websites by following the links on pages they already know. Wondering how a new website gets crawled in that case? If you have a new website that no other page links to yet, you can ask search engines to crawl it by submitting your URL in Google Search Console.
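The link-following idea above can be sketched in a few lines of Python. This is only a toy illustration, not how Google's crawler is actually built: the WEB dictionary, the example URLs, and the crawl function are all made up for this sketch, and no real pages are fetched.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["b.example/blog"],
    "b.example/blog": [],
    "new.example/": ["new.example/contact"],  # no other page links here
    "new.example/contact": [],
}


def crawl(seeds, web):
    """Discover pages breadth-first by following links from the seed pages."""
    discovered = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        discovered.append(url)
        for link in web.get(url, []):
            if link not in seen:  # skip pages that are already known
                seen.add(link)
                queue.append(link)
    return discovered


# Starting from a known page, the crawler never reaches new.example.
found = crawl(["a.example/"], WEB)

# Submitting the new URL works like adding it to the list of seeds.
found_after_submit = crawl(["a.example/", "new.example/"], WEB)
```

In this sketch, found contains only the pages reachable through links, while found_after_submit also includes the new site, which is roughly what submitting a URL achieves.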
Indexing: The process of indexing starts after a page is discovered. Here Google (the search engine) tries to understand what the page or its content is about.
This is one of the most important parts of the three-stage model, because this is where content is categorized. We will see how this affects the overall working of the search engine in the next stage, Serving.
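One common way to picture this categorization is an inverted index: a lookup table from each word to the pages that contain it. The sketch below is a simplified assumption for illustration, not a description of Google's real index; the PAGES data and the tokenize and build_index helpers are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages: URL -> page text.
PAGES = {
    "shop.example/repair": "Car repair and brake service",
    "blog.example/cars": "How to wash your car at home",
    "news.example/bikes": "Bike repair tips",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


index = build_index(PAGES)
```

With this table in place, finding every page that mentions "car" is a single lookup (index["car"]) rather than a re-read of every page.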
Serving (and ranking)
When a user enters a query, Google tries to find the best or most relevant answer in its index based on many factors (often cited as 200+). It tries to determine the highest quality answers and also weighs other considerations that give the best user experience and the most appropriate answer, such as the user’s location, the user’s language, and the device (desktop or phone). For example, a query for “car repair shops” would show different answers to a user in the US than to a user in the UK. Ranking is done programmatically by Google’s algorithms, and Google doesn’t accept payment to rank pages higher.
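To make the “car repair shops” example concrete, here is a deliberately tiny ranking sketch. It uses only two invented signals (word matches and a same-country boost); the Page class, the 0.5 boost, and the example URLs are assumptions for illustration and say nothing about Google's actual factors or weights.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str
    country: str


# Hypothetical indexed pages.
PAGES = [
    Page("us-garage.example", "car repair shops in Austin", "US"),
    Page("uk-garage.example", "car repair shops in Leeds", "UK"),
    Page("recipes.example", "easy pasta recipes", "US"),
]


def score(page, query, user_country):
    """Score = number of query words on the page, plus a small local boost."""
    page_words = set(page.text.lower().split())
    relevance = sum(1 for word in query.lower().split() if word in page_words)
    if relevance == 0:
        return 0.0  # irrelevant pages get no boost at all
    local_boost = 0.5 if page.country == user_country else 0.0
    return relevance + local_boost


def serve(query, user_country, pages):
    """Return URLs of matching pages, highest score first."""
    scored = [(score(p, query, user_country), p.url) for p in pages]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [url for _, url in ranked]


us_results = serve("car repair shops", "US", PAGES)
uk_results = serve("car repair shops", "UK", PAGES)
```

The same query returns the US garage first for a US user and the UK garage first for a UK user, while the unrelated recipe page is left out for both.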
So let’s go back to the indexing mechanism. Why are we stressing this process so much?
An enormous number of pages are added every minute. If indexed content were not categorized, the entire system would collapse.
Take the example of stepping into a medical store to buy a bunch of medicines. As usual the store is crowded, and each person hands their list of medicines to the salesperson. Within no time that person steps out of the store with all the medicines. Given how complex medicine names are, the speed at which they are picked from different shelves is amazing. Now imagine this system was not in place and the medicines were not organized: each person would take hours and hours to get what they want.
Now coming to the search engine: when a user enters a query, within a fraction of a second the SERP (search engine results page) is populated with the most relevant results. This is the power of data categorization. As noted, 200+ ranking factors are often cited, so when a user enters a query, the engine effectively asks that many questions of its data centers (or database, whatever we want to call it) and returns the relevant results. Hence we can say indexing is one of the most important parts of the whole process.
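The medical store analogy can be sketched in code by comparing a search that reads every document with one that goes straight to an organized index. The DOCS data and the function names below are made up for this illustration, and a real search engine works at a vastly larger scale.

```python
from collections import defaultdict

# Hypothetical shelf contents: document id -> text.
DOCS = {
    1: "paracetamol tablets for fever",
    2: "cough syrup for children",
    3: "vitamin c tablets",
    4: "bandages and antiseptic cream",
}


def scan_search(docs, word):
    """Unorganized shelves: read every document to find the word."""
    checked = 0
    hits = []
    for doc_id, text in docs.items():
        checked += 1
        if word in text.split():
            hits.append(doc_id)
    return hits, checked


def build_index(docs):
    """Organize once: map each word to the documents containing it."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for word in set(text.split()):
            index[word].append(doc_id)
    return index


INDEX = build_index(DOCS)


def index_search(index, word):
    """Organized shelves: go straight to the entry for the word."""
    return index.get(word, [])


scan_hits, docs_read = scan_search(DOCS, "tablets")
index_hits = index_search(INDEX, "tablets")
```

Both searches find the same documents, but the scan has to read all four documents for every query, while the index answers with one lookup after the shelves have been organized once.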
This wraps up the first part of learning Search Engine Optimization. We felt it’s important to understand the platform we will be optimizing for. In the next few articles we will learn a little more about search engines, their history, and how they have evolved over the years.