How does Google know what you’re looking for on the web? How can anyone have that much information about everything? Are Google’s developers constantly researching every subject so they can give you relevant answers to whatever you type into that search box?
Well, the simple answer is no. Google doesn’t really “know” what you’re looking for, and its developers aren’t finding and sorting all that information by hand either. Something more interesting is happening under that search box. Since 1998, Google’s mission has been “to organize the world’s information and make it universally accessible and useful”, and I think we can all agree they’ve accomplished that.
Ever since its founding, Google has been constantly mapping the web, hundreds of billions of pages, to create something called an index (which is kinda like a massive library that you look through whenever you’re searching for something). The problem is that when you search for something, you usually wanna find it as quickly as you possibly can, so going through millions of indexed pages that may or may not contain what you’re looking for is pretty inefficient. That is where Google’s ranking algorithms come into play.
The algorithms first try to understand what you’re looking for as you type in the search box. This is why you see your question, sentence, or word being finished before you actually finish typing it: the algorithms are suggesting what you might be trying to type, based on what other users have typed into the search box. They’re also very helpful if your spelling is incorrect. Next, the algorithms sift through millions of possible matches in the index and try to return the page with the most relevant information at the top of the search results.
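To make the “index” and the search-box suggestions a bit more concrete, here is a toy sketch in Python. It is not how Google actually does it; `build_index`, `lookup`, and `suggest` are names I made up for illustration, and the pages and past queries are invented sample data.

```python
from collections import Counter, defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def lookup(index, query):
    """Return the ids of pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def suggest(past_queries, prefix, limit=3):
    """Suggest the most frequent past queries that start with the prefix."""
    counts = Counter(q for q in past_queries if q.startswith(prefix))
    return [query for query, _ in counts.most_common(limit)]


# Invented sample data, just to exercise the functions.
pages = {
    "a": "coffee shop near me",
    "b": "best coffee beans",
    "c": "tea shop",
}
index = build_index(pages)
matches = lookup(index, "coffee shop")
suggestions = suggest(
    ["coffee shop", "coffee shop", "coffee beans", "tea"], "cof"
)
```

The point is that the search never reads every page at query time: it looks up each word in a prebuilt table and intersects the results.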
Now that you have your results, you can go through each page as needed, depending on whether or not you find what you’re looking for on the first page returned (but you most likely won’t have to go through more than 3 pages). As you can tell by now, ranking algorithms are very handy here, but how exactly do they decide what should be at the top of your search results? There are a lot of factors that go into ranking. Here are a few of them:
- Word location and the number of times it appears on the page
- User location
- How pages link to each other
- How important and trustworthy the page is
Google indexes billions of pages on the web, and a huge number of them might contain at least one word you included in your search. One of the many factors that go into ranking is _where in the page that word is located_. Pages that contain the word you searched for in their title, for instance, will probably be near the top of your results. A word can also appear 2 times on one page and maybe 16 times on another, and the algorithms take that into consideration as well.
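Here is a minimal sketch of that idea: count how often the word appears, and give extra weight to appearances in the title. The function name `score_page` and the `title_bonus` weight of 5 are assumptions I picked for the example, not values Google publishes.

```python
def score_page(query_word, title, body, title_bonus=5.0):
    """Score a page by word frequency, weighting title hits more heavily.

    title_bonus is an arbitrary illustrative weight, not a real value.
    """
    word = query_word.lower()
    title_hits = title.lower().split().count(word)
    body_hits = body.lower().split().count(word)
    return title_bonus * title_hits + body_hits


# A page with the word in its title and 2 times in the body...
in_title = score_page("coffee", "Coffee guide", "coffee is great coffee")
# ...versus a page that only repeats the word 16 times in the body.
body_only = score_page("coffee", "Tea", "coffee " * 16)
```

With these made-up weights, 16 mentions in the body still beat one title hit plus 2 mentions, which shows how the two signals trade off against each other.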
_User location_, meaning where a search happens, is another useful factor for ranking pages. Say, for example, you just search for the word “population”. After giving you the definition of the term, Google will probably return a page with the population of the country you’re currently in. User location is also useful for the obvious stuff: if you search “coffee shop”, Google should show you coffee shops near you, not a Starbucks located across the country.
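As a rough illustration of the “coffee shop near you” case, the sketch below sorts places by great-circle distance from the user using the haversine formula. The coordinates and place names are invented, and `nearest_first` is a hypothetical helper; a real search engine combines location with many other signals.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_first(user, places):
    """Sort (name, lat, lon) tuples by distance from the (lat, lon) user."""
    return sorted(
        places,
        key=lambda p: haversine_km(user[0], user[1], p[1], p[2]),
    )


# Invented coordinates: one shop across the country, one around the corner.
user = (40.0, -74.0)
places = [("far shop", 34.0, -118.0), ("near shop", 40.01, -74.0)]
ordered = [name for name, _, _ in nearest_first(user, places)]
```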
Another factor, which Google has actually been using since the beginning, is _how pages link to each other_. This helps the algorithms understand what the pages they sift through are actually about, and whether or not they contain similar information.
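Links can also be used to estimate how important a page is. The best-known example is PageRank, the link-analysis algorithm Google started out with. Below is a simplified sketch of that idea, assuming a tiny made-up web of three pages; the damping value of 0.85 is the commonly quoted default, and none of this reflects how Google’s current systems work.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict of page -> list of linked pages."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank to everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Made-up link graph: "a" and "b" both link to "c", and "c" links to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this toy graph, “c” ends up with the highest rank because two pages link to it, “a” comes second because the important page “c” links to it, and “b” comes last because nothing links to it.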
_How important and trustworthy the page is_ depends on user feedback, but not entirely. Because cyber-crime is a thing, there are a lot of web pages online that exist to scam people. Google’s algorithms take this into consideration as well, since the security of everyday Google users is important. Pages that may be harmful are scanned and filtered out of your search so you are not exposed to potential scams or worse crimes.
_When a web page is uploaded_ is another factor I should mention, but it can be outweighed by factors like the _relevance and trustworthiness_ of a page. Think of it this way: cyber-criminals are constantly finding ways to cause damage or scam users online, and millions of malicious websites are built and uploaded every day. But just because a malicious page was uploaded yesterday does not mean Google will show it at the top of your search results (even if it contains all the words you searched for). _Trustworthiness_ takes priority over _time of upload_. When a page was uploaded matters most when it covers current, emerging topics: if you search for covid-19, you’re more likely to get results with information uploaded yesterday or even a few hours ago.
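One way to picture how trust can outrank freshness is the sketch below: untrusted pages are dropped first, and recency only adds a boost when the query is about a current topic. The `Page` fields, the `min_trust` threshold of 0.5, and the scoring formula are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    relevance: float   # 0..1, how well the page matches the query
    trust: float       # 0..1, how trustworthy the page looks
    age_hours: float   # time since the page was uploaded


def rank_results(pages, time_sensitive=False, min_trust=0.5):
    """Drop untrusted pages, then sort; freshness only helps hot topics."""
    safe = [page for page in pages if page.trust >= min_trust]

    def score(page):
        base = page.relevance * page.trust
        if not time_sensitive:
            return base
        # Newer pages get a boost that fades as the page ages (in days).
        return base + 1.0 / (1.0 + page.age_hours / 24.0)

    return sorted(safe, key=score, reverse=True)


# Invented pages: a fresh scam, an old trusted page, a fresh trusted page.
pages = [
    Page("scam.example", relevance=1.0, trust=0.1, age_hours=1),
    Page("old.example", relevance=0.8, trust=0.9, age_hours=24 * 365),
    Page("new.example", relevance=0.7, trust=0.9, age_hours=2),
]
normal = [p.url for p in rank_results(pages)]
breaking = [p.url for p in rank_results(pages, time_sensitive=True)]
```

The scam page never shows up, no matter how new it is or how many matching words it has. For an ordinary query the older, more relevant page wins; for a time-sensitive one, the page uploaded two hours ago moves to the top.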
Google processes approximately 70,000 search queries every second, which works out to roughly 5.8 billion searches per day and approximately 2 trillion global searches per year. The average person conducts between three and four searches each day, looking for answers ASAP, and if you’ve ever wondered how all of this is possible, now you know.
Alright cool