Thomas Hansen
We Need to Speak About Google Code Quality

For decades, landing a job at Google was most software developers' wet dream. Google's job interview was ridiculously difficult, and entire libraries have been written about how to land one.

I run an AI chatbot company, and because of that I have to somehow relate to Google code. Google code is everywhere, and creating a website without it is almost impossible: Google Analytics is one reason, Google reCAPTCHA another. If you've got a website, there's a 99% probability your site is running some Google code.

I always knew Google code was bad. Lots of people would tell me about its problems when the subject came up. However, I didn't realise just how bad it was until this week.

reCAPTCHA is Junkware

For a year I didn't understand why our website couldn't score above the high 60s on PageSpeed Insights. I did everything I could, but nothing would push the site above 70.

Yesterday I was finally able to get rid of the last piece of Google JavaScript on our site, and it immediately scored in the high 90s.

PageSpeed Insights score without Google reCAPTCHA

I had tried everything at that point, knowing that a high score on PageSpeed Insights is crucial for SEO and everything else required to create a highly converting website, but nothing seemed to work until I ripped all the Google code out of the site. In the following video I demonstrate how simply embedding Google's reCAPTCHA on your page consumes 1.7MB of bandwidth.
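You can verify numbers like this yourself. Below is a rough sketch using the Resource Timing API; run it in the browser console on a page embedding reCAPTCHA. The host-name filter is an assumption, and cross-origin entries report a size of 0 unless the server sends a Timing-Allow-Origin header.

```javascript
// Sum the transfer size of everything loaded from Google's domains.
const googleBytes = performance.getEntriesByType('resource')
  .filter(r => /google\.com|gstatic\.com/.test(r.name))
  .reduce((sum, r) => sum + (r.transferSize || 0), 0);
console.log(`Google resources: ${(googleBytes / 1024).toFixed(1)} KB`);
```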

I need to emphasise that 1.7MB for an edge library doing simple CAPTCHA logic is a disaster. There's no way to sugarcoat this fact, so I'll just say it out loud as it is ...

Google reCAPTCHA is a hot smoking pile of garbage! Junkware whose primary effect on your website is to literally destroy it!

For reference, the Magic CAPTCHA library I created is 20KB, which makes reCAPTCHA 85 times larger than my own stuff. I should probably clarify that I wrote Magic CAPTCHA in a couple of hours.

I've got 42 years of software development experience, but I'm not that good. However, according to this neutral metric, I am singlehandedly almost two orders of magnitude better a software developer than the entire team that created reCAPTCHA combined!

I could do the same exercise with Google Analytics and Google Tag Manager, but luckily I don't need to, since Plausible already did. A piece of advice: rip out Google Analytics and use Plausible instead. First of all, it doesn't destroy your website, and secondly, it doesn't violate the GDPR - So you can embed it on your site without having to warn your visitors that they're being spied on by Google.
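Swapping is trivial. The snippet below is roughly Plausible's documented embed at the time of writing; check their docs for the current version, and replace the data-domain value with your own domain.

```html
<!-- Plausible's standard embed; data-domain identifies your site. -->
<script defer data-domain="yourdomain.com"
        src="https://plausible.io/js/script.js"></script>
```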

How's their backend code?

David Sugar already pointed out how bad the Google Maps code is, but you don't have to take David's word for it. Just check your own site's traffic.

We've got an AI chatbot as one of our products, and it's embedded on our own site. This gives us incredibly high-quality feedback about what our users want and what they're looking for. The reason is fairly simple: some of our visitors use the AI chatbot, and the questions they ask give us insight into why they came to our site in the first place.

If we were to sum up the traffic Google is sending us, it would be as follows ...

Google is sending us exclusively junk traffic

Literally, they're only sending us users who have absolutely no interest whatsoever in our product. Instead, they're mostly sending us users who are looking for a free alternative to ChatGPT to do their homework. Poor students in third-world countries, trying to cheat at school. Literally 98% of all traffic originating from Google fits this profile.

In fact, I've played with the idea of blocking Google from indexing our site for these reasons. However, if their spider is as bad as their search engine, it probably wouldn't even work at this point ... 😳

How is it possible for "the smartest search engine on earth" to only send us irrelevant traffic?

I personally believe our site does a very good job of explaining what it's about. Literally, the H1 header says:

AI for Customer Service. Put ChatGPT on your Website

And the subtitle says "Create your own AI chatbot from your own data and embed on your website". How is it possible to send such a website users who are looking for an AI chatbot to do their homework?

Because of my work I happen to know a lot about embeddings and AI-related search. In fact, we've got our own AI-based search component, and ours is apparently several orders of magnitude better than Google's.

The only conclusion you can draw from the above is that Google's primary product, their search engine, is also a hot smoking pile of garbage. Even Google's bread-and-butter product is arguably broken to the point where it has no idea what it's doing.

At this point we'd be better off going back to Yahoo from 1995 ... 😕

YouTube Ads

Have you noticed how all your YouTube ads have gone completely bonkers over the last couple of years? Personally, about 80% of the ads YouTube serves me are Greek. I don't speak Greek. This implies that companies are paying Google to show ads to people who can't even understand what the ad is about.

For the record, the remaining 20% of ads I'm forced to watch on YouTube are also completely irrelevant to me, and I haven't clicked a single ad in five years because of it.

I have no problem clicking a relevant ad if it seems interesting - But Google is 100% incapable of showing me anything that's even remotely interesting to me. This is kind of weird, considering that companies are spending hundreds of billions of dollars on Google ads, so purely logically, somewhere out there, somebody has to be paying Google for an ad I actually want to see. Still, somehow, Google seems incapable of showing me these ads, for reasons I cannot explain using logic and reason ...

The only remaining conclusion is that Google's backend code is also, you guessed it: "A hot smoking pile of garbage"

Wrapping up

Google used to have some of the best products on the planet 20 years ago. They were the inspiration of our industry, and they seemed to deliver high-quality products at a pace nobody could follow. Yet for some weird reason, at some point, everything simply turned into junkware.

I have no idea how this happened, but I know one thing - Which is that if you own GOOG shares today, my advice to you is to sell them like crazy - Because in the end ...

Quality always wins! And Google has none of it!

Top comments (2)

JoelBonetR 🥇

Yes, but does yours collect telemetry and data to send back home?
Info like IP address, resources loaded on the site (styles, images, ...), Google user account information (if any in context), behaviour (like scrolling on a page, moving the mouse, clicking on links, time spent completing forms, typing patterns...) 😂

That might sound like a lot, but just to avoid being that kind of guy who drops the bomb halfway through and leaves... this kind of data aggregation, with a threshold to "determine" somewhat "accurately" whether the client is a human or a bot, is applied in tons of places. In some it is way more aggressive and even implies different levels of security applied to the user in context at any given point in time (banking software is an example of both), and I'm pretty sure Apple also uses that for their ID services and so on.

Whether it might be garbage code-wise, I don't really know... but I mean, probably. The odds tell me it's much more likely that a piece of code is garbage than otherwise 😂😂

Cheers

Thomas Hansen • Edited
  • IP address is passed on by the request to the JS file
  • Google user account information too, assuming they're using cookies with wildcard domains
  • Behaviour is literally a ['scroll','mousemove',...].forEach(e => body.addEventListener(e, handler)) invocation, probably ~150 bytes in total (see the sketch below this list)
  • Resource loading, and having JS turned on, is arguably also a requirement with my PoW CAPTCHA, since the JS file contains the "public key".
  • Etc, etc, etc ...
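For illustration, here's a minimal sketch of the kind of behaviour wiring I mean; the event names, the `signals` buffer, and the handler are illustrative assumptions, not reCAPTCHA's actual code.

```javascript
// Record coarse interaction signals that a human-vs-bot heuristic could score.
const signals = [];
['scroll', 'mousemove', 'click', 'keydown'].forEach(type =>
  document.body.addEventListener(type, () =>
    signals.push({ type, t: performance.now() })));
```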

And in the end, all reCAPTCHA gives you is "a probability" of whether it's a bot or not ...

My CAPTCHA library is arguably not as strong as reCAPTCHA, and there are sometimes legitimate reasons for using something stronger. After all, mine is a simple while loop creating SHA256 values from an incremented seed until n trailing zeros have been produced. However, my PoW CAPTCHA would probably scare away 99% of the most annoying bots, which at least for us is "good enough", and I suspect for most others too, since what 99% of all CAPTCHA usage is actually about is reducing spam through "contact us" forms on WP sites. In fact, simply requiring a JavaScript invocation would probably scare away 95% of bots ...
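Here is a minimal sketch of that loop, assuming the browser's Web Crypto API; the function names and the challenge format are my assumptions for illustration, not Magic CAPTCHA's actual source.

```javascript
const te = new TextEncoder();

// Hash an arbitrary string to a lowercase hex digest.
async function sha256Hex(text) {
  const digest = await crypto.subtle.digest('SHA-256', te.encode(text));
  return [...new Uint8Array(digest)]
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}

// Increment a seed until the hash ends in `workload` zero hex digits.
// Expected attempts grow as 16^workload: 65,536 on average at workload 4.
async function solve(challenge, workload) {
  const target = '0'.repeat(workload);
  for (let seed = 0; ; seed++) {
    const hash = await sha256Hex(challenge + seed);
    if (hash.endsWith(target)) {
      return { seed, hash };
    }
  }
}
```

The asymmetry is the whole point: the client burns 16^workload hashes on average, while the server verifies the submitted seed with a single hash.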

Can you bypass it? Yes! Would you want to, especially at workload 4? Probably not, because at workload 4 we're probably talking 10 cents per request in electricity ...

Besides, eliminating spam is impossible. In fact, there are hundreds of thousands of people in Bangladesh and India whose job it is to literally solve 500+ CAPTCHAs per hour for customers in the EU and the US, where behind the API there are literally humans sitting in front of their computer screens solving CAPTCHAs. So even a perfect CAPTCHA is still paradoxically broken by design, since the cost per CAPTCHA for such jobs is often around 1 to 5 cents ...

I suspect my PoW CAPTCHA is actually better for these particular problems, since the electricity cost would outweigh the salary costs if you put it at workload 4 ... (4 consecutive zero hex values, 65,536 SHA256 hashes on average). At workload 5, I'd be running my M3 CPU non-stop for 45 to 200 seconds to solve the CAPTCHA ...

In fact, if you measure total web traffic, ignoring images and videos, I'm willing to bet that 90% of all web traffic in the world is downloading garbage Google JS, implying that if we collectively removed Google's frontend garbage, we'd collectively, as a species, increase the speed of the internet as a whole by at least 300 to 400 percent ...

Believing you somehow need 1.7MB of JavaScript and 13 HTTP requests to create a simple CAPTCHA is not only madness and garbage code, it's also incredibly arrogant - Especially considering it's 100% impossible to be considered "a good website" according to Google's own metrics if you use their garbage code ...

Having a good score on "Core Web Vitals" is crucial for ranking high in the SERP. Ranking high in the SERP is literally impossible if you add Google JS. I assume the irony is not lost on you ...

Implying that by Google's own standards, their code is fundamentally broken. Notice: Google's words, not mine ...

On top of that, you're violating the GDPR with anything downloaded from Google, including their JS, to the point where you have to destroy the user experience by adding "disclaimer popups" that scare the living crap out of users, simply because you wanted to use their reCAPTCHA lib ...

I'm willing to bet $100 that both you and I could easily refactor reCAPTCHA and eliminate 90% of its problems and size with a couple of days of hammering our IDEs ...

In fact, this is such a big problem that if you check out the file comment in reCAPTCHA's JS file, you will see the following disclaimer ...

DO NOT EDIT THIS CODE

So obviously, a lot of their own users have downloaded their lib to try to improve upon it themselves, and broken it in the process ...

My library is not perfect; you can circumvent it. Google's code is not perfect; you can circumvent it too. However, my code is 20KB and Google's code is 1.7MB. This implies Google's code is 100% perfectly useless and should never be added to your website, regardless of what needs you have for eliminating bots ...

Google reCAPTCHA is fundamentally broken!

As for banking code? You cannot legally use reCAPTCHA, at least not in Europe, because of privacy issues ...