How I removed Google Analytics and still have good data to analyze

Tobias Nickel ・ Originally published at tnickel.de ・ 6 min read

I only recently opened a Google Analytics account and added it to this website. I wanted some insight into my website's visitors. But compared to the Google Search Console, it offered little information that was interesting to me.

And actually I worried a little: is it legal to just add Analytics? It was easy to add, just a script tag in my page. In the EU, a website has to inform its users about non-essential cookies and ask for their consent before setting one. However, Analytics was added with a static HTML tag, so I have no way to control which cookies are set immediately.

I was not sure whether I should create that script tag dynamically with some client-side JavaScript after asking the user. And would Analytics still work then?
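A minimal sketch of what such a dynamic script tag could look like. The function name `loadAnalyticsAfterConsent`, the `consentGiven` flag and the id `G-XXXXXXX` are placeholders I made up for illustration, and the `document` is passed in as a parameter so the function can also be tried outside a browser.

```javascript
// Sketch: add the analytics script tag only after the visitor has consented.
// `doc` is the page's `document`; `consentGiven` would come from your own
// consent banner (an assumption, no such banner exists in this article).
function loadAnalyticsAfterConsent(doc, consentGiven, src) {
  if (!consentGiven) return null; // no consent: no script tag is added

  const script = doc.createElement('script');
  script.async = true;
  script.src = src;
  doc.head.appendChild(script);
  return script;
}

// In the browser (G-XXXXXXX is a placeholder id):
// loadAnalyticsAfterConsent(document, userClickedAccept,
//   'https://www.googletagmanager.com/gtag/js?id=G-XXXXXXX');
```

Whether this alone is enough for compliance is a legal question that the sketch does not answer.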

When searching the internet for analytics without cookies, many websites advise using Matomo. It is a very good solution built with PHP and MySQL. But for my little blog, setting up this server seemed a little too much, also because I would have to keep it up to date and take some more security measures. For a real production application, Google Analytics or Matomo will be a better choice, recording lots of data that you don't know yet you will want in the future.

My Solution to do Analytics without Analytics

I added a little script to my website. Instead of cookies it uses local storage. Local storage cannot be used to track users across other websites, so I think this should comply with the law. Also, nothing stored in it identifies the user.


// analytics
const lastViewTime = parseInt(localStorage.getItem('lastViewTime')) || 0;
const viewCount = parseInt(localStorage.getItem('viewCount')) || 0;
const lastViewPage = localStorage.getItem('lastViewedPage') || '';

localStorage.setItem('lastViewTime', Date.now())
localStorage.setItem('viewCount', viewCount+1)
localStorage.setItem('lastViewedPage', document.location.href);

fetch('/api/pageViews', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    page: document.location.href,
    viewCount,
    time: Date.now(),
    lastViewTime: lastViewTime,
    lastViewPage: lastViewPage,
    userLanguage: navigator.language,
    userAgent: navigator.userAgent,
    referrer: document.referrer,
    dayTime: new Date().getHours(), // `req` only exists on the server, so use the local hour here
  })
})
  .then(r => r.json())
  .then(data => console.log('pageViewResult:', data));


On the server I just dump this information into a JSONL file, meaning one JSON log entry per line. It can easily be converted to CSV for analysis in Excel, to draw some charts or count views per week or month.

const fs = require('fs');
const router = require('express').Router();
module.exports.pageViewRouter = router;

// example path; the app also needs express.json() so that req.body gets parsed
const fileName = './pageViews.jsonl';
const file = fs.createWriteStream(fileName, {
  flags: 'a' // 'a' means appending (old data will be preserved)
});

router.post('/api/pageViews', async (req, res) => {
  res.json(true);
  file.write(JSON.stringify({
    page: req.body.page,
    time: Date.now(),
    userLanguage: (req.body.userLanguage+'').substr(0,500),
    userAgent: (req.body.userAgent+'').substr(0,500),
    viewCount: parseInt(req.body.viewCount),
    lastViewTime: parseInt(req.body.lastViewTime+''),
    lastViewPage: req.body.lastViewPage,
    referrer: req.body.referrer,
    dayTime: new Date().getHours()
  })+'\n', (err)=>{
    if(err) console.log(err)
  });
});
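
The conversion from JSONL to CSV mentioned above could look roughly like this. It is only a sketch: the function name `jsonlToCsv` and the file names in the usage comment are my own placeholders, and it assumes that every log entry has the same fields as the first one.

```javascript
// Sketch: convert a JSONL log (one JSON object per line) into CSV text.
function jsonlToCsv(jsonlText) {
  const rows = jsonlText
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
  if (rows.length === 0) return '';

  // Assumption: all entries share the fields of the first entry.
  const columns = Object.keys(rows[0]);
  // Quote every value and double embedded quotes, so commas in a user agent survive.
  const escape = (value) => '"' + String(value ?? '').replace(/"/g, '""') + '"';

  const lines = [columns.join(',')];
  for (const row of rows) {
    lines.push(columns.map((name) => escape(row[name])).join(','));
  }
  return lines.join('\n');
}

// Usage in Node.js (file names are placeholders):
// const fs = require('fs');
// fs.writeFileSync('pageViews.csv', jsonlToCsv(fs.readFileSync('pageViews.jsonl', 'utf8')));
```

The resulting file can be opened in Excel, where charts and weekly or monthly counts are a pivot table away.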

Do you see that I do not check whether the browser supports the fetch API and modern arrow functions? I thought about it and decided that I don't need to care about old browser compatibility for this optional feature.

You see all the fields that are stored. These are the ones I came up with and think are interesting. To be honest, the API shown is not exactly the one running at tnickel.de, but the concept is the same. In my running implementation I validate the received data, store URLs and user agent strings in a separate JSON file database, and write only their ids into the log file. But with this example you can understand how to implement the server side yourself.
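To give an idea of the validation and the id lookup, here is a rough sketch. It is not the code running on tnickel.de: the names `idFor`, `toSafeInt` and `toLogEntry` are made up for this example, and a `Map` in memory stands in for the separate JSON file database.

```javascript
// Sketch: validate an incoming page view and replace long strings with small ids.
// The Map stands in for a separate JSON file database (assumption).
const knownStrings = new Map();

// Return the id of a string, assigning the next free id on first sight.
function idFor(value) {
  if (!knownStrings.has(value)) knownStrings.set(value, knownStrings.size + 1);
  return knownStrings.get(value);
}

// Parse a non-negative integer, falling back to 0 for anything else.
function toSafeInt(value) {
  const n = parseInt(value, 10);
  return Number.isFinite(n) && n >= 0 ? n : 0;
}

// Build the log entry, or return null when the request body is malformed.
function toLogEntry(body) {
  if (!body || typeof body.page !== 'string') return null;
  return {
    page: idFor(body.page.slice(0, 500)),
    userAgent: idFor(String(body.userAgent || '').slice(0, 500)),
    viewCount: toSafeInt(body.viewCount),
    lastViewTime: toSafeInt(body.lastViewTime)
  };
}
```

In the route handler, the result of `toLogEntry(req.body)` would be written to the log file only when it is not `null`.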

How others do it

By chance, the dev.to community was just asked about analytics tools, and I described my little solution. The comment received a reply from Charanjit Chana, saying he uses a similar solution. Here is what I found in his website's source code (it was minified, so I formatted it a little):

function allowedToTrack() {
  return !(window.doNotTrack || navigator.doNotTrack || navigator.msDoNotTrack || window.external && "msTrackingProtectionEnabled" in window.external) || "1" != window.doNotTrack && "yes" != navigator.doNotTrack && "1" != navigator.doNotTrack && "1" != navigator.msDoNotTrack && !window.external.msTrackingProtectionEnabled()
}
if (allowedToTrack()) {
  let o = Math.floor(8999999 * Math.random()) + 1e6;
  let n = window.innerHeight + "x" + window.innerWidth; 
  // this request then set the cookie. 
  fetch("https://123.charanj.it/xyz/api/" + o + "/false/" + n);
}

if (void 0 !== console) {
  console.log("%c👋 Hey!", "font-size: 16px; font-weight: 600");
  console.log("%cIf you can see this I would love to hear from you.", "font-size: 16px;");
  console.log("%cYou can find me at https://twitter.com/cchana.", "font-size: 16px;");
  console.log("%cUse the hashtag #cchanaconsole", "font-size: 16px;");
  console.log("%c🤙 🖖", "font-size: 16px;");
}

It seems that, as head of development, he is interested in finding new developer talent for his team. I like the allowedToTrack function that is called before the analytics request is made. This request then sets a cookie, so multiple page views can be related to the same user and session. I don't know about the rules in England after it left the EU, but I believe that in Germany an additional popup banner would be needed. Unlike me, Charanjit is interested in the user's screen resolution, to know what to optimize the page for.
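The idea behind such a check can be written in a shorter form. This is my own simplified sketch, not Charanjit's code: the name `isTrackingAllowed` is hypothetical, the Internet Explorer specific `msTrackingProtectionEnabled` check is left out, and `navigator` and `window` are passed in as parameters so the function can be tested outside a browser.

```javascript
// Sketch: a simplified "Do Not Track" check.
// `nav` and `win` are normally the browser's `navigator` and `window`.
function isTrackingAllowed(nav, win) {
  const signal = nav.doNotTrack || win.doNotTrack || nav.msDoNotTrack;
  // Browsers have reported an opt-out as "1" or "yes"; anything else counts as allowed.
  return signal !== '1' && signal !== 'yes';
}

// In the browser:
// if (isTrackingAllowed(navigator, window)) { /* send the page view */ }
```

Because the check runs on every page load, nothing has to be stored to remember the visitor's choice.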

How do you do analytics on your website?

You have now seen two valid approaches to building the client side for collecting analytics information. With this article, I hope you have seen how this website does analytics without tracking its users all over the internet and even into their darkest dreams.

Update January

In a number of comments, people pointed out that under the law, storing identification data in local storage is similar to storing it directly in a cookie.

I thought this would be OK, because you cannot be tracked with it across other websites. And anyway, I did not store personal identifiers. Or did I?

I think at this point you really have to believe the website operator is trying to trick you. And if they really wanted to, it would be easier to simply show a cookie banner and get consent.

But let us pretend I wanted to track your personal journey on my (or your) website. The recorded information contains the viewCount, the view times, and the current and last URL. Taken together, this information can plot a journey, but it is not connected to a person. However, I or any other website operator with such a solution could connect journeys with user information by providing a feature or content on the page that requires authentication. At the moment of authentication it would be possible to connect that user with their existing journey. And that is not good.

Here are some ideas that make it more difficult to connect a journey to a user, but still maintain good insights into users in general.

  1. Round the timestamps to a full minute or several minutes.
  2. Do the same with the viewCount. I came up with the following function. It still allows you to know whether there are regular users or just random, spontaneous visitors.
function normalizeViewCount(count){
  const sqrt = parseInt(Math.sqrt(count).toString())
  return sqrt * sqrt;
}
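
To see what both measures do to the data, here is a small sketch. The helper `roundTimestamp` and its `minutes` parameter are my own illustration, and the view count function is repeated (with `Math.floor`, which gives the same result) so that the block runs on its own.

```javascript
// Sketch: round a timestamp down to the start of a bucket of `minutes` minutes.
function roundTimestamp(timestampMs, minutes) {
  const bucketMs = minutes * 60 * 1000;
  return timestampMs - (timestampMs % bucketMs);
}

// Round a view count down to the nearest square number (1, 4, 9, 16, ...).
function normalizeViewCount(count) {
  const sqrt = Math.floor(Math.sqrt(count));
  return sqrt * sqrt;
}

// 1609459265000 is 2021-01-01 00:01:05 UTC
console.log(roundTimestamp(1609459265000, 1)); // 1609459260000 (00:01:00)
console.log(roundTimestamp(1609459265000, 5)); // 1609459200000 (00:00:00)
console.log([1, 2, 5, 10, 17, 30].map(normalizeViewCount)); // [ 1, 1, 4, 9, 16, 25 ]
```

The buckets get wider as the count grows, so a first-time visitor is still told apart from a regular one, while two regular visitors are harder to tell apart from each other.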

So here is the version that I currently use for my website:


const lastViewTime = parseInt(localStorage.getItem('lastViewTime')) || 0;
const viewCount = parseInt(localStorage.getItem('viewCount')) || 0;
const lastViewPage = localStorage.getItem('lastViewedPage') || '';

const now = Date.now();
const visitTime = now - (now % 60000); // normalize the time

localStorage.setItem('lastViewTime', visitTime)
localStorage.setItem('viewCount', viewCount + 1)
localStorage.setItem('lastViewedPage', document.location.href);

function normalizeViewCount(count){
  const sqrt = parseInt(Math.sqrt(count).toString())
  return sqrt * sqrt;
}

fetch('/api/pageViews', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    page: document.location.href,
    viewCount: normalizeViewCount(viewCount),
    time: visitTime,
    lastViewTime: lastViewTime,
    lastViewPage: lastViewPage,
    userLanguage: navigator.language,
    userAgent: navigator.userAgent,
    referrer: document.referrer,
    dayTime: new Date(visitTime).getHours()
  })
}).then(function (r) {
  return r.json();
}).then(function (data) {
  console.log('pageViewResult:', data)
});

With these changes, the privacy of my and your users is greatly improved. However, I can't really give legal advice here, and I don't know for certain whether these measures are enough. Maybe it is just easier to show the users a cookie notice and shamelessly track them into their most private dreams.

Discussion (28)

Ari Kalfus

Cloudflare recently released privacy-preserving analytics - essentially sacrificing a bit of accuracy by not individually identifying users, tracking them with cookies or local storage, etc. May be worth checking out

blog.cloudflare.com/privacy-first-...

Tobias Nickel Author

Looks like a good option. I was also interested in measuring both myself and with Analytics in parallel, and then comparing whether the numbers match up.

The Cloudflare offer says it is free for everyone. Maybe I'll give it a shot. Thanks!

Prashanth Rajaram (He/Him)

This looks like a great privacy focused option, thanks for the recommendation!

Charanjit Chana • Edited

Small correction: I'm not setting any cookies here. doNotTrack is a browser feature, and the check is run on each page load. Otherwise I would have had to store something somewhere to identify the choice that was made, which is exactly what I wanted to avoid.

I’m also not sure what is happening with cookie notices from 1st January, but I’m not at all interested in being a data controller when it comes to personal data so I’m happy to keep it as simple as possible!

Really appreciate the mention and I hope someone finds it useful!

Mike Nikles

I developed my own solution, your-analytics.org.

It's open source and can be self-hosted. It's work in progress and a fun side project, hoping others find it useful too.

Félix Paradis

Looks neat!

Mike Nikles

Thanks Félix. It's a rough first version to test the infrastructure 🙂. The project will be my focus again starting next week. Lots to be added in terms of features and metrics that get captured.

Prashanth Rajaram (He/Him) • Edited

Cool article! Great breakdown of how to move away from GA to be more privacy focused.

Based on my research and discussions in this thread, here's a list I've coined for analytics tools in no particular order:

Tobias Nickel Author

Thanks, yes, there are some very good options to choose from.

I am sure, self coding is not for everyone, but it is a great way to get some insights, not only to your visitors but also into the topic in general.

Pavel Svitek

Did you see Countly? count.ly/

Stefano Maffulli

I noticed just now that count.ly is blocked by my default installation of Pi-hole. I wonder why that is considered a bad domain

Tobias Nickel Author • Edited

No, I didn't. But it looks very good, and I can also install it myself.

When I see all the alternatives in these comments, I start thinking about installing them all at once on my blog.

Orlando Brown

At the moment it seems the focus is on the technical solutions. But let's think about something else. What is it that's really important to know?

For example it was mentioned that "screen size" was required to build mobile first optimisations. This is a great example of using data to make your site more accessible for a specific cohort of viewers.

Taking this a step further, consider what else can be used to give you insights. Simple stuff like location, gender, devices, where viewers drop off (heatmaps?) is all good but what do you really want to know that's going to aid in giving the best experience to your viewers/users.

Approaching it this way round should drive your technical solution or your decision on how best to go about analytics.

Félix Paradis

I started using piratepx.com
It's a dead simple tracking pixel. It's 100% free and open source; a hobby project from a solo dev.
It's nowhere near as good as any other analytics solution, but it gives the ballpark estimate I'm after with 1 extra tiny network request and it's definitely legal everywhere.

inb4, but I think in the eyes of the law, Local Storage === Cookies

Michael

When I looked into Google Analytics and GDPR before, I found that even though cookies are stored, the info is not personally identifying and therefore does not need consent.

That may have been inaccurate or things might have changed. I just found this guide which says in Sept 2020, Google launched a consent mode so you can integrate with CookieBot (or maybe OneTrust which we use at work). And if the user doesn't give consent, you still get user agent and some other fields covered in the article.

The article also talks about user ID and anonymous IP (the latter it says is necessary for compliance). So it seems like Google Analytics tracking is only compliant if you configure it a certain way.

cookiebot.com/en/google-analytics-...

BTW I've come across Adobe Analytics but haven't used it. And I heard of Matomo before in a forum, but it looks like it is self-hosted, so that's a barrier for setting up unless you have the time and know-how.

Josh Cheek

My first thought was to add a middleware to the server that logs this info. I guess there's some risk that it's overly granular and doesn't actually represent page loads. Probably also wouldn't work with single page apps that don't need to make requests across pages. But my app doesn't have sophisticated enough user interactions to need much JS, so I'm pretty sure I could just log all requests and get roughly the same thing without needing to make any additional HTTP requests.

Giorgos Sarigiannidis

According to that popular WordPress plugin, you can use Google Analytics without a cookie warning, as long as you configure it in a certain way. The instructions are for Google Analytics, so despite the fact that it is a WordPress plugin, the platform seems irrelevant.

Johannes Kettmann • Edited

I switched to plausible.io and am very happy with it. Simple to set up, cheap, and provides everything I need

Michael Lucht

I am no lawyer, but I am pretty sure that local storage, indexeddb, websql etc. are legally treated just like cookies, so you need to ask for consent on your page.

Tobias Nickel Author

Right, me neither. And this article is likely not the end of the story.

I guess that with the records I create, it is possible to show single journeys. If in the future I add a feature that requires authentication, in theory the user could be connected with a journey.

We can think of technical solutions to make that impossible, but you are right, the legal side is more difficult. Maybe an info banner is needed anyway.

Mundo

My hosting has Matomo as a one-click install. I can install it on every domain I have individually, so that is the way for me.

lamka02sk

Elastic APM is also a good solution

Levi Rizki Saputra

Why don't you use Matomo?

Tobias Nickel Author

My server runs Node.js, so right now there is no PHP and MySQL. I was just too lazy to set this up and did not want to maintain the updates.

When I was a student, I had a project using Piwik (Matomo's former name). It already worked very well. As the development continued over all the years, it can only have gotten better.

Levi Rizki Saputra

Where do you host your server?

Tobias Nickel Author

I use a v-server from 1and1 in Germany.

I like the unlimited traffic on a 100 Mbit connection. The page might get slow (it never got slow so far in 10 years), but the cost is predictable.

However, it seems impossible to use the free Let's Encrypt HTTPS certificates.

I can choose between different operating systems (Debian, Ubuntu, Fedora, ...), but the software install repos are directed to an internal mirror that contains quite outdated software versions. Node.js I can install manually, but Docker I can't install on that server.