How to get basic page view statistics?
Second article in the series Analytics with Vanilla JS. Motivation here.
Today we'll look into the implementation of a vanilla JS analytics tool that analyzes page views.
For the sake of example we need some simple HTML code for our tracker (file example_page.html). You can add anything you want to the HTML file:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<script src="js/page_view_tracker.js"></script>
</head>
<body>
<a href="https://www.google.com" class="external">Leave page by going to Google</a>
</body>
</html>
The rest of the code will be in page_view_tracker.js. First, let's define the function that will allow us to POST all the gathered data as a string to a specific URL:
function post_data(data, url) {
    let xhr = new XMLHttpRequest();
    // third argument `true` makes the request asynchronous
    xhr.open("POST", url, true);
    xhr.setRequestHeader("Content-Type", "application/json");
    xhr.onreadystatechange = function () {
        // readyState 4: request finished; status 200: server answered OK
        if (xhr.readyState === 4 && xhr.status === 200) {
            console.log(xhr.responseText);
        }
    };
    xhr.send(data);
}
The data string is in JSON format. The server you'll be sending data to can be whatever you prefer: node.js, Django, Flask, ... There's even an option to post into Google Docs spreadsheets if you want to avoid the backend.
Data is posted with the following command:
post_data(JSON.stringify(data), "http://0.0.0.0:5000/analytics");
where we defined the data object as:
const data = {
"current_page_name": current_page_name
};
Now let's add the rest of the data.
Tracking
Number of views per page: this one is easy. Every time a user visits our website, the post_data function will be triggered, so we need to add current_page_name to our data object. It's defined with:
let current_page_name = window.location.href;
In principle, we could get the URL of the current page from the request on the backend, but I prefer to have all the data in the JSON object.
User origin: We want to know which website the user came from. This information is important because it allows us to track the sources of our website traffic. Are we getting:
- direct traffic (users entering the URL into the browser),
- traffic via referrals (links to our site), or
- traffic via organic search (the user finds us via a search engine like Google, Bing, Baidu ...).
In all browsers except Internet Explorer, the following will give us the source from which the user came:
let page_source = document.referrer;
If traffic is direct or the user is on Internet Explorer, page_source will be empty, so we set:
if (page_source === "") {
    // could be direct traffic or Internet Explorer
    page_source = "empty";
}
Now we can detect what web browser the user has with something like this, but that doesn't help us to determine the source from which the user came. If you know a workaround, please let me know how to get user origin in IE.
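The three traffic types listed above could be told apart with a small helper like the one below. This is only a rough sketch: classify_source, own_host and the list of search engines are names and values I made up for illustration, and the list is far from complete.

```javascript
// Hypothetical helper, not part of the original tracker.
// Assumption: a short, incomplete list of search engine names.
const SEARCH_ENGINE_NAMES = ["google", "bing", "baidu", "duckduckgo"];

function classify_source(page_source, own_host) {
    // Empty referrer: direct traffic, or a browser that did not send one
    if (page_source === "" || page_source === "empty") {
        return "direct_or_unknown";
    }
    let host;
    try {
        host = new URL(page_source).hostname;
    } catch (e) {
        // The referrer was not a parsable URL
        return "direct_or_unknown";
    }
    if (host === own_host) {
        // The user navigated between our own pages
        return "internal";
    }
    // Compare each label of the host name, e.g. ["www", "google", "com"]
    const labels = host.split(".");
    const is_search = SEARCH_ENGINE_NAMES.some(function (name) {
        return labels.includes(name);
    });
    return is_search ? "organic_search" : "referral";
}
```

In the browser this would be called as classify_source(page_source, window.location.hostname), and the result could be added to the data object under a key of your choice.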
Device screen: We want to know what devices the majority of our users are using. We get device screen size via:
let screen_width = window.screen.width;
let screen_height = window.screen.height;
and screen size that we can draw on with:
let screen_available_width = window.screen.availWidth;
let screen_available_height = window.screen.availHeight;
Browser type, language, time zone: To get the browser type we do:
let browser_type = navigator.userAgent;
the language:
let language = navigator.language;
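and the time zone (despite the variable name, this returns an IANA time zone name such as Europe/Berlin, not a numeric offset):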
let time_zone_offset = Intl.DateTimeFormat().resolvedOptions().timeZone;
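To keep things tidy, the screen and browser values above could be gathered in one place before they are added to the data object. The sketch below assumes a helper called collect_client_info that receives a window-like object; both the function name and the win parameter are made up for this example.

```javascript
// Hypothetical helper: `win` is expected to look like the browser's `window`.
function collect_client_info(win) {
    return {
        "screen_width": win.screen.width,
        "screen_height": win.screen.height,
        "screen_available_width": win.screen.availWidth,
        "screen_available_height": win.screen.availHeight,
        "browser_type": win.navigator.userAgent,
        "language": win.navigator.language,
        // Despite the key name, this is an IANA zone name, not an offset
        "time_zone_offset": Intl.DateTimeFormat().resolvedOptions().timeZone
    };
}
```

In the tracker you would call collect_client_info(window) and copy the result into data, for example with Object.assign(data, collect_client_info(window)). Passing the window in as an argument also makes it possible to try the function outside of a browser.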
Tracking parameters: You can enhance your analytics if you publish URLs with added parameters. For example, you can use Urchin Tracking Module (UTM) parameters, a format used by Google to track your unique URLs:
http://www.example.com/?utm_source=JohnDoe&utm_medium=mail
By adding parameters to the links you share, you can segment the traffic much better during the analysis process. For example: what was published by you, what was shared by others, which social media source the visit came from, ...
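Reading those parameters on the client could look roughly like this. It is a minimal sketch that relies on the standard URL and URLSearchParams objects; the function name get_utm_params is my own and not part of the tracker above.

```javascript
// Hypothetical helper: returns all utm_* query parameters of a URL as an object.
function get_utm_params(page_url) {
    const params = new URL(page_url).searchParams;
    const utm = {};
    // URLSearchParams.forEach passes the value first, then the key
    params.forEach(function (value, key) {
        if (key.startsWith("utm_")) {
            utm[key] = value;
        }
    });
    return utm;
}
```

For the example URL above, get_utm_params returns an object with utm_source set to "JohnDoe" and utm_medium set to "mail". In the tracker you could call it with current_page_name and store the result in the data object.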
Page performance: We want to know how long it takes for our web page to load. For that, we need to understand a bit about web browser events:
- 1.) First, the browser sends the request to the server to get page files.
- 2.) Page files are sent to our device.
- 3.) Then the browser needs to render the web page.
- 4.) Once the web page is rendered, the onload/load event is triggered.
- 5.) The user views the page.
- 6.) The onunload/unload event happens when the user closes the web page.
The page loading and rendering should happen in a matter of milliseconds. If it doesn't, our user either has a really crappy internet connection, or we are sending over too many files. Either way, it's good to track that. According to the Mozilla docs, we can obtain the data about page loading from:
let performance_data = window.performance.timing;
Then get:
let page_load_time = performance_data.loadEventEnd - performance_data.navigationStart;
let request_response_time = performance_data.responseEnd - performance_data.requestStart;
let render_time = performance_data.domComplete - performance_data.domLoading;
We need to trigger the page performance monitoring code after the page is loaded. Full code snippet for page performance:
window.addEventListener("load", function () {
    let performance_data = window.performance.timing;
    // calculate request response time: network latency
    let request_response_time = ...
    // calculate page render time
    let render_time = ...
    // page load time: wait until load event is finished with setTimeout
    setTimeout(function () {
        let page_load_time = ...
        // Post data to the server
        ...
    }, 0);
});
setTimeout is needed because loadEventEnd is only set once the load event has finished, so we have to wait for that before we can measure the page load time.
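Put together, the three calculations and the load listener could be written as in the sketch below. The helper name compute_timings is made up for this example, and the console.log is only a stand-in for posting the data. Also note that current Mozilla docs mark window.performance.timing as deprecated, so treat this as an illustration of the idea rather than a final solution.

```javascript
// Hypothetical helper: `timing` is expected to look like window.performance.timing.
function compute_timings(timing) {
    return {
        // from the start of navigation until the load event has finished
        "page_load_time": timing.loadEventEnd - timing.navigationStart,
        // from sending the request until the last byte of the response
        "request_response_time": timing.responseEnd - timing.requestStart,
        // from the start of DOM parsing until the document is complete
        "render_time": timing.domComplete - timing.domLoading
    };
}

// Browser-only wiring; skipped where there is no window (for example in Node.js)
if (typeof window !== "undefined" && window.performance && window.performance.timing) {
    window.addEventListener("load", function () {
        // Defer the measurement so that loadEventEnd has been filled in
        setTimeout(function () {
            const timings = compute_timings(window.performance.timing);
            console.log(timings); // stand-in: add to data and post it instead
        }, 0);
    });
}
```

Keeping the arithmetic in its own function means it can be checked with a plain object of made-up timestamps, without loading a page.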
Stay tuned
The full code can be found on my blog at page views. There you'll find the HTML, JS, and Python files you need to run the whole thing.
If you have any ideas about what else we could track, or how, let me know in the comment section below.
I'm not a very proficient JavaScript developer, so there is probably a better way to do some of the parts. Any comments and solutions are welcome. Stay tuned for more. Next week we'll look into page view duration tracking. Why an entire article for that? Well, there are a few edge cases with web page closing that can complicate things.
Top comments (18)
That's an interesting exercise, looking forward to the next posts.
I think what's essential is not amount of data, but just the necessary and right datapoints that answer the right questions. I feel Google Analytics is bloated, difficult to use, it's like "Microsoft Office" of web analytics. We only need a text editor with markdown.
I'd focus more on which data and how to present, instead of too much on collecting everything possible...
Hey 👋 couldn’t agree more on the analogy you made. Might use it in the future 😉
Speaking of simplicity. What would you focus on collecting and visualizing? I have a few things in mind but I’m open to suggestions where to take the above series 🙂 Let me know.
In my (relatively short) experience:
It doesn't hurt to have aggregate metrics like time series of sessions, pageviews, geographical distribution of users, etc. I think what's challenging is making it easier (in the UI) to draw answers to simple questions:
But the bottom line is: almost always we want people to take certain "actions" - subscribe to newsletter, buy stuff, start a Trial account, etc.
What I think people really need is to connect sources and efforts to actions.
An "effort" is: "I publish an article about a service I think is valuable to other devs". Or "I post on social media about a new feature or a case study of this service".
A source could be DEVto, in case of article, or Twitter in case of the post.
Even if the person comes in today, but only takes the "action" 3 weeks later, I would like to know which "efforts" contributed. There are marketing automation platforms that would give that, but (again) they are so bloated that you need "marketing specialists" to configure them correctly. Not to mention how expensive they are. It's ridiculous, it should be really simple and cheap for individuals, SMEs and startups.
Hey thanks for this.
As I mentioned in the first article of the series, Motivation, I'm going towards analytics that can be integrated directly into the product development cycle. What you described definitely fits the bill. Really good point, if you don't mind I'll use it :)
On marketing <=> developer relationships I couldn't agree more. Software is useless if nobody's using it, and developers are too often way too far from direct customer interaction.
If you ask me, software development is easy. Selling the product is hard :D
Sure, feel free to use the ideas! Glad I could add something positive to the discussion! Please, keep us posted as you move forward 😉
I couldn't agree more: software is not an obstacle, being able to sell profitably and grow healthy is really hard!
Great. I want to bounce another thought.
Tracking sources of traffic is easy. Proper URL parameters and that's it. But connecting efforts to actions over multiple web page user visits, that's a bit harder.
I've come up with a solution that uses localStorage, but local storage can be disabled. Sure, the number of users with disabled localStorage will be small, but still.
Is there maybe a more elegant way to connect multiple separate web page visits to a single user? I'm probably missing something due to my basic knowledge of JS.
I would suggest generating a unique hash for each visitor. Maybe an MD5 hash of multiple values concatenated:
Store this hash as a cookie in the browser. When the person comes back later, your script can identify it's the same person.
Store the user hash with every interaction (even pageviews). Later, when you're analyzing an action (e.g. "user subscribed to the service"), you can get the user-hash and search every other interaction associated with it in the past.
About tracking the source of traffic, it's not going to be easy anymore. All major browsers are adopting strict "no-referrer" policies now. What this means is:
They're making this switch for privacy and security reasons. It's a good thing, but it will make it a lot more difficult to track sources with precision. Unless we use UTM parameters in the URL, there's no way to know from which page the visitor came.
Hash, that's clever 🙂 I was thinking about just some unique user id. Thanks.
————
On URL. Privacy reasons I get and I know that UTMs are more reliable but what are the security issues of referrals?
It's possible for the origin to add user sensitive data in the URL. It's bad practice, but I'd say there's 99.99999% chance that at least a handful of sites is doing that right now.
In that event, the target site can access user data without its consent from the referrer header.
Alright yeah that’s problematic. Never thought of that 🤷♂️
Me neither!.. 😊
I was just pissed at Google Analytics recently and thinking about building something better and started researching, your article caught my attention 😄
Haha nice :D
My disagreement with Google Analytics got me on this path too, yeah. What encouraged me to start building my own analytics were those guys: usefathom.com
Before that I never thought that one could build reliable custom analytics. But apparently two developers did.
As an improvement, have a look at the Fetch API and Beacon API 🙂
Thanks for pointing that out.
I didn't know about the Fetch and Beacon APIs. Though after a quick Google search I noticed that they are not supported in IE. At least according to mozilla.org.
Any thoughts how to perform asynchronous requests in IE?
It would be a performance improvement indeed.
I would recommend not supporting IE anymore, unless absolutely needed. You can always fall back to XHR if fetch is not available (polyfill).
IE is still supported and developed, right? What am I missing here?
I checked my personal page and yes percentage of IE users is low but still. Fall back to XHR sounds like a good option.
How would you do the fall back? With try and catch? Or would you try to detect if user is on IE? If yes how? As far as I know there's no consistent way to check browser version.
As I mentioned in the article I'm a bit light on JS since I come from data science. So any tips are welcome :)
IE 11 is still supported by Microsoft, but not for long. Adding a development focus on IE in 2019 is a waste of effort, unless it's absolutely necessary. You can do it in a progressive way, where a site is usable on IE, but won't support all the features.
Falling back to XHR can be as simple as checking if (!window.fetch).
There used to be some specific HTML comments for detecting if the browser is IE (not supported in IE 11). But feature detection is a better alternative than browser detection.
Alright, didn't know that. Feature detection it is, then.
Thanks again.