While looking at WebPageTest waterfalls, I often come across surprising attributes. A few weeks back, I came across a page that had images with a familiar file naming convention:
https://twitter.com/dougsillars/status/1086196363010949121
This website is using the Mac screenshot tool to create images that they are using online!
Image Optimization
I’ve written a lot about image optimization, on GIFs, Base64 encoding and more (read more at dougsillars.com). My gut feeling was that websites that use screenshots are not optimizing the content for the web. So I decided to find out.
Are Screen Shot Images On The Web Optimized?
The HTTP Archive dataset for January 2019 has data on the load characteristics of ~4M mobile websites. Searching the filenames of all requests made for those sites, I discovered that 36k (~1%) of pages have an image with the name “screen shot” (112k discrete files). To determine if they were optimized or not, I selected 1500 images with the term screen shot in the filename. To keep the data timely, I also only used images with ‘2019’ in the filename.
To create optimized versions of the images in my dataset, I used Cloudinary’s fetch feature, which allows Cloudinary to optimize images hosted on remote servers. I used the f_auto and q_auto parameters to optimize the format and quality of the images. Cloudinary converted the PNGs to JPEG (the f_auto result when the request comes from curl), and then found the optimal size/quality using SSIM (the q_auto parameter).
Original: https://example.com/screenshot.png
Optimized: https://res.cloudinary.com/demo/image/fetch/f_auto,q_auto/https://example.com/screenshot.png
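To build the optimized URL for each image in a list, the pattern above can be sketched as a small helper. This is a minimal sketch, not the script used for this post: the function name `cloudinary_fetch_url` is hypothetical, and it simply prefixes the remote URL the same way the example does. Remote URLs that contain query strings may need escaping first.

```python
CLOUD_NAME = "demo"  # cloud name taken from the example URL above


def cloudinary_fetch_url(original_url, cloud_name=CLOUD_NAME,
                         transforms=("f_auto", "q_auto")):
    """Prefix a remote image URL with a Cloudinary fetch URL.

    Hypothetical helper: it only concatenates strings, mirroring the
    Original/Optimized pair shown in the post.
    """
    transform_part = ",".join(transforms)
    return (
        f"https://res.cloudinary.com/{cloud_name}/image/fetch/"
        f"{transform_part}/{original_url}"
    )


print(cloudinary_fetch_url("https://example.com/screenshot.png"))
```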
To get the file size of each of these images, I used curl to create a csv with the image url and the size downloaded:
xargs -n 1 curl --write-out '%{url_effective},%{size_download}\n' --silent --output /dev/null < 1500screenshots.txt > results.txt
Of 1289 results, 11% were within 10% of the original size, so we will call those already optimized. However, 86% of the images could be reduced by at least 50%, and 73% could be made at least 75% smaller than the original. This confirms my hypothesis that files with the term “screen shot” in the file name are generally not optimized for web delivery.
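The percentages above come from comparing each original size with its optimized size. Here is a minimal sketch of that arithmetic, assuming the sizes have already been paired up; the function names and the sample numbers are hypothetical, not the data from this post. One detail worth noting: the Cloudinary URL itself contains a comma (`f_auto,q_auto`), so each `url,size` line is split on the last comma only.

```python
def parse_result_line(line):
    """Split one 'url,size' line from the curl output on its last comma."""
    url, size = line.strip().rsplit(",", 1)
    return url, int(size)


def reduction_summary(pairs):
    """Summarize savings for (original_bytes, optimized_bytes) pairs.

    Returns the share of images in each bucket as a fraction of all
    usable pairs (pairs with a zero-byte original are skipped).
    """
    reductions = [1 - opt / orig for orig, opt in pairs if orig > 0]
    if not reductions:
        raise ValueError("no usable size pairs")
    total = len(reductions)

    def share(predicate):
        return sum(1 for r in reductions if predicate(r)) / total

    return {
        "within_10pct": share(lambda r: abs(r) <= 0.10),
        "reduced_50pct_or_more": share(lambda r: r >= 0.50),
        "reduced_75pct_or_more": share(lambda r: r >= 0.75),
    }


# Made-up sizes in bytes, for illustration only.
sample = [(1000, 950), (1000, 400), (1000, 200), (1000, 100)]
print(reduction_summary(sample))
```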
Aside: All the images in this post are screen shots.
But with ImageOptim installed on my Mac, I can just “right click” — choose ImageOptimize before uploading to WordPress. One extra click — and all the images are optimized.
Fun with Regex: Dates and Times in Screenshots
So, there are a lot of unoptimized images out there — this is not a huge surprise. But we have some cool extra data here — the exact moment the image was captured. So — when are these screen captures made? Using some fancy regular expressions, we can run the numbers:
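As a rough idea of what such a regular expression can look like, here is a minimal sketch. It assumes the common Mac naming pattern `Screen Shot 2019-01-05 at 3.45.12 PM.png`, with spaces possibly URL-encoded or replaced by `_`, `+` or `-`; the pattern and the name `parse_screenshot_name` are my assumptions for illustration, not the exact expressions used for the charts.

```python
import re
from urllib.parse import unquote

# Assumed filename shape: "Screen Shot YYYY-MM-DD at H.MM.SS AM|PM"
SCREENSHOT_RE = re.compile(
    r"screen[ _+-]?shot[ _+-]+"
    r"(\d{4})-(\d{2})-(\d{2})"           # date
    r"[ _+-]+at[ _+-]+"
    r"(\d{1,2})\.(\d{2})\.(\d{2})"       # time, dot-separated
    r"(?:[ _+-]*([AP]M))?",              # optional AM/PM suffix
    re.IGNORECASE,
)


def parse_screenshot_name(url):
    """Return the date and time parts found in a screenshot URL, or None."""
    match = SCREENSHOT_RE.search(unquote(url))
    if match is None:
        return None
    year, month, day, hour, minute, second, suffix = match.groups()
    return {
        "year": int(year), "month": int(month), "day": int(day),
        "hour": int(hour), "minute": int(minute), "second": int(second),
        "suffix": suffix.upper() if suffix else None,
    }


print(parse_screenshot_name(
    "https://example.com/img/Screen%20Shot%202019-01-05%20at%203.45.12%20PM.png"
))
```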
Hours that screenshots are taken:
Most screenshots are taken between 9AM and 6PM.
Note: While it looks like there is a huge drop during lunch and a big spike at midnight, I think this is due to my data processing: I added 12 to any time with the “PM” suffix, so 12:30 PM became 24:30 (maths can be hard).
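For reference, a conversion that avoids this problem treats 12 as a special case: 12 AM is hour 0 and 12 PM stays hour 12. The sketch below is not the processing used for the chart, and the function name `to_24_hour` is hypothetical.

```python
def to_24_hour(hour, suffix):
    """Convert a 12-hour clock hour plus AM/PM suffix to a 0-23 hour.

    If there is no suffix, the hour is assumed to be 24-hour already.
    """
    if suffix is None:
        return hour
    if not 1 <= hour <= 12:
        raise ValueError(f"not a 12-hour clock value: {hour}")
    suffix = suffix.upper()
    if suffix not in ("AM", "PM"):
        raise ValueError(f"unknown suffix: {suffix}")
    # hour % 12 maps 12 to 0, so 12 AM becomes 0 and 12 PM becomes 12.
    return hour % 12 + (12 if suffix == "PM" else 0)


for hour, suffix in [(12, "AM"), (11, "AM"), (12, "PM"), (1, "PM")]:
    print(hour, suffix, "->", to_24_hour(hour, suffix))
```

With this mapping, a 12:30 PM screenshot lands in the noon bucket instead of showing up as a spike at midnight.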
Day of Week
Mostly during the week, and Friday is slightly lower than M-Th.
What year are screenshots taken?
This actually surprised me a bit: for a dataset taken in early January, already 14% of all screenshots were from 2019, and nearly 70% were from 2018 or 2019. Of course, that means that about 30% of screenshots on the web are from 2017 or earlier.
We can further see how recently the screenshots were taken. Here is the count of screenshots by day for the last 2 years.
The numbers are not terribly important, but we can see that most of the screenshots are recent — and a huge number were taken in January — while the dataset was being collected! 🙂
Conclusion
https://twitter.com/wesbos/status/1090696149323927552
There are many websites that are using screen capture on the Mac as an essential part of their image processing pipeline for regular updates on the web. However, very few of these images have any optimizations performed (86% of images can be reduced in size by at least 50%). Based on this, it is a reasonable assumption that if your website is using Screen Shot images, you have some optimization work to do.
Originally published at dougsillars.com on February 10, 2019.