How much lost data does Server-Side Tracking actually help you recover?

June 21, 2021
Michael Taylor

The release of iOS14 has thrown more fuel on the user privacy fire that started with GDPR, ad blockers and ITP. Some marketers are finding ways to do their job without tracking user data, others have explored probabilistic attribution models like marketing mix modeling, and the rest are looking to server-side tracking to claw back what they lost. 

Everything I’ve seen written on server-side tracking has been theoretical – why it should work rather than proof that it does. I’m not one for wishful thinking, so I decided to test it and share the results. Does Server-Side really fix broken tracking?

At the time I put this test live in October 2020, iOS14 had been released the month prior and I was working as an analytics consultant. This website was getting about 60 pageviews a day from ~30 users, so nothing major but enough to see differences. 

A few weeks later I decided to focus on vexpower.com (a simulator-based course on marketing mix modeling) as a solution to this problem, so… I pretty much forgot about the experiment until today.

It was about time I looked into the results and wrote them up – tracking & attribution are increasingly relevant topics. However, I have limited time, so I haven’t gone deep into this. I thought I’d just get a rough draft out there and see if anyone is interested. If you’re curious about the results and want me to analyze the data further, tweet at me.

Methodology

I wanted to test whether server-side tracking actually helps you recover lost data from users who opted out of tracking. Server-side tracking works by firing an event to my own URL (ssa.saxifrage.xyz) rather than to Google Analytics’ servers. My server then returns a server-set cookie, which Safari and other browsers don’t delete the way they delete normal JavaScript-set cookies (because server-set cookies are typically used for logging in). In theory this also helps you get around ad blockers, because they don’t recognize your own domain as ‘tracking’.
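As a rough illustration of the cookie half of that flow, here is a minimal sketch of what a first-party endpoint might do when a hit arrives: reuse the visitor's ID if the browser sent one, otherwise mint a new one, and send it back in a server-set cookie. This is not the code from the experiment. The cookie name `ssa_cid`, the one-year lifetime and the function name are all assumptions, and forwarding the hit on to Google Analytics is left out.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Tuple

COOKIE_NAME = "ssa_cid"            # hypothetical cookie name
ONE_YEAR = 365 * 24 * 60 * 60      # assumed lifetime, in seconds


def handle_hit(cookie_header: str) -> Tuple[str, str]:
    """Return (client_id, Set-Cookie header value) for one incoming hit."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")

    # Reuse the ID the browser sent back, or mint a fresh one for a new visitor.
    if COOKIE_NAME in jar:
        client_id = jar[COOKIE_NAME].value
    else:
        client_id = str(uuid.uuid4())

    # The cookie is set by the server in the HTTP response, not by JavaScript.
    out = SimpleCookie()
    out[COOKIE_NAME] = client_id
    out[COOKIE_NAME]["max-age"] = ONE_YEAR
    out[COOKIE_NAME]["path"] = "/"
    out[COOKIE_NAME]["secure"] = True
    out[COOKIE_NAME]["httponly"] = True
    out[COOKIE_NAME]["samesite"] = "Lax"
    return client_id, out[COOKIE_NAME].OutputString()


first_id, set_cookie = handle_hit("")                       # first visit
second_id, _ = handle_hit(COOKIE_NAME + "=" + first_id)     # returning visit
print(first_id == second_id)
```

The returned string is what the server would put in its `Set-Cookie` response header. Because the same ID comes back on the next request, the server can pass a stable client ID along with each hit.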

Of course I still wanted to respect my users’ wishes, so I used the ‘anonymize IP’ feature in Google Analytics wherever I tracked them regardless of their cookie-banner choice. The banner is a simple ‘accept’ or ‘decline’ notice that sets a cookie if they accept, which I then read to decide whether to trigger tracking.
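The consent check described here boils down to reading one cookie. A minimal sketch, assuming the banner stores its choice in a cookie called `cookie_consent` with the value `accept` (both names are hypothetical, as are the profile labels):

```python
from http.cookies import SimpleCookie

CONSENT_COOKIE = "cookie_consent"   # hypothetical cookie name
ACCEPT_VALUE = "accept"             # hypothetical value written by the banner


def has_consent(cookie_header: str) -> bool:
    """True only if the banner's 'accept' cookie is present."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    morsel = jar.get(CONSENT_COOKIE)
    return morsel is not None and morsel.value == ACCEPT_VALUE


def tracking_mode(cookie_header: str) -> str:
    """Pick full tracking for consenting users, anonymized tracking otherwise."""
    return "full" if has_consent(cookie_header) else "anonymized"


print(tracking_mode("cookie_consent=accept; theme=dark"))
print(tracking_mode(""))
```

Treating a missing cookie the same as a decline keeps the default on the privacy-friendly side.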

My website is just a simple blog with no really important conversion metrics, and I don’t run any ads, so my use case for analytics is simply to check if people are reading my content, and what content is popular. I actually don’t need or care about user-level data, so I have no business incentive not to anonymize personally identifiable information.

I wanted to see if I could ‘hack’ Google Analytics’ Measurement Protocol so that I wouldn’t even need the GA code on my website (i.e. a cookieless solution), and would only send them pageviews and events. I put this code on my GitHub – it’s open source if you want to use it. There are other privacy-friendly alternatives like Simple Analytics and Plausible, but since 56% of the web uses Google Analytics, finding a way to still use it without invading user privacy would be pretty cool.
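To make the idea concrete, here is a minimal sketch of a Universal Analytics Measurement Protocol pageview hit built by hand. It is not the code from the GitHub repository mentioned above. The parameters (`v`, `tid`, `cid`, `t`, `dp`, `aip`) are standard Measurement Protocol fields, while the function name and the `UA-XXXXX-Y` placeholder property ID are assumptions. The sketch only builds the URL and does not send anything.

```python
from urllib.parse import urlencode

GA_COLLECT_URL = "https://www.google-analytics.com/collect"


def build_pageview_hit(tracking_id: str, page_path: str, client_id: str = "555") -> str:
    """Build a cookieless pageview hit URL for Universal Analytics."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID (placeholder in the example below)
        "cid": client_id,    # client ID; a fixed value means no per-user ID
        "t": "pageview",     # hit type
        "dp": page_path,     # document path
        "aip": "1",          # anonymize IP
    }
    return GA_COLLECT_URL + "?" + urlencode(params)


print(build_pageview_hit("UA-XXXXX-Y", "/blog/server-side-tracking"))
```

The resulting URL can be requested from a server or used as the `src` of a 1x1 image, which is the basis of the pixel profiles in the list below.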

Finally I also wanted to test the new Google Analytics 4 – this is the biggest change to Google Analytics since the release of Universal Analytics, and even though I still haven’t gotten used to the way it looks, I thought it’d be worth seeing if there were measurement differences. There was also an ‘enhanced’ tracking feature that’s on by default, and gathers extra event data automatically, that I wanted to include in the test. 

That left me with 7 different tracking profiles:

  1. My existing Universal Analytics setup, tracking users who accept cookies
  2. Standard Universal Analytics setup, but with anonymize IPs, tracking everyone
  3. Server-Side implementation of Universal Analytics, also with anonymize IPs
  4. Standard Google Analytics 4 setup, with anonymize IPs, tracking everyone
  5. Standard Google Analytics 4 setup, with anonymize IPs, ‘enhanced’ turned off
  6. 1x1 Image Pixel, firing a hit to my tracking server, with anonymize IPs
  7. 1x1 Image Pixel, firing a hit to Google Analytics, with anonymize IPs

All but the last two were implemented with Google Tag Manager on the client side. Even though ad blockers wouldn’t know to block the hits to my tracking server, they could potentially block Google Tag Manager, or recognize Google Analytics in the URL if I sent the traffic there instead of to my custom domain. So I hard-coded the two pixels in my Webflow site code, skipping GTM entirely. I didn’t want to track users who didn’t accept cookies without anonymizing their IPs, because I live in the EU and don’t want to pay a GDPR fine, but maybe someone braver will try that experiment and see how much anonymizing IP addresses affects measurement?

Results

All the data I’m reporting here is from the last 7 days, 14th June - 20th June, 2021 (today is the 21st). I am looking at Pageviews, Sessions and Users, because there should be big differences across the tracking methodologies. If you don’t have uniquely identifiable information on a user to stitch together their behavior and recognize them again, you’ll track more unique users and maybe even more sessions, so for those two metrics a higher number is worse. Pageviews, however, should always be the same.
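A toy simulation shows why Users can be inflated while Pageviews stay put. The hits below are invented for illustration and are not data from the experiment. The model is also a simplification: it assumes that when the identifying cookie does not persist, every visit simply looks like a brand new user.

```python
from collections import namedtuple

# visitor = the real person, visit = which of their visits, page = what they viewed
Hit = namedtuple("Hit", ["visitor", "visit", "page"])

# Hypothetical traffic: visitor "a" comes twice, visitor "b" once.
HITS = [
    Hit("a", 1, "/"),
    Hit("a", 1, "/post"),
    Hit("a", 2, "/post"),
    Hit("b", 1, "/"),
]


def summarize(hits, cookie_persists):
    """Count pageviews, sessions and users under one identity assumption."""
    pageviews = len(hits)
    sessions = len({(h.visitor, h.visit) for h in hits})
    if cookie_persists:
        users = len({h.visitor for h in hits})   # returning visitors are recognized
    else:
        users = sessions                         # every visit gets a fresh ID
    return {"pageviews": pageviews, "sessions": sessions, "users": users}


print(summarize(HITS, cookie_persists=True))
print(summarize(HITS, cookie_persists=False))
```

Both runs report 4 pageviews and 3 sessions, but the second counts 3 users instead of 2, which is the kind of gap to look for between the tracking profiles.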

*Embarrassingly, version G (the seventh profile in the list above, the image pixel firing a hit to Google Analytics) had a bug in the code: it declared a variable I had already declared elsewhere, so it completely failed to track anything (and I failed to notice). Another life lesson in avoiding hard-coding wherever possible... but of all the variations to fail, this is the least interesting one, so I’m cool with it. I’ve just pushed a fix, so I might update this later.

The biggest takeaway here is that by only tracking the actions of users who accept cookies, I’m measuring less than half of the pageviews on my content! The number of times my blog posts are read is my number one concern, and I don’t need to track individual users, so this truly makes the case for more anonymized analytics.

The server-side implementation gave the exact same numbers for sessions and pageviews as the standard Universal Analytics setup, though it did report a more accurate User count (a lower count means it managed to connect the dots better). This means Server-Side analytics (at least for my use case) was not worth the effort. Where this could be useful is if you have logged-in users, and lots of events happening off site or in an app. It would also be useful if you’re spending money on advertising, because you’d be able to see the users that you acquired come back and buy something in another session. However, for the intended use case of ‘recovering data lost by ad blocking’ it doesn’t seem to help.

Surprisingly, sticking with Universal Analytics rather than upgrading to GA4 made a bigger difference to the numbers than implementing Server-Side. When Pageviews are the same but Users and Sessions are higher, that’s a bad thing – it means we failed to stitch together those Sessions or recognize those Users when they came back. We inflated Users by 9 and counted 53 more Sessions with GA4 vs UA. This is probably due to the way GA4 has changed how events and sessions are calculated, but it looks like a step backwards. The ‘enhanced’ setting actually did worse, in that it slightly inflated sessions (you would think it’d be better able to stitch sessions together, by tracking more events automatically).

Finally my pixel experiment was kind of weird. I tracked 54 Users… but there shouldn’t have been any!? I passed 555 (a code for anonymous) in the tracking hit and passed no User ID, so I’m not sure how there could have been any users tracked. I am also unsure as to how it tracked Sessions… is Google passing back a cookie somehow when I ping their API, even through Server-side Tracking? In any event I tracked 30 more Pageviews which was interesting. Potentially these are bot hits(?) or people refreshing the page, but judging by this I think it’s worth adding some sort of ‘log level’ tracking to your app if you care about raw counts of an event.
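For the ‘log level’ idea, a minimal sketch of what raw counting could look like: tally pageviews per path from request records and set aside anything whose user agent looks automated. The record format, the example data and the bot markers are all assumptions, and the substring check is a crude heuristic rather than real bot detection.

```python
from collections import Counter

BOT_MARKERS = ("bot", "crawler", "spider")   # crude, hypothetical heuristic


def count_pageviews(records):
    """Split raw pageview counts per path into likely-human and likely-bot."""
    humans, bots = Counter(), Counter()
    for path, user_agent in records:
        is_bot = any(marker in user_agent.lower() for marker in BOT_MARKERS)
        (bots if is_bot else humans)[path] += 1
    return humans, bots


# Hypothetical request records: (path, user agent)
RECORDS = [
    ("/post", "Mozilla/5.0 (Macintosh) Safari/605.1.15"),
    ("/post", "Mozilla/5.0 (compatible; Googlebot/2.1)"),
    ("/", "Mozilla/5.0 (Windows NT 10.0) Chrome/91.0"),
    ("/post", "Mozilla/5.0 (iPhone) Safari/604.1"),
]

humans, bots = count_pageviews(RECORDS)
print(dict(humans), dict(bots))
```

Keeping the two tallies separate makes it easier to judge whether extra pageviews are real readers or automated traffic.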

Update: 28 days data

One question I immediately got was “can we see more than a week’s worth of data?” That’s because Apple’s Intelligent Tracking Prevention in Safari caps the lifetime of cookies set via JavaScript at 7 days, so the effect would only show up over a longer time period. Here’s the same data but for the last 28 days – 24th May to 20th June 2021.

Looking over a longer 4-week timeframe, we can see that standard UA inflates your User count by about 35%, and your sessions by 6.5%. This is compared to about 25% user inflation, and no inflation of sessions, on 7 days of data – this difference could very well be the ‘ITP-effect’ we were looking for. GA4 inflated users by only 1% over 28 days, compared to 6% on a week’s data, so this is very likely to be just a sampling or counting issue, nothing major to worry about.
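For anyone repeating the comparison, the inflation figures are simple ratios against a baseline. A minimal sketch follows. The counts of 100 and 135 are made up to reproduce a 35% figure and are not the actual numbers from this test, and the function names are hypothetical.

```python
def inflation(measured: float, baseline: float) -> float:
    """Relative over-count of `measured` versus `baseline` (0.35 means +35%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (measured - baseline) / baseline


def reduction(measured: float, baseline: float) -> float:
    """The same gap expressed as a share of the larger, measured number."""
    if measured <= 0:
        raise ValueError("measured must be positive")
    return (measured - baseline) / measured


# Hypothetical counts: 100 users in the baseline profile, 135 in the other.
print(round(inflation(135, 100), 3))
print(round(reduction(135, 100), 3))
```

The direction of the comparison matters: with these hypothetical counts the larger number is 35% above the baseline, while the baseline is only about 26% below the larger number.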

Conclusions

This is a pretty limited experiment, and most businesses aren’t small personal blogs so they have good reasons to track users. However, I’d say in this case at least, the answer is that Server-Side Tracking isn’t a silver bullet! It doesn’t seem to perform any better than the standard client-side implementation of Google Analytics in recovering lost pageview data (i.e. people blocking tracking altogether), though it does seem better at stitching together unique Users, deflating unique users by 35%, which is huge! That has important implications for advertising attribution, for example tracking people who clicked on an ad and then came back later to purchase – better stitching together of Users means better attribution to the right channels.

The data might have been different if I ignored user privacy and didn’t anonymize IPs, so I’d love to see data there if anyone is willing to test it. There’s definitely something to be said for tracking log level anonymous events with an image pixel, or some better method, but for an 8% boost in Pageviews it’s probably not worth it. If anyone else has run a similar experiment and has different findings, I’d love to hear about it!

November 8, 2022
