# Homegrown Analytics

I must start with a quick disclaimer. Replacing the near-universal web analytics with your own is not the right path for *most* people. What I’ve made is nowhere near equivalent, it involves writing my own code, any analysis I want to do is all on me, etc. If you are here because you want to remove GA from your site, you could try one of the more popular alternatives like [Plausible](https://plausible.io) or [Piwik](https://piwik.pro/web-analytics/). Personally, I had looked at Plausible, but $9/mo (which is a completely reasonable price) seemed like a lot for the tiny amount of traffic I get.

Ok, so that’s out of the way.

## The motivation

I was running performance reports on my site, using Google’s Lighthouse, and while *most of the time* it was scoring high, occasionally it would flag JS issues that were thread blocking, or in this most recent run it had some issues with cookies. In both cases it was Google Analytics (GA) that was called out and that annoyed me. I work hard to make sure my site runs without any script, but GA is my main exception and I have little control over it.

> I’m sure I could fix the cookie issues, I’m sure that I’m not doing something right with my implementation of their script, but I’ve always wanted to get rid of this 3rd party dependency *anyway*.

I started to think about how little information I really need in terms of analytics, way less than GA provides. Given that, building a replacement shouldn’t be too difficult.

> I’m reminded of [the humorous quote](https://x.com/pinboard/status/761656824202276864) “we do these things not because they are easy, but because we thought they were going to be easy”.

## My requirements

As always, I wrote up my basic requirements before doing anything else. The replacement should:

- Work without JavaScript (progressive enhancement is fine, so it can do *more* with JS enabled),
- Use only first-party cookies,
- Provide traffic data daily, and
- Track all the data I care about.

For that last point, I produced this list of desired reports/views, all viewable by day/week/month:

- Overall page views
- Views by page
- Views by referral source (requires JavaScript)
- Views by new visitors vs. returning (implies a cookie or similar)
- Views by browser / device
- Views by country

I have some more random thoughts, like “views by dark vs. light mode” that I don’t currently get in GA (although I could add it with some client-side script), and I’m sure I’ll add more tracking over time (web vitals?), but this is the core set. I’ve put them in priority order as well, and I’ll implement solutions starting with a basic set of tracking.

## The good old 1px image solution

A tracking pixel, just a very tiny image requested with an `<img>` tag on your page, is the classic way to add data gathering to a web site without using JavaScript. The image source is an endpoint on your server, and the request is formatted to pass along data via query string parameters as well as everything that comes on the standard HTTP Request (user agent, IP address, accept-language, cookies). I’m going to start with this, because I figure that will be my *base* implementation. I just need something to point it to.

## Handling the image request

I could spin up a web server, but to ensure it responds quickly I’d need it to run 24/7, and that’s inefficient for my site’s level of traffic. Instead, I will use another Azure Function (like I did in [the order fulfillment article](/blog/order-fulfillment/)) that responds to an HTTP GET request and puts a message into a Queue. I’ll pull some information from the query string and some from the request. As a start, I’ll be trying to fill up this data structure:

- Page Type (homepage, post, content, album, album list, etc.)
- Post Title
- Canonical (/blog/foo)
- Actual URL (https://www.duncanmackenzie.net/blog/foo?bar=buzz)
- User-Agent (mostly to give us OS & Browser info, but it’s a complicated string like `Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36`)
- IP Address (8.8.8.8)
- Accept-Language (en-US,en;q=0.9)
- Referrer

I’ll create the image tag in my Hugo code, adding readily available info (title, type, and the relative canonical URL) as query string params all at build time. IP, Accept-Language, User-Agent, and URL can all come from the request. Referrer is a problem though, as the request for the *page* will have received it, but the referrer for the image request will be the site page itself. We’ll come back to that in a moment, but with all of this figured out, I was ready to start planning the Azure side of the project.

As I mentioned, the image URL will point at an Azure Function, which will extract all the info from above, turn it into a message, push it onto a Queue (for further processing), and return a transparent 1px image. I want this function to return as fast as possible, so I avoid doing any work on the data at this point, just grab it raw and push it to the queue.

The function is triggered by a GET request, creates the request object, pushes it onto a Queue, and returns the 1px image.

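To make the shape of that GET request concrete, here is a minimal sketch of how a pixel URL could be assembled. The parameter names (`page`, `title`, `referrer`, `js_enabled`) match what the function reads from the query string; the endpoint URL and the `buildPixelUrl` helper are hypothetical and exist only for illustration.

```javascript
// Hypothetical endpoint: the real function URL is not given in this post.
const ENDPOINT = "https://example.com/api/event";

// Build the tracking pixel URL.
// `page` and `title` are known at build time, so they can be baked into
// a plain <img> tag. `referrer` is only known in the browser, so when it
// is supplied we also flag the request with js_enabled.
function buildPixelUrl({ page, title, referrer }) {
  const params = new URLSearchParams();
  params.set("page", page);
  params.set("title", title);
  if (referrer !== undefined) {
    params.set("referrer", referrer);
    params.set("js_enabled", "true");
  }
  return `${ENDPOINT}?${params.toString()}`;
}

// No-script shape: only the build-time values are present.
console.log(buildPixelUrl({ page: "/blog/foo", title: "Foo" }));
```

Without script, only `page` and `title` would be present, which is all a build-time image tag can carry. Everything else in the record has to come from the HTTP request itself.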
The request object (how it looked to start)

```csharp
public class RequestRecord
{
    public string id { get; set; }
    public string requestTime { get; set; }

    //simplified version of the day portion (20240503)
    public string day { get; set; }

    //the permalink/canonical relative form
    //of my URLs (like /about or /blog)
    public string page { get; set; }

    //the full URL that was requested,
    //with whatever query strings might be on it
    public string url { get; set; }

    public string ip_address { get; set; }
    public string title { get; set; }
    public string referrer { get; set; }
    public string accept_lang { get; set; }
    public string user_agent { get; set; }
    public string country { get; set; }
    public string countryName { get; set; }
    public bool js_enabled { get; set; }
}
```

And the Azure Function

```csharp
//analyticsEvent (the queue name) and TrackingGif (the image bytes)
//are defined elsewhere in the class and are not shown here
[FunctionName("event")]
public static async Task<IActionResult> trackAnalyticsEvent(
    [HttpTrigger(AuthorizationLevel.Function,
        "get", Route = null)] HttpRequest req,
    ILogger log)
{
    try
    {
        string connectionString = Environment.GetEnvironmentVariable(
            "AZURE_STORAGE_CONNECTION_STRING");

        QueueClientOptions queueClientOptions = new QueueClientOptions()
        {
            MessageEncoding = QueueMessageEncoding.Base64
        };

        QueueClient queueClient = new QueueClient(connectionString,
            analyticsEvent, queueClientOptions);

        RequestRecord request = CreateRequestRecordFromRequest(req, log);

        //skip crawlers and local requests, but still return the image
        if (!FunctionsHelpers.RequestIsCrawler(request)
            && !FunctionsHelpers.RequestIsLocal(request))
        {
            string message = JsonSerializer.Serialize(request);
            await queueClient.SendMessageAsync(message);
        }

        req.HttpContext.Response.Headers
            .Add("Cache-Control", "private, max-age=300");

        var pxImg = new FileContentResult(
            TrackingGif, "image/gif");
        return pxImg;
    }
    catch (Exception e)
    {
        log.LogError(e, e.Message);
        throw;
    }
}

private static RequestRecord CreateRequestRecordFromRequest(
    HttpRequest req, ILogger log)
{
    RequestRecord request = new RequestRecord();

    request.ip_address = GetIpFromRequestHeaders(req);
    request.accept_lang = req.Headers["Accept-Language"];
    //the Referer of the image request is the page being viewed
    request.url = req.Headers["Referer"];
    request.user_agent = req.Headers["User-Agent"];

    var query = req.Query;

    if (query.ContainsKey("page"))
    {
        request.page = query["page"];
    }

    if (query.ContainsKey("title"))
    {
        request.title = query["title"];
    }

    if (query.ContainsKey("referrer"))
    {
        request.referrer = query["referrer"];
    }

    //only the presence of the param matters, not its value
    if (query.ContainsKey("js_enabled"))
    {
        request.js_enabled = true;
    }

    DateTimeOffset requestTime = DateTimeOffset.UtcNow;
    Guid id = Guid.NewGuid();

    request.id = $"{requestTime:HH':'mm':'ss'.'fffffff}::{id}";
    request.requestTime = requestTime.ToString("o");
    request.day = requestTime.ToString("yyyyMMdd");

    return request;
}
```

In my Hugo site, I add this image to the bottom of the page with a partial, creating one version for the no-JavaScript situation and another for when I can execute code.

```go-html-template
{{- $rootPath := .Site.Params.homegrownAnalytics }}
```

As you can see in the code, the script version lets me add the referrer, and I also pass along the js_enabled param so I can track what % of visits are hitting my `