First Things First: This Is Absurd

This is my personal website. I 99.9% don’t care about your personal information. That last 0.1% is that I recently started a seminar series, and I need to advertise it, so I care about being able to re-advertise to you if you responded to one of my previous ads. That’s about it. Here’s what this privacy policy should say:

I don’t care about your personal information. I’m only storing it if you give it to me, and I’m not selling it to anyone. Presumably you’re OK with that, or you wouldn’t have given it to me. Enjoy the site!

Actually, it probably shouldn’t even be that. I probably shouldn’t even need a privacy policy. Unfortunately, I probably do, and so here you go:

If you use this website (i.e., if you’re reading this page) your information has probably already been sent to at least the US and EU, and maybe elsewhere. Also, at some point, if you haven’t already, you’ll probably get some cookies. If you send an email to privacy@chrisrichardson.info, I’ll use reasonable efforts to remove your data from databases I directly control.

That’s it. That’s the privacy policy.

Let me explain. You see, the Internet isn’t that simple: all kinds of software is used, a lot of it is “in the cloud”, much of it is run by third parties, and the world is filled with unscrupulous people and companies. What’s the result? Governments are trying to protect you by writing regulations. In our case, because my servers and I are based in the EU, the relevant regulation is the GDPR. That piece of regulation is 99 articles long! It’s completely absurd that in order to have a personal website I’m meant to have read such a thing, let alone comply with it. Yet, here we are. We’ll come back to the GDPR in a bit, but first things first. Let’s talk about

How The Internet Works

The reality is, if you use this site (or any but the absolute simplest), some of your “personal information” is going to get shared around the Internet, and often stored on other companies’ servers, often in different countries than either you, I, or this server are located. If you’re not OK with that, you should go away now (in fact, you should probably stop using the Internet).

Let me give you a couple of examples.

MailChimp: If you’ve been on my site, you’ve probably noticed that I write a newsletter. If you subscribe to that newsletter, you give me your name and email address so I can send it to you. You’re probably OK with that. But I don’t send it to you by writing you a personal email in my email program. I use a service called MailChimp. When you sign up for my newsletter, your information is stored in my account on their servers. Presumably they don’t do anything with your information, and just let me use it to send you the newsletters, but I have no way of knowing if that’s really true or not. And I definitely don’t know in which countries the servers that store that data are located.

But it’s not just that. If I log into my MailChimp account, I see lots of cool reports. For example, they tell me that for my last newsletter, my top locations for readers were USA (73.3%), Australia (24.4%), and Czech Republic (1%). How do they know that? I don’t know. My best guess is, they looked up the IP address of the device from which you read the message. Did they store the IP address? I have no idea. Will they track whether that same IP address is used to read an email from someone else’s mailing list? I definitely have no idea.

And it gets scarier than that. They also show me that my audience is “a combination of male, female, and another identity from 25 and up”. I could guess that they got your location from your IP address, but I can’t even guess how they know your age and gender identity.

Again, if you don’t like this stuff, you should get off the Internet.

Even if you trust me not to misuse your information, that’s not enough, because your information pretty much immediately moves beyond me. And even if I put up a GDPR-compliant “remove all of my data” button, I could try and do that all I wanted, but would have no idea what MailChimp ultimately did. Oh, and by the way, MailChimp isn’t even MailChimp anymore, because they got bought by Intuit. I really can’t explain why one of the world’s largest accounting software companies bought one of the world’s largest email-marketing companies, nor where they’re storing your data, nor what they’re doing with it. 🙁

Here’s another example.

reCAPTCHA: You know how else the Internet works? It works by being filled with spam-bots and hackers. So, I need some anti-spam technology on my site. There are some really light-weight things I could use that don’t slow the site down at all and don’t gather any information about you. For example, I could make you solve a math problem every time you wanted to leave a comment on a blog post. But you know what that would be? That would be a terrible user experience (which neither you nor I want). So, I need to use some other sort of “CAPTCHA”. As of this writing, I’m using “invisible reCAPTCHA v2”. This works great at stopping spam bots, and doesn’t slow the website down except on the specific pages where I use it (e.g., on my “Contact” page), but has the downside that it’s run by Google. And so, some stuff probably gets sent to Google. To be honest, I’m pretty technical, but I have no idea what (if anything) is getting sent to them. Probably your IP address. I don’t care. Do you care? Probably not.

Those are just two examples. The reality is, I don’t know how many other examples there are. Here’s why.

How To Build A Website

In the old days (or today, if you’re a student), you might create a website by actually writing some HTML “code”. Say I want a coming-soon page. It might look like this. Even though that’s only one line of text, it took me 28 lines of code to get it to look like that:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <link rel="preconnect" href="https://fonts.googleapis.com" />
    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
    <link
      href="https://fonts.googleapis.com/css2?family=Oleo+Script+Swash+Caps&display=swap"
      rel="stylesheet"
    />
    <title>Static Template</title>
  </head>
  <body style="background-color: #000;">
    <div
      style="
        display: flex;
        justify-content: center;
        align-items: center;
        height: 100vh;
      "
    >
      <h1 style="font-family: 'Oleo Script Swash Caps', cursive; color: #fff;">
        My site is coming soon!
      </h1>
    </div>
  </body>
</html>

No one writes web sites like that today, but even if they did, there’d be a problem. Those three link lines at the beginning are selecting the “fonts”, the actual typeface used for the text. That little, one-page, one-line website doesn’t collect any information, let alone your personal information. However, by virtue of setting the typeface, when you view that page your IP address (and probably a bunch of other stuff, like your browser type, browser version, operating system, device, and preferred language) gets sent to Google. German courts recently ruled that this is a violation of GDPR. Until I read that article, I had no idea that using a certain typeface was going to be a GDPR issue. ¯\_(ツ)_/¯
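To make that concrete, here is a minimal sketch that lists the outside hosts a page asks your browser to contact. It uses only Python’s standard library. The class and function names are made up for illustration, and it only looks at href and src attributes, so treat it as a rough check rather than a real audit.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalHostFinder(HTMLParser):
    """Collects the hostnames found in href/src attributes (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Attributes without a value (e.g. crossorigin) arrive as None.
            if name in ("href", "src") and value:
                host = urlparse(value).hostname
                if host:
                    self.hosts.add(host)


def external_hosts(html):
    """Return the sorted hostnames the page refers to by absolute URL."""
    finder = ExternalHostFinder()
    finder.feed(html)
    return sorted(finder.hosts)


# The three link lines from the coming-soon page, plus its one line of text.
PAGE = """
<link rel="preconnect" href="https://fonts.googleapis.com" />
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
<link href="https://fonts.googleapis.com/css2?family=Oleo+Script+Swash+Caps&display=swap" rel="stylesheet" />
<h1>My site is coming soon!</h1>
"""

print(external_hosts(PAGE))
```

Every host in that list is a server that learns at least your IP address when the page loads.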

Of course, this site isn’t built like that; it’s built with a content-management system called WordPress. Nearly half the websites in the world (43.2%) are built on WordPress as of 2022. Since WordPress is so popular, they’ve done something nice, and included a template for a privacy policy. That’s great, but the reason WordPress is so popular is that it’s so powerful and easy to add features. There are nearly 60,000 WordPress “plugins” which with a click of a button can add functionality to a website. That’s amazing! It’s great for website developers, and it’s great for website viewers, because you get a better experience. However, each of those plugins is written by someone, and almost 100% of the time that someone is neither me nor WordPress. And since they’re added with the click of a button, no one really knows what they do. Of course, they provide the functionality that I wanted to add, or I wouldn’t have added them. But they might do other things. They might send your data places. They might give you cookies. There’s no good way for me to know, and so no good way for me to let you know.

Let me give you an example.

You might have noticed that there are some forms on this website. WordPress doesn’t have built-in forms. If you want to add a form, you add a form plugin. Until recently, I was using a plugin called Contact Form 7. Then I decided I was going to start doing live seminars, and I needed some way to sell tickets. The easiest way to do that was to use Stripe. In turn, the easiest way to use Stripe was to switch my form plugin to WP Forms. When I switched to WP Forms, there were some configuration options to stop spam-bots on the forms. As discussed above, I ended up using invisible reCAPTCHA v2. Why? Because all the other options either didn’t work (i.e., I kept getting spam form submissions), or they slowed down the site enough that it would have annoyed you (e.g., Google’s new reCAPTCHA v3 slowed down every page on the site by 700ms. 700ms doesn’t sound like a lot, but any page that takes more than 2 seconds to load is likely to lead to you going away, and 700ms is a huge percentage of that 2 second threshold).
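The arithmetic behind that threshold argument, as a quick sketch. The 700ms and 2 second figures are the ones quoted above; the variable names are just for illustration.

```python
# Figures quoted in the text above; the names are illustrative only.
RECAPTCHA_V3_DELAY_MS = 700
PATIENCE_BUDGET_MS = 2000  # the "2 seconds" threshold

share_of_budget = RECAPTCHA_V3_DELAY_MS / PATIENCE_BUDGET_MS
remaining_ms = PATIENCE_BUDGET_MS - RECAPTCHA_V3_DELAY_MS

print(f"{share_of_budget:.0%} of the budget is gone, {remaining_ms}ms left for everything else")
```

So a single anti-spam feature would eat roughly a third of the time a page has before a visitor is likely to give up.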

Now where are we? I have no idea what information invisible reCAPTCHA v2 uses (and I really don’t know if there’s a difference between that and reCAPTCHA v3), I have no idea what the WP Forms plugin code does to enable that feature, oh, and by the way, to take payments I had to turn on the Stripe “sub plugin” (a plugin within the plugin), and I have no idea what it does in terms of sending your information and setting cookies.

Basically, every time I touch the website, I end up making decisions like that. Those decisions impact what happens with your data, but they’re not made because I’m trying to do something with your data β€” they’re made to deliver some feature or result (e.g., the speed improvement choice).

Speaking of speed, let’s come back to

How The Internet Works

We often talk about a “web server”, as if it’s a single, physical computer sitting somewhere. But that’s not often the case. The reality is, the time it takes a page to load really is important to user experience. There are three major components to page-load time: how much of the “server” resources are used to generate the page, how much bandwidth is available between you and the server, and the speed of light (i.e., how far away from the server you are).

Web sites solve these problems in two ways. First, they have more than one server, and your requests get distributed between those servers through something called a load-balancer. In my case, both my servers and load-balancer are located in Germany, because that was the best place for me to put them in terms of privacy laws while still meeting the technical requirements I had. But, for sufficiently large/busy websites those servers might exist in different countries, and your traffic (i.e., information) might be sent to any of those countries. But it doesn’t matter, because even though my servers are all in Germany, the second way in which they solve these issues is through content-delivery networks (CDNs).
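As a rough illustration of what “distributed between those servers” can mean, here is a minimal round-robin sketch. Round-robin is only one common strategy and is an assumption here, not a description of how this site’s load-balancer actually works; the server names are hypothetical.

```python
from itertools import cycle

# Hypothetical server names; the real setup is not described in this much detail.
SERVERS = ["web-1.example", "web-2.example", "web-3.example"]


def make_round_robin(servers):
    """Return a function that hands each incoming request to the next server in turn."""
    rotation = cycle(servers)

    def pick(_request_path):
        # The path is ignored: plain round-robin does not look at the request.
        return next(rotation)

    return pick


pick_server = make_round_robin(SERVERS)
assignments = [pick_server(path) for path in ["/", "/about", "/contact", "/"]]
print(assignments)
```

The point for privacy is that you, the visitor, never choose which machine (or which country) handles your request.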

Most of the content on most websites is “static”, meaning it doesn’t change. The server doesn’t actually need to do anything computationally, it just needs to deliver a piece of content to your device. Since we don’t actually need any computational power, there’s no reason not to just put that content as physically close to you as possible. Since it’s close to you, the speed of light issues are reduced (if my server is halfway around the world from you, the content doesn’t have to travel that far, it just needs to travel from your closest copy on the CDN), the bandwidth issues are reduced (the content only has to traverse the “pipe” between you and it, not all the way across the very busy backbone of the Internet), and even server resources are reduced (the request never actually even gets to the “server”, it stops at the edge of the network where the content is sitting).
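A back-of-the-envelope sketch of the speed-of-light point. The numbers are assumptions, not figures from this site: light in optical fibre travels at roughly 200,000 km/s, “halfway around the world” is taken as about 20,000 km, and the nearby CDN edge is a hypothetical 100 km away. Real connections are slower because of routing and processing.

```python
# Assumption: signal speed in optical fibre, roughly two thirds of the speed of light.
FIBRE_SPEED_KM_PER_S = 200_000


def round_trip_ms(distance_km):
    """Best-case round-trip time in milliseconds, ignoring routing and processing."""
    return 2 * distance_km / FIBRE_SPEED_KM_PER_S * 1000


far_origin = round_trip_ms(20_000)  # server halfway around the world (assumed distance)
nearby_edge = round_trip_ms(100)    # hypothetical CDN edge 100 km away

print(f"origin: {far_origin:.0f} ms per round trip, edge: {nearby_edge:.0f} ms per round trip")
```

A page load involves many round trips, so that per-trip difference adds up quickly.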

This is where that German judge’s ruling becomes an issue. You see, his argument was, you (the website owner) don’t have to use Google’s servers, you can host the fonts on your server. That’s true; however, that “server” actually probably includes a CDN, and lots of content is probably actually served from the edge of the network, anywhere in the world. And that’s not just Google fonts, it’s any static piece of content, including all the text and images on the site. When your device got a copy of this text that you’re reading, it got it from some CDN edge location, and your device sent to that edge location the exact same information the judge was complaining about having been sent to Google.

In my case, the CDN I’m using is Amazon’s (but it could be Cloudflare’s, or Akamai’s, or that of any number of other companies that provide CDN services). If you’re in Europe, probably you got this content from Amazon’s edge location in Germany. But maybe not. Maybe the Internet decided going to the UK was a better choice. In fact, maybe the Internet made some crazy choice, and sent it to you from some random Amazon CDN edge location, say, Australia or Singapore! Not only do I not control that as the website owner, I have no way of knowing. Moreover, not even Amazon does. The Internet made that decision. So, as I wrote way back up at the top in the actual privacy policy portion of this page, your information has already gone somewhere. I don’t know where, and I don’t know what Amazon did with it. ¯\_(ツ)_/¯ That’s just how the Internet works.

OK. That’s already a lot. But there’s one more thing we need to cover. Let’s talk about

Cookies

Cookies (and several other similar technologies, for example local storage, where tokens such as JWTs often end up) are ways that a website stores information on your device. Browsers only send a cookie back to the domain that set it, so some other website can’t simply read the cookies I give you, and that’s not the issue. And the main uses of cookies are largely innocuous (e.g., all the uses described in the WordPress privacy policy linked above). I mean, if you log into my website, you probably don’t care that I store a cookie with your authentication tokens, so you don’t have to re-login the next time you come back. And if I put the cookie there (and encrypt the data), I’m the only one who can read it. So, what’s all the hullabaloo?
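For the login case, this is roughly what “storing a cookie” looks like on the wire: the server sends a Set-Cookie header and the browser keeps it. The sketch below builds such a header with Python’s standard http.cookies module. The cookie name, value, domain, and lifetime are all made up for illustration and are not what this site actually sends.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
# Hypothetical name and value; a real token would be random and server-generated.
cookie["auth_token"] = "opaque-random-value"
cookie["auth_token"]["domain"] = "example.com"  # only sent back to this domain
cookie["auth_token"]["path"] = "/"
cookie["auth_token"]["max-age"] = 3600          # forget it after an hour
cookie["auth_token"]["secure"] = True           # HTTPS only
cookie["auth_token"]["httponly"] = True         # hidden from page scripts

header = cookie.output()
print(header)
```

The Domain attribute is the part that matters for the rest of this section: it decides which servers get the cookie back on later requests.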

Here, the problem is the “secret” Internet giant advertising companies: companies large enough that many sites are running their software. So, for example, let’s say I’m running a Taboola plugin on this site (I’m not, and I don’t know if they use cookies or not. I don’t even know if they make a plugin, but they’re a good example of the kind of company I mean). If that plugin puts a cookie on your device, and you go to one of the millions of other sites undoubtedly running a Taboola plugin, then that plugin can read the data and share that information back with the parent company. Then, they can target ads at you.

There might be some obvious concerns if I could read the data, but as far as I know, none of these systems allow that. In other words, if you came to my site, the putative Taboola plugin almost certainly wouldn’t let me know what (if anything) they stored on your device on other sites. Instead, what they’d likely do is send whatever data they can gather about you back to the mother ship (including the fact that you were on my site).

Let’s give an example. Pretend this site is a travel network, and you come here looking for flights to Bali. Taboola gathers that information in the background. Later, you go to CNN.com, who happens to be running Taboola (again, I don’t know if they do or not, but stick with me). Then, the Taboola plugin on CNN reads your data from the cookie, sends it back to Taboola, Taboola says, “Aha! I know this guy! He was just shopping for trips to Bali,” and promptly displays an ad for Bali hotels to you on the CNN site.
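The Bali scenario can be modelled in a few lines. This is a toy sketch, not how any real ad network is implemented: the tracker domain and site names are hypothetical, and it assumes the cookie holds only a visitor ID while the browsing history lives on the tracker’s side.

```python
from collections import defaultdict


class Tracker:
    """Toy model of a third-party ad network embedded on many sites."""

    def __init__(self):
        self.next_id = 1
        self.history = defaultdict(list)  # visitor id -> list of (site, topic)

    def on_page_view(self, browser_cookies, site, topic):
        # The browser keeps cookies per domain; the tracker only sees its own.
        visitor_id = browser_cookies.get("tracker.example")
        if visitor_id is None:
            visitor_id = self.next_id
            self.next_id += 1
            browser_cookies["tracker.example"] = visitor_id
        self.history[visitor_id].append((site, topic))
        return visitor_id

    def interests(self, visitor_id):
        """Everything the tracker has seen this visitor look at, on any site."""
        return [topic for _site, topic in self.history[visitor_id]]


tracker = Tracker()
my_browser = {}  # cookie jar: domain -> value

tracker.on_page_view(my_browser, "travel-site.example", "flights to Bali")
visitor = tracker.on_page_view(my_browser, "news-site.example", "headlines")
print(tracker.interests(visitor))
```

Neither site ever sees the other’s data; only the tracker, which is present on both, can join the two visits together.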

As with the “personal information” discussed in general above, you probably don’t care about this. In fact, you probably want it. You want the ads to target you. If I’m a young mother of 2, and not a grandma, I’d rather get ads for diapers than for vitamins for the elderly, even if I’m not in the mood to buy diapers right now. Targeting is good. So, what’s the problem?

Well, the problem is those giant 3rd-party advertisers. Of course, they’re not out to do anything malicious. All they care about is serving you exactly the ads you want, because that’s how they make more money. They don’t make money by sending you ads which you don’t like, and they don’t make money by publicly defaming you. But … they’re filled with people. What if you recently browsed to my-slightly-weird-kink.com, and one of those employees found that in their data, and tweeted, “Hey, @Bob has a slightly weird kink”? You wouldn’t like that, and I guess that’s what the regulation is about? But, a) that’s a highly unlikely edge case, and hardly seems to justify all of these cookie popups we now get at every site we visit; and b) guess what? As with the personal information, we as the website owners can’t actually do anything about it with any reasonableness.

You see, most of those “accept my cookies” popups are, yet again, from some sort of plugin. That plugin can only control the cookies it knows about. And even giant companies with a lot of tech talent can’t do anything about this. Here’s a test. Go into your browser’s cookie history, and remove all cookies for any kayak (the travel company) sites. Make sure you don’t have all cookies blocked by your browser. (You’ll have to figure out how to do those last two things; they depend on which browser you’re using.) Now, go to kayak.ie. You’ll get a nice little GDPR popup about your privacy. Click on “No Thanks”. Now, go back and look at what cookies your browser has stored. Yep. You got a cookie from Kayak even though you told them “no”. What’s in that cookie? ¯\_(ツ)_/¯ Does Kayak know they’re doing that? ¯\_(ツ)_/¯ But if a giant Internet travel company can’t avoid giving you a cookie, what’s a small business with 10 or even 100 employees supposed to do? What are they supposed to do if they’re not a tech company, and they hired someone to build their website for them? What’s an individual with a private website like this supposed to do?

Yeah, you get it. I probably already gave you a cookie. So, back to the formal privacy policy:

If you use this website (i.e., if you’re reading this page) your information has probably already been sent to at least the US and EU, and maybe elsewhere. Also, at some point, if you haven’t already, you’ll probably get some cookies. If you send an email to privacy@chrisrichardson.info, I’ll use reasonable efforts to remove your data from databases I directly control.